Module 2: "Parallel Computer Architecture: Today and Tomorrow"
  Lecture 4: "Shared Memory Multiprocessors"
 

On-chip TLP

  • Current trend:
    • Tight integration
    • Minimize communication latency (data communication is the bottleneck)
  • Since we have transistors to spare
    • Put multiple cores on a chip (Chip multiprocessing)
    • They can communicate via either a shared bus or a switch-based on-chip fabric (which can be custom-designed and clocked faster)
    • Or add support for multiple threads without replicating cores (Simultaneous multi-threading)
    • Both choices provide a good cost/performance trade-off

Economics

  • Ultimately who controls what gets built?
  • It is a cost vs. performance trade-off
  • Given a time-to-market budget and a revenue projection, how much performance can be afforded?
  • Normal trend is to use commodity microprocessors as building blocks unless there is a very good reason
    • Reuse existing technology as much as possible
  • Large-scale scientific computing mostly exploits message-passing machines (easy to build, less costly); even Google uses the same kind of architecture [built from commodity parts]
  • Small to medium-scale shared memory multiprocessors are needed in the commercial market (databases)
  • Although SGI builds large-scale DSMs (256 or 512 nodes), demand for them is smaller

Summary

  • Parallel architectures will be ubiquitous soon
    • Even on the desktop (we already have SMT/HT and multi-core chips)
    • Economically attractive: can build with COTS (commodity-off-the-shelf) parts
    • Enormous application demand (scientific as well as commercial)
    • More attractive today with positive technology and architectural trends
    • Wide range of parallel architectures: SMP servers, DSMs, large clusters, CMP, SMT, CMT, …
    • Today’s microprocessors are, in fact, complex parallel machines trying to extract ILP as well as TLP