Module 6: "Fundamentals of Parallel Computers"
  Lecture 10: "Communication Architecture"
 

Shared address

  • Communication takes place through a logically shared portion of memory
    • User interface is normal load/store instructions
    • Load/store instructions generate virtual addresses
    • The VAs are translated to PAs by the TLB or, on a TLB miss, through the page table
    • The memory controller then decides where to find this PA
    • Actual communication is hidden from the programmer
  • The general communication hardware consists of multiple processors connected over some medium so that they can talk to memory banks and I/O devices
    • The architecture of the interconnect may vary depending on projected cost and target performance
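
The load/store path described above can be sketched as a toy model. This is only an illustration under stated assumptions: the page size, the number of banks, the line-interleaved bank selection, and the names `page_table`, `tlb`, `translate` and `home_bank` are all hypothetical and do not come from the lecture.

```python
PAGE_SIZE = 4096   # assumed page size in bytes
LINE_SIZE = 64     # assumed interleaving granularity in bytes
NUM_BANKS = 4      # assumed number of memory banks

# Hypothetical mapping from virtual page number to physical frame number.
page_table = {0: 7, 1: 3, 2: 12}
# Cache of recently used translations; starts empty.
tlb = {}


def translate(va):
    """Translate virtual address va; return (physical address, tlb_hit)."""
    vpn, offset = divmod(va, PAGE_SIZE)
    hit = vpn in tlb
    if not hit:
        # TLB miss: consult the page table and cache the translation.
        # A missing entry raises KeyError, standing in for a page fault.
        tlb[vpn] = page_table[vpn]
    return tlb[vpn] * PAGE_SIZE + offset, hit


def home_bank(pa):
    """Bank the memory controller selects, assuming line interleaving."""
    return (pa // LINE_SIZE) % NUM_BANKS
```

The program only ever issues the virtual address; which bank (and, in a larger machine, which node) ends up servicing the access is decided below the instruction set, which is why the communication stays hidden from the programmer.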
  • Communication medium
    • Interconnect could be a crossbar switch so that any processor can talk to any memory bank in one “hop” (provides latency and bandwidth advantages)
    • Scaling a crossbar becomes a problem: cost grows as the square of the number of ports
    • Instead, one could use a scalable switch-based network; latency increases and bandwidth decreases because multiple processors now contend for switch ports
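
The cost trade-off above can be made concrete with a small calculation. The lecture does not name a specific switch-based network, so this sketch assumes an omega network built from 2x2 switches as one representative; the function names are hypothetical.

```python
def crossbar_crosspoints(n):
    """Crosspoints in an n-by-n crossbar: one per (processor, bank) pair."""
    return n * n


def omega_network(n):
    """Return (number of 2x2 switches, hops) for an n-port omega network.

    Assumes n is a power of two. Each of the log2(n) stages holds n/2
    switches, and every request crosses one switch per stage.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    stages = n.bit_length() - 1   # log2(n) for a power of two
    return (n // 2) * stages, stages
```

For 64 ports the crossbar needs 4096 crosspoints and reaches any bank in one hop, while the assumed omega network needs 192 switches but takes 6 hops, and requests that share a switch output must wait for each other.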
  • Communication medium
    • From the mid-1980s the shared bus became popular, leading to the design of SMPs (symmetric multiprocessors)
    • Pentium Pro Quad was the first commodity SMP
    • The Sun Enterprise server provided a highly pipelined, wide shared bus for scalability; it also distributed the memory across the processor boards, but the boards had no local bus, so every access still had to use the shared bus and the memory remained “symmetric”
    • NUMA (non-uniform memory access) or DSM (distributed shared memory) architectures provide a better solution to the scalability problem: the symmetric view is replaced by local and remote memory, and each node (containing processor(s) with caches, a memory controller and a router) is connected to the others via a scalable network (mesh, ring etc.). Examples include Cray/SGI T3E, SGI Origin 2000, Alpha GS320, Alpha/HP GS1280 etc.
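
The local versus remote distinction can be illustrated with a toy latency model. Everything here is an assumption made for the sketch: a 4x4 mesh, a physical address space split into equal contiguous slices per node, and round-number latencies that are not measurements of any of the machines listed above.

```python
MESH_WIDTH = 4           # assumed 4x4 mesh, i.e. 16 nodes
MEM_PER_NODE = 1 << 30   # assumed 1 GiB of memory attached to each node
LOCAL_NS = 100           # hypothetical latency of the memory access itself
PER_HOP_NS = 50          # hypothetical latency per network hop


def home_node(pa):
    """Node whose memory controller owns physical address pa."""
    return pa // MEM_PER_NODE


def hops(src, dst):
    """Manhattan distance between two nodes on the mesh."""
    sx, sy = src % MESH_WIDTH, src // MESH_WIDTH
    dx, dy = dst % MESH_WIDTH, dst // MESH_WIDTH
    return abs(sx - dx) + abs(sy - dy)


def access_latency_ns(requester, pa):
    """Modelled latency: memory access plus request and reply traversal."""
    h = hops(requester, home_node(pa))
    return LOCAL_NS + 2 * h * PER_HOP_NS
```

Under these assumptions node 5 reads its own memory in 100 ns, while node 0 reading the same address pays for two hops in each direction and sees 300 ns. The load instruction is identical in both cases; only the latency differs, which is what makes data placement matter on such machines.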