Shared address

Communication takes place through a logically shared portion of memory
- User interface is normal load/store instructions
- Load/store instructions generate virtual addresses
- The virtual addresses (VAs) are translated to physical addresses (PAs) by the TLB or, on a TLB miss, the page table
- The memory controller then decides where to find this PA
- Actual communication is hidden from the programmer
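The path a load or store takes can be sketched as a toy model: look the virtual page up in the TLB, fall back to the page table on a miss, then let the memory controller pick a bank for the resulting PA. This is only an illustration; the 4 KiB page size, the 64-byte line interleaving across 4 banks, and the names `AddressTranslator` and `bank_for` are assumptions, not taken from the slides.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class AddressTranslator:
    """Toy VA -> PA translation: TLB lookup first, page table on a miss."""

    def __init__(self, page_table):
        self.page_table = dict(page_table)  # virtual page number -> physical frame number
        self.tlb = {}                       # cache of translations already used
        self.tlb_hits = 0
        self.tlb_misses = 0

    def translate(self, va):
        vpn, offset = divmod(va, PAGE_SIZE)
        if vpn in self.tlb:
            self.tlb_hits += 1
            pfn = self.tlb[vpn]
        else:
            self.tlb_misses += 1
            pfn = self.page_table[vpn]  # a KeyError here stands in for a page fault
            self.tlb[vpn] = pfn         # fill the TLB for later accesses
        return pfn * PAGE_SIZE + offset


def bank_for(pa, num_banks=4, line_size=64):
    """Assumed policy: consecutive cache lines are interleaved across banks."""
    return (pa // line_size) % num_banks


if __name__ == "__main__":
    translator = AddressTranslator({0: 7, 1: 3})
    pa = translator.translate(0x1040)  # virtual page 1, offset 0x40
    print(hex(pa), bank_for(pa))       # prints: 0x3040 1
```

The program issuing the load sees none of these steps, which is the sense in which the actual communication is hidden from the programmer.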
The general communication hardware consists of multiple processors connected over some medium so that they can talk to memory banks and I/O devices
- The architecture of the interconnect may vary depending on projected cost and target performance
Communication medium
- Interconnect could be a crossbar switch so that any processor can talk to any memory bank in one “hop” (provides latency and bandwidth advantages)
- Scaling a crossbar becomes a problem: its cost grows as the square of the number of ports
- Instead, one could use a scalable switch-based network; latency increases and bandwidth decreases because multiple processors now contend for switch ports
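The cost argument can be made concrete with a small calculation. The slides do not name a particular switch-based network, so this sketch assumes an omega (multistage) network built from 2x2 switches as one representative; the function names are hypothetical.

```python
def crossbar_crosspoints(n):
    """n x n crossbar: one crosspoint per (processor, memory bank) pair."""
    return n * n


def omega_stages(n):
    """Number of switch stages (hops) in an omega network with n ports."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    return n.bit_length() - 1  # log2(n) for a power of two


def omega_switches(n):
    """Number of 2x2 switches: n/2 switches in each of log2(n) stages."""
    return (n // 2) * omega_stages(n)


if __name__ == "__main__":
    print("ports  crossbar_crosspoints  omega_2x2_switches  omega_hops")
    for n in (4, 16, 64, 256):
        print(n, crossbar_crosspoints(n), omega_switches(n), omega_stages(n))
```

Under these assumptions, 64 ports need 4096 crosspoints in a crossbar but only 192 2x2 switches in the multistage network. The price is that every request crosses 6 switch stages instead of one hop, and requests from different processors can collide on a shared switch port.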
Communication medium
- From the mid-1980s the shared bus became popular, leading to the design of symmetric multiprocessors (SMPs)
- Pentium Pro Quad was the first commodity SMP
- The Sun Enterprise server provided a highly pipelined, wide shared bus for scalability reasons; it also distributed the memory across the processor boards, but there was no local bus on the boards, i.e. every access still had to use the shared bus and the memory remained “symmetric”
- NUMA (non-uniform memory access) or DSM (distributed shared memory) architectures provide a better solution to the scalability problem: the symmetric view is replaced by local and remote memory, and each node (containing processor(s) with caches, a memory controller and a router) is connected to the others via a scalable network (mesh, ring, etc.). Examples include the Cray/SGI T3E, SGI Origin 2000, Alpha GS320 and Alpha/HP GS1280
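The local/remote distinction can be illustrated with a minimal sketch. It assumes each node owns one contiguous 1 GiB slice of the physical address space, that nodes sit on a 4 x 4 mesh numbered row by row, and that requests take a shortest path; none of these parameters or names come from the slides or describe a specific machine.

```python
NODE_MEM_BYTES = 1 << 30  # assumed: 1 GiB of memory per node
MESH_WIDTH = 4            # assumed: 4 x 4 mesh, nodes numbered row by row


def home_node(pa):
    """Node whose memory controller owns this physical address."""
    return pa // NODE_MEM_BYTES


def mesh_hops(src, dst, width=MESH_WIDTH):
    """Shortest-path hop count between two nodes of a 2D mesh."""
    sx, sy = src % width, src // width
    dx, dy = dst % width, dst // width
    return abs(sx - dx) + abs(sy - dy)


def access_kind(requester, pa):
    """Classify an access as local or remote and report home node and hops."""
    home = home_node(pa)
    hops = mesh_hops(requester, home)
    return ("local" if hops == 0 else "remote", home, hops)


if __name__ == "__main__":
    print(access_kind(0, 0x100))                    # ('local', 0, 0)
    print(access_kind(0, 5 * NODE_MEM_BYTES + 8))   # ('remote', 5, 2)
```

The same load instruction is used in both cases; only the latency differs, growing with the number of network hops to the home node.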