|
Virtual network: Case studies
- Each virtual network consists of an NI queue in each direction connected to the corresponding queue or group of queues in the router
- SGI Origin 2000
- Two virtual networks; uses back-off intervention and invalidation to avoid cycles in the network dependence graph
- Stanford DASH
- Two virtual networks; if an incoming request needs space in the outgoing request network and the outgoing request queue is full, it waits for a predefined number of cycles and, if the queue is still full, sends a NACK to the requester
- AlphaServer GS320
- Three virtual networks; longest transaction is 3-hop
- Stanford FLASH
- Four virtual networks; longest transaction is 4-hop (special case of reply generating a reply)
- Alpha 21364 router
- 19 virtual channels (essentially queues) in each direction per port: six virtual networks, one per coherence message type, with 3 channels each (the 3 channels within a network are used for adaptive routing), plus one extra channel that forms the seventh virtual network and carries some special coherence control messages (6 x 3 + 1 = 19)
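The wait-then-NACK policy described for Stanford DASH can be sketched as below. This is a minimal illustration only, not DASH's actual implementation: the class `OutgoingRequestQueue`, the function `handle_incoming_request`, and the `drain_schedule` argument (a stand-in for the network accepting messages) are all hypothetical names introduced here.

```python
from collections import deque


class OutgoingRequestQueue:
    """Bounded outgoing request queue (capacity is a hypothetical parameter)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = deque()

    def is_full(self):
        return len(self.entries) >= self.capacity

    def push(self, msg):
        if self.is_full():
            raise RuntimeError("outgoing request queue is full")
        self.entries.append(msg)

    def drain_one(self):
        """Model the network accepting the oldest queued message, if any."""
        return self.entries.popleft() if self.entries else None


def handle_incoming_request(out_queue, forwarded_msg, wait_cycles, drain_schedule):
    """Forward a request if space appears within wait_cycles, else NACK.

    drain_schedule is a set of cycle numbers at which the network removes
    one message from out_queue (an assumption made for this sketch).
    Returns ("FORWARDED", cycle) or ("NACK", cycle).
    """
    for cycle in range(wait_cycles + 1):
        if cycle in drain_schedule:
            out_queue.drain_one()
        if not out_queue.is_full():
            out_queue.push(forwarded_msg)
            return ("FORWARDED", cycle)
    # Queue stayed full for the whole waiting period: give up and NACK,
    # so the incoming request stops holding resources and no cycle forms.
    return ("NACK", wait_cycles)
```

With a full two-entry queue and `wait_cycles=4`, a drain at cycle 2 lets the request go out at cycle 2, while an empty `drain_schedule` ends in a NACK at cycle 4.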
Coherence controller occupancy
- How long does it take to service a message on average?
- If you imagine the coherence controller as a centralized server in a queuing model, occupancy is just the reciprocal of the service rate
- The occupancy of servicing one message imposes a waiting time on subsequent messages (this shows up as a contention component in the total end-to-end latency)
- Queuing analysis and simulation show that contention grows faster than quadratic in occupancy (Chaudhuri et al., 2003); other researchers later confirmed empirically that the growth is likely sub-cubic
- Goal should be to design low-occupancy protocols
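To make the queuing view concrete, here is a minimal sketch that assumes the controller behaves as an M/M/1 queue (Poisson arrivals, exponential service). That assumption and the function name `mean_queue_wait` are introduced here for illustration and are not the model used in the cited analysis. Under it, the mean waiting time is arrival_rate * occupancy^2 / (1 - arrival_rate * occupancy), which already grows faster than quadratically in occupancy at a fixed arrival rate.

```python
def mean_queue_wait(arrival_rate, occupancy):
    """Mean time a message waits before service in an M/M/1 queue.

    arrival_rate: messages per cycle reaching the controller.
    occupancy:    mean cycles to service one message (1 / service rate).
    """
    utilization = arrival_rate * occupancy
    if utilization >= 1.0:
        raise ValueError("controller is saturated (utilization >= 1)")
    return arrival_rate * occupancy ** 2 / (1.0 - utilization)


# Doubling occupancy from 10 to 20 cycles at 0.01 messages per cycle
# raises the waiting time by more than the 4x a pure quadratic would give.
ratio = mean_queue_wait(0.01, 20) / mean_queue_wait(0.01, 10)
```

The extra growth comes from the (1 - utilization) term in the denominator: higher occupancy both lengthens each service and pushes the controller closer to saturation.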
Protocol occupancy
- Goal is to design a low-occupancy protocol
- This does not mean the protocol cannot do smart things
- A high-occupancy protocol can still perform well if it can reduce the message count accordingly
- Latency-tolerating techniques such as prefetching usually put more pressure on the coherence controller (why?)
- Leads to an increased average protocol occupancy
- Some bad protocol decisions
- Invalidation acknowledgments at home
- Replacement hints
- NACKs
- Final design is usually influenced by directory organization and coherence controller microarchitecture
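The trade-off between per-message occupancy and message count can be sketched with a simple product. This is a rough illustration that lumps all controllers together; the function name and the numbers are hypothetical and chosen only to show the effect.

```python
def controller_cycles_per_transaction(message_count, occupancy_per_message):
    """Controller busy cycles one transaction consumes across its messages."""
    return message_count * occupancy_per_message


# Hypothetical numbers, not measurements from any real machine.
simple = controller_cycles_per_transaction(message_count=4, occupancy_per_message=10)
smart = controller_cycles_per_transaction(message_count=2, occupancy_per_message=16)
```

Here the "smart" protocol spends more cycles per message yet keeps the controllers busy for fewer cycles per transaction (32 versus 40), because it halves the number of messages.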
Directory controllers
- Two main designs
- Hardwired finite state machines (fixed protocol)
- Software protocol running on an embedded protocol processor in the memory controller (suited to off-chip memory controllers), on a protocol thread in the main processor (suited to multi-threaded processors), or on a protocol core in the main processor (suited to multi-core processors)
- Hardwired FSM
- Low occupancy (all-hardware)
- Protocol must be simple enough to design and verify in hardware
- Possible to pipeline various stages of protocol processing
- Cannot afford late-binding or flexibility in the choice of protocol
- SGI Origin 2000, MIT Alewife, Stanford DASH
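A fixed-protocol directory controller can be pictured as a lookup table from (directory state, request) to (next state, action). The sketch below is a deliberately simplified three-state directory written for illustration: it has no transient or busy states and no sharer tracking, the state, request, and action names are assumptions made here, and it is not the protocol of any of the machines listed above.

```python
from enum import Enum


class DirState(Enum):
    UNOWNED = "unowned"
    SHARED = "shared"
    EXCLUSIVE = "exclusive"


class Request(Enum):
    READ = "read"
    READ_EXCLUSIVE = "read_exclusive"
    WRITEBACK = "writeback"


# (current state, request) -> (next state, action); fixed at design time,
# which is what makes the protocol "hardwired".
TRANSITIONS = {
    (DirState.UNOWNED, Request.READ): (DirState.SHARED, "reply_data_from_memory"),
    (DirState.UNOWNED, Request.READ_EXCLUSIVE): (DirState.EXCLUSIVE, "reply_data_from_memory"),
    (DirState.SHARED, Request.READ): (DirState.SHARED, "reply_data_from_memory"),
    (DirState.SHARED, Request.READ_EXCLUSIVE): (DirState.EXCLUSIVE, "invalidate_sharers_and_reply"),
    (DirState.EXCLUSIVE, Request.READ): (DirState.SHARED, "forward_intervention_to_owner"),
    (DirState.EXCLUSIVE, Request.READ_EXCLUSIVE): (DirState.EXCLUSIVE, "forward_intervention_to_owner"),
    (DirState.EXCLUSIVE, Request.WRITEBACK): (DirState.UNOWNED, "write_memory"),
}


def step(state, request):
    """Return (next_state, action) for one request, or raise if undefined."""
    try:
        return TRANSITIONS[(state, request)]
    except KeyError:
        raise ValueError(f"no transition for {request.name} in state {state.name}")
```

Because every transition is a single table lookup, each message has a short, predictable occupancy, but changing the protocol means changing the table, which in hardware means a redesign.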
|