Module 17: "Interconnection Networks"
  Lecture 37: "Introduction to Routers"
 

Fundamentals

  • The switches (routers) talk directly to the network interface (NI)
  • The NI output and input queues normally map to the virtual channels of the connecting router
  • Topology
    • The structure of the interconnection network
    • Direct network: each router is attached to a complete node (most popular)
    • Indirect network: nodes are attached to only a few of the routers; the other routers cannot generate packets and can only forward them in the right direction
  • Routing algorithms
    • Deterministic: fixed route between every pair of source and destination
    • Adaptive: different routes may be selected dynamically, based on congestion
  • Switching strategy
    • Circuit switching: the path from source to destination is first established and reserved before the message is transmitted (popular in the telephone world, but not in parallel computer architecture (PCA))
    • Packet switching: a message is divided into several packets and each packet carries routing information in its header; this leads to better utilization of network resources, since only individual packets need to be routed (as opposed to the entire message together)
  • Flow control
    • How to detect and avoid resource (buffer, channel, etc.) collisions?
    • The minimum unit of information that can be transferred over a link at a time is called a flit (flow control unit): it may be as small as a phit (physical unit) or as large as a message
  • Metrics to compare topologies
    • Diameter: the maximum, over all pairs of nodes, of the shortest distance between the pair
    • Average distance: the distance between two arbitrary nodes, averaged over all pairs
    • Bisection bandwidth: the aggregate bandwidth of the minimum set of links which, when removed, leaves the network as two disjoint, roughly equal collections of nodes
  • Packet structure
    • Header: contains routing and control information, e.g., source, destination, size of data payload, message opcode, etc.; an intermediate router only needs to inspect the header to handle a newly arrived packet
    • Address: for CC-NUMA machines, the cache line address
    • Payload: the transmitted data; for CC-NUMA machines this is normally a cache line, or uncached words, or empty
    • Trailer: normally contains an error-checking code
  • Life of a message in CC-NUMA
    • Starts when the coherence protocol engine (residing in the memory controller) of the source node queues the message into one of the NI output queues
    • The NI outbound scheduler picks messages from the head of one of the queues, possibly according to a round-robin scheme
    • The selected message is assembled by the NI outbound hardware and is queued in the outgoing virtual channel of the router port connected to the NI (from the router’s viewpoint this is an input port); any payload is copied into a message buffer of that port, obeying the copy bandwidth
    • The scheduling algorithm of the router tries to match as many input ports to output ports as possible; the time this takes forms the routing delay or hop time
    • The selected packets are pushed into the network obeying the node-to-network bandwidth
  • Latency of a message
    • Overhead in NI (at source and dest.) + hop time + channel occupancy (time to push into network) + contention (queuing delay at various places)
    • Store-and-forward routing: each intermediate switch stores the message completely before forwarding it to the next switch; uncontended latency (ignoring overhead) = h(n/b + d), where h is the number of hops, n is the size of the message, d is the hop time, and b is the node-to-network bandwidth
    • Cut-through routing: as soon as the complete header arrives, the routing decision is made; there is no need to wait for the rest of the message to arrive; uncontended optimistic latency = n/b + hd (much like circuit switching)
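
The topology metrics and the two latency formulas above can be tried out with a short sketch. This is only an illustration under stated assumptions: the k x k 2D mesh, the function names, and the sample values of n, b and d are hypothetical and do not come from the lecture. Hop counts are found by breadth-first search, and the diameter is then used as h in both formulas.

```python
from collections import deque
from itertools import product


def mesh_neighbors(node, k):
    """Yield the neighbors of `node` in a k x k 2D mesh (no wraparound links)."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < k and 0 <= ny < k:
            yield (nx, ny)


def hop_counts(src, k):
    """Breadth-first search: shortest hop count from `src` to every node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        for nxt in mesh_neighbors(cur, k):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return dist


def diameter_and_average_distance(k):
    """Diameter and average distance over all ordered pairs of distinct nodes."""
    nodes = list(product(range(k), repeat=2))
    all_d = [d for s in nodes for t, d in hop_counts(s, k).items() if t != s]
    return max(all_d), sum(all_d) / len(all_d)


def store_and_forward_latency(h, n, b, d):
    """Uncontended latency h(n/b + d): every hop waits for the whole message."""
    return h * (n / b + d)


def cut_through_latency(h, n, b, d):
    """Uncontended optimistic latency n/b + hd: only the header pays per hop."""
    return n / b + h * d


if __name__ == "__main__":
    h, avg = diameter_and_average_distance(4)   # assumed 4 x 4 mesh
    n, b, d = 128, 4, 2                         # assumed: bytes, bytes/cycle, cycles
    print(h, avg)
    print(store_and_forward_latency(h, n, b, d), cut_through_latency(h, n, b, d))
```

For the assumed 4 x 4 mesh the diameter is 6 hops and the average distance is 8/3 hops. With the assumed n = 128 bytes, b = 4 bytes per cycle and d = 2 cycles, store-and-forward over 6 hops costs 204 cycles, while cut-through costs 44 cycles, because the n/b term is paid once instead of once per hop.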

A couple of things to notice:

  • Time of flight (transmission delay through wires) is negligible
  • In the cut-through routing formula, the assumption is that the routing delay is larger than the channel occupancy of a phit
  • Contention control in cut-through routing
    • Virtual cut-through: buffer the incoming packet if the outgoing port is busy (in the worst case it behaves as store-and-forward)
    • Wormhole routing: allows buffering of only a few packets inside the router (the packets of a blocked message stay spread across several routers along the route, like a worm)
  • General contention control
    • What happens to incoming packets if the router buffers are full?
    • The general solution in data communication or in WANs is to drop packets and retry based on a time-out (TCP/IP, ATM, etc.)
    • In parallel computers, packets are normally not dropped; link-level flow control blocks the packets at the previous router’s output port, which may cause tree saturation
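
The blocking behavior described in the last point can be sketched with a toy model. Everything here is an assumption made for illustration: a single chain of routers rather than a real topology, one packet offered per cycle, one hop per cycle, and the hypothetical name simulate_backpressure. The point it shows is that when the sink stops accepting packets, buffers fill from the destination back toward the source and injection is eventually refused, yet no packet is dropped. In a real network the same effect spreads backward along every path feeding the blocked port, which is what gives tree saturation its name.

```python
from collections import deque


def simulate_backpressure(num_routers, buffer_slots, cycles, sink_blocked=True):
    """Simulate a chain of routers with finite buffers and link-level blocking.

    One packet is offered to router 0 every cycle. A packet advances one hop
    per cycle only if the downstream buffer has a free slot; otherwise it
    waits where it is (it is never dropped).
    Returns (occupancy per router, injected, refused, delivered).
    """
    buffers = [deque() for _ in range(num_routers)]
    injected = refused = delivered = 0
    for cycle in range(cycles):
        # The last router hands one packet to the sink unless the sink is blocked.
        if buffers[-1] and not sink_blocked:
            buffers[-1].popleft()
            delivered += 1
        # Visit links from downstream to upstream so a slot freed in this
        # cycle can be reused by the router behind it.
        for i in range(num_routers - 2, -1, -1):
            if buffers[i] and len(buffers[i + 1]) < buffer_slots:
                buffers[i + 1].append(buffers[i].popleft())
        # The source injects only if router 0 has room; otherwise it must wait.
        if len(buffers[0]) < buffer_slots:
            buffers[0].append(cycle)
            injected += 1
        else:
            refused += 1
    return [len(buf) for buf in buffers], injected, refused, delivered


if __name__ == "__main__":
    print(simulate_backpressure(3, 2, 20, sink_blocked=True))
    print(simulate_backpressure(3, 2, 20, sink_blocked=False))
```

With the sink blocked, three routers of two slots each accept exactly six packets and then refuse every further injection attempt; with the sink draining one packet per cycle, no injection is refused.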