Module 14: "Directory-based Cache Coherence"
  Lecture 29: "Basics of Directory"
 

Directory overhead

  • Quadratic in number of processors for bitvector
    • Assume P processors, each with M amount of local memory (i.e. total shared memory size is M*P)
    • Let the coherence granularity (cache block size) be B
    • Number of cache blocks per node = M/B = number of directory entries per node
    • Size of one directory entry = P + O(1)
    • Total size of directory memory across all processors = (M/B)(P+O(1))*P = O(P^2)
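
The arithmetic above can be checked with a short sketch. The function names and the `state_bits` parameter (standing in for the O(1) per-entry state) are illustrative assumptions, not part of the lecture; sizes are taken in bytes.

```python
def directory_bits(P, M, B, state_bits=2):
    """Total bitvector directory storage, in bits, summed over all P nodes.

    P: number of processors/nodes
    M: local memory per node, in bytes
    B: cache block size, in bytes
    state_bits: assumed size of the O(1) per-entry state field
    """
    entries_per_node = M // B          # one directory entry per local block
    entry_bits = P + state_bits        # one presence bit per node plus state
    return entries_per_node * entry_bits * P


def overhead_fraction(P, M, B, state_bits=2):
    """Directory bits divided by total data memory bits (M*P bytes)."""
    return directory_bits(P, M, B, state_bits) / (8 * M * P)


if __name__ == "__main__":
    # Assumed example configuration: 1 GiB per node, 128-byte blocks.
    for P in (16, 64, 256, 1024):
        print(P, overhead_fraction(P, 2**30, 128))
```

The fraction simplifies to (P + state_bits)/(8B), so the overhead relative to data memory grows linearly with P while the total directory size grows quadratically. With 128-byte blocks, the directory becomes roughly as large as the data memory at around 1024 nodes.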

Path of a read miss

  • Assume that the line is not shared by anyone
    • Load issues from the load queue (for data) or the fetcher accesses the icache; the TLB lookup yields the physical address (PA)
    • Misses in L1, L2, L3,… caches
    • Launches address and request type on system bus
    • The request gets queued in memory controller and registered in OTT or TTT (Outstanding Transaction Table or Transactions in Transit Table)
    • Memory controller eventually schedules the request
    • Decodes home node from upper few bits of address
    • Local home: access directory and data memory (how?)
    • Remote home: request gets queued in network interface
  • From NI onward
    • Eventually the request gets forwarded to the router and through the network to the home
    • At the home the request gets queued in NI and waits for being scheduled by the home memory controller
    • After it is scheduled home memory controller looks up directory and data memory
    • Reply returns through the same path
  • Total time (by LogP model and memory latency m)
    • Local home: max(k_h o, m)
    • Remote home: k_r o + g_{h+a} + Nℓ + g_{h+a} + max(k_h o, m) + g_{h+a+d} + Nℓ + g_{h+a+d} + k_r o
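
As a rough illustration of two steps on this path, the sketch below decodes the home node from the upper address bits and evaluates the two latency expressions. The lecture does not define the symbols at this point, so the readings used here are assumptions: o is a per-step controller overhead, k_h and k_r count such steps at the home and at the requester, g with a subscript is the gap for a message carrying header+address (request) or header+address+data (reply), N is the hop count, and ℓ is the per-hop latency. All function and parameter names are hypothetical.

```python
def home_node(paddr, addr_bits, node_bits):
    """Home node id taken from the upper node_bits of a physical address."""
    return paddr >> (addr_bits - node_bits)


def local_read_miss(k_h, o, m):
    """Local home: controller overhead overlaps with the memory access."""
    return max(k_h * o, m)


def remote_read_miss(k_r, k_h, o, g_req, g_reply, N, ell, m):
    """Remote home: request leg + home access + reply leg.

    g_req   ~ g_{h+a}   (header + address)
    g_reply ~ g_{h+a+d} (header + address + data)
    """
    request = k_r * o + g_req + N * ell + g_req
    home = local_read_miss(k_h, o, m)
    reply = g_reply + N * ell + g_reply + k_r * o
    return request + home + reply


if __name__ == "__main__":
    # Made-up numbers (in cycles), chosen only to exercise the formulas.
    print(home_node(0xC0000000, addr_bits=32, node_bits=2))
    print(local_read_miss(k_h=3, o=10, m=100))
    print(remote_read_miss(k_r=2, k_h=3, o=10, g_req=4, g_reply=12,
                           N=5, ell=20, m=100))
```

With these made-up numbers the remote miss costs several times the local one, and most of the difference comes from the two network traversals (the Nℓ terms) rather than from the home memory access itself.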

Correctness issues

  • Serialization to a location
    • Schedule order at home
    • Use NACKs (extra traffic and livelock) or smarter techniques (back-off, NACK-free)
  • Flow control deadlock
    • Avoid buffer dependence cycles
    • Avoid network queue dependence cycles
    • Virtual networks multiplexed on physical networks
    • Coherence protocol dictates the virtual network usage
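
Serialization at the home with NACKs can be sketched as a toy model. This is only an illustration under simplifying assumptions: the class `HomeDirectory`, its methods, and the "ACK"/"NACK" strings are hypothetical, and a real protocol would track busy state in the directory entry itself and handle many more message types.

```python
class HomeDirectory:
    """Toy home node: serializes requests per line and NACKs while busy."""

    def __init__(self):
        self.busy = set()   # lines with a transaction currently in flight
        self.order = []     # serialization order chosen by the home

    def request(self, node, line):
        """Accept the request if the line is idle, otherwise NACK it."""
        if line in self.busy:
            return "NACK"               # requester must retry later
        self.busy.add(line)
        self.order.append((node, line))
        return "ACK"

    def complete(self, line):
        """Called when the in-flight transaction for this line finishes."""
        self.busy.discard(line)


if __name__ == "__main__":
    home = HomeDirectory()
    print(home.request(0, 0x40))   # first requester is accepted
    print(home.request(1, 0x40))   # second is NACKed while the line is busy
    home.complete(0x40)
    print(home.request(1, 0x40))   # the retry succeeds
    print(home.order)
```

The order in which the home accepts requests defines the serialization order for the line. Each NACK costs an extra message and a retry, and a requester that is NACKed repeatedly may never make progress, which is the extra-traffic and livelock concern noted above.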