Module 14: "Directory-based Cache Coherence"
  Lecture 31: "Managing Directory Overhead"
 

Replacement of S blocks

  • Send notification to directory?
    • Can save a future invalidation
    • Does it reduce overall traffic?
  • Origin 2000 does not use replacement hints
    • No notification to directory
    • Why?
  • Replacements of E blocks do send hints, and these hints must also be acknowledged (why?)
  • Summary of transaction types
    • Coherence: 9 request transaction types, 6 invalidation/intervention, 39 reply types
    • Non-coherent (I/O, synch, special): 19 requests, 14 replies
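
One way to frame the question of whether replacement hints for S blocks reduce overall traffic is a rough message count. The sketch below is a toy model, not a description of Origin 2000: it assumes a hint costs one message per replacement, and that a stale bit left in the sharer vector costs one invalidation plus one acknowledgment if the block is written before the evicting node reads it again. The function names and the parameter `p_written_before_reread` are hypothetical.

```python
def messages_per_replacement(p_written_before_reread, use_hints):
    """Expected extra messages caused by one S-block replacement (toy model)."""
    if use_hints:
        # The hint itself; the sharer bit is cleared, so no later invalidation.
        return 1.0
    # No hint: the stale sharer bit costs an invalidation and an ack, but only
    # if a write reaches the home before this node re-reads the block.
    return 2.0 * p_written_before_reread


def hints_reduce_traffic(p_written_before_reread):
    """True if hints send fewer messages than silent replacement in this model."""
    with_hints = messages_per_replacement(p_written_before_reread, True)
    without_hints = messages_per_replacement(p_written_before_reread, False)
    return with_hints < without_hints
```

Under these assumptions the break-even point is a probability of 0.5: hints only pay off when more than half of the replaced S blocks are written before being re-read. The model ignores the latency of the saved invalidation, which sits on the critical path of the write.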

Serialization

  • Home is used to serialize requests
    • The order determined by the home is final
    • No node should violate this order
    • Example: read-invalidate races
      • P0, P1, and P2 are trying to access a cache block
      • P0 and P2 want to read while P1 wants to write
      • The requests from P0 and P2 reach the home first; the home replies to both and marks both in the sharer vector, but the reply to P0 is delayed in the network
      • P1’s write causes the home to send invalidations to P0 and P2; the invalidation for P0 arrives before P0’s read reply
      • P0’s hub sends acknowledgment to P1 and also forwards the invalidation to P0’s processor cache
      • What happens when P0’s reply arrives? Can the data be used?
  • Requester’s viewpoint
    • When a read reply arrives it finds the OTT entry has the “inv” bit set
    • Under what conditions can it happen?
      • One case is the read-invalidate race in the example above
    • Can replacement hints help?
  • What about upgrade-invalidation races?
  • What about readX-invalidation races?
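
The read-invalidate race above can be replayed with a small event-level sketch. It is a minimal model under stated assumptions, not the Origin 2000 implementation: `Home` keeps only a sharer set for one block, `Hub` has a single OTT entry with an "inv" bit, and all class, field, and function names are hypothetical. The sketch does not decide what the requester should do with the raced data; it only shows how the race is detected and that the block is not installed in S.

```python
from dataclasses import dataclass, field


@dataclass
class Home:
    """Toy directory entry for one block; requests apply in arrival order."""
    sharers: set = field(default_factory=set)

    def read(self, node):
        self.sharers.add(node)            # mark requester in the sharer vector
        return ("read_reply", node)

    def write(self, node):
        targets = sorted(self.sharers - {node})
        self.sharers = set()              # the writer becomes the only holder
        return [("inv", t) for t in targets]


@dataclass
class Hub:
    """Toy requester-side hub with a one-entry OTT (hypothetical structure)."""
    pending_read: bool = False
    inv_bit: bool = False
    cached: bool = False
    acks_sent: int = 0

    def issue_read(self):
        self.pending_read, self.inv_bit = True, False

    def on_inv(self):
        self.acks_sent += 1               # acknowledge to the writer right away
        self.cached = False               # forward the invalidation to the cache
        if self.pending_read:
            self.inv_bit = True           # the invalidation overtook the reply

    def on_read_reply(self):
        raced = self.inv_bit
        self.pending_read = False
        if not raced:
            self.cached = True            # safe to install in S
        return "raced" if raced else "installed"


def run_race():
    home, p0, p2 = Home(), Hub(), Hub()
    p0.issue_read()
    p2.issue_read()
    home.read("P0")                       # home orders both reads ...
    home.read("P2")
    invs = home.write("P1")               # ... before P1's write
    p2_result = p2.on_read_reply()        # P2's reply arrives on time
    p0.on_inv()                           # P0's invalidation beats its reply
    p2.on_inv()
    p0_result = p0.on_read_reply()        # the delayed reply finally arrives
    return invs, p0_result, p2_result, p0.cached, p2.cached
```

In this trace the home has already removed P0 from the sharer set by the time the reply arrives, so installing the block in S would leave a copy the directory does not track. Whether the data may still be used to satisfy the pending read is the question the slide leaves open.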

VN deadlock

  • Origin 2000 has only two virtual networks, but has three-hop transactions
    • Resorts to back-off invalidation or intervention replies to fall back to strict request-reply
    • Does it really solve the problem or just move the problem elsewhere?
  • Stanford DASH has the same problem
    • Uses NACKs after a time-out period if the outgoing network doesn’t free up
    • Worse than the Origin approach because NACKs inflate total traffic and may lead to livelock
    • DASH avoids livelocks by sizing the queues according to the machine size (not a scalable solution)
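
The back-off idea in the Origin 2000 bullets can be sketched as a decision made at the home. This is a simplified illustration, assuming a single boolean stands in for "the outgoing request network has space"; the function names, tuple layout, and message strings are hypothetical.

```python
def home_handle_miss(requester, owner, request_vn_has_space):
    """Return the (virtual_network, src, dst, msg) tuples the home emits."""
    if request_vn_has_space:
        # Normal three-hop path: the home forwards an intervention to the
        # owner on the request network.
        return [("request", "home", owner, "intervention")]
    # Back-off: answer the requester on the reply network and name the owner,
    # so the home never blocks waiting for request-network space.
    return [("reply", "home", requester, f"backoff_intervention(owner={owner})")]


def requester_handle_backoff(requester, owner):
    """After a back-off reply, the requester issues the intervention itself."""
    return [("request", requester, owner, "intervention")]
```

With the back-off path every message at the home is a reply to an incoming request, which is the strict request-reply pattern. The requester still needs request-network space to send the intervention, which is the concern behind the question of whether the problem is solved or moved.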