Module 10: "Design of Shared Memory Multiprocessors"
  Lecture 19: "Sequential Consistency and Cache Coherence Protocols"
 

MSI protocol

  • Forms the foundation of invalidation-based writeback protocols
    • Assumes only three supported cache line states: I, S, and M
    • There may be multiple processors caching a line in S state
    • There must be exactly one processor caching a line in M state and it is the owner of the line
    • If none of the caches have the line, memory must have the most up-to-date copy of the line
  • Processor requests to cache: PrRd, PrWr
  • Bus transactions: BusRd, BusRdX, BusUpgr, BusWB

State transition

MSI protocol
  • Few things to note
    • Flush operation essentially launches the line on the bus
    • Processor with the cache line in M state is responsible for flushing the line on bus whenever there is a BusRd or BusRdX transaction generated by some other processor
    • On BusRd the line transitions from M to S, but not M to I. Why? Also at this point both the requester and memory pick up the line from the bus; the requester puts the line in its cache in S state while memory writes the line back. Why does memory need to write back?
    • On BusRdX the line transitions from M to I and this time memory does not need to pick up the line from bus. Only the requester picks up the line and puts it in M state in its cache. Why?
M to S, or M to I?
  • BusRd takes a cache line in M state to S state
    • The assumption here is that the processor will read it soon, so save a cache miss by going to S
    • May not be good if the sharing pattern is migratory: P0 reads and writes cache line A, then P1 reads and writes cache line A, then P2…
    • For migratory patterns it makes sense to go to I state so that a future invalidation is saved
    • But for bus-based SMPs it does not matter much because an upgrade transaction will be launched anyway by the next writer, unless there is special hardware support to avoid that: how?
    • The big problem is that the sharing pattern for a cache line may change dynamically: adaptive protocols are good and are supported by Sequent Symmetry and MIT Alewife

MSI example

  • Take the following example
    • P0 reads x, P1 reads x, P1 writes x, P0 reads x, P2 reads x, P3 writes x
    • Assume the state of the cache line containing the address of x is I in all processors

      P0 generates BusRd, memory provides line, P0 puts line in S state

      P1 generates BusRd, memory provides line, P1 puts line in S state

      P1 generates BusUpgr, P0 snoops and invalidates line, memory does not respond, P1 sets state of line to M

      P0 generates BusRd, P1 flushes line and goes to S state, P0 puts line in S state, memory writes back

      P2 generates BusRd, memory provides line, P2 puts line in S state

      P3 generates BusRdX, P0, P1, P2 snoop and invalidate, memory provides line, P3 puts line in cache in M state