Module 10: "Design of Shared Memory Multiprocessors"
  Lecture 18: "Introduction to Cache Coherence"
 

Definitions

  • Memory operation: a read (load), a write (store), or a read-modify-write
    • Assumed to take place atomically
  • A memory operation is said to issue when it leaves the issue queue and looks up the cache
  • A memory operation is said to perform with respect to a processor when, as far as that processor can tell from the other memory operations it issues, the operation has taken place
    • A read is said to perform with respect to a processor when subsequent writes issued by that processor cannot affect the returned read value
    • A write is said to perform with respect to a processor when a subsequent read from that processor to the same address returns the new value

Ordering memory operations

  • A memory operation is said to complete when it has performed with respect to all processors in the system
  • Assume that there is a single shared memory and no caches
    • Memory operations complete in shared memory when they access the corresponding memory locations
    • Operations from the same processor complete in program order: this imposes a partial order among the memory operations
    • Operations from different processors are interleaved in such a way that the program order is maintained for each processor: memory imposes some total order (many are possible)

Example

P0: x = 8; u = y; v = 9;

P1: r = 5; y = 4; t = v;

Legal total order:

x = 8; u = y; r = 5; y = 4; t = v; v = 9;

Another legal total order:

x = 8; r = 5; y = 4; u = y; v = 9; t = v;
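
The two orders above are only two of the legal interleavings. The sketch below enumerates all of them for this example and runs each one against a single shared memory without caches. It assumes that every location starts at 0 (the notes do not state initial values) and treats u, t and r as ordinary locations; the helper names `legal_total_orders` and `run` are made up for illustration.

```python
from itertools import combinations

# Program order of each processor, copied from the example above.
P0 = ["x = 8", "u = y", "v = 9"]
P1 = ["r = 5", "y = 4", "t = v"]

def legal_total_orders(a, b):
    """Yield every interleaving of a and b that preserves both program orders."""
    n = len(a) + len(b)
    # Choose which positions of the total order are taken by operations of a.
    for slots in combinations(range(n), len(a)):
        chosen = set(slots)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in chosen else next(ib) for i in range(n)]

def run(order, initial=0):
    """Execute one total order on a single shared memory (assumed initial value 0)."""
    mem = {}
    for op in order:
        dst, src = [s.strip() for s in op.split("=")]
        # A numeric right-hand side is a store of a constant; otherwise it is a load.
        mem[dst] = int(src) if src.isdigit() else mem.get(src, initial)
    return mem

orders = list(legal_total_orders(P0, P1))
results = [run(o) for o in orders]
outcomes = {(m["u"], m["t"]) for m in results}

print(len(orders), "legal total orders")
print("possible (u, t) values:", sorted(outcomes))
```

Under these assumptions every combination of old and new values for u and t is reachable by some legal total order, which is why "last" has to be defined relative to a particular total order.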

  • “Last” means the most recent in some legal total order
  • A system is coherent if
    • Reads get the last written value in the total order
    • All processors see writes to a location in the same order

Cache coherence

  • Formal definition
    • A memory system is coherent if the values returned by reads to a memory location during an execution of a program are such that all operations to that location can be arranged in a hypothetical total (serial) order that is consistent with the results of the execution and has the following two properties:
    1. Operations issued by any particular processor perform according to the issue order
    2. The value returned by a read is the value written to that location by the last write in the total order
    • Two necessary features that follow from above:
    1. Write propagation: writes must eventually become visible to all processors
    2. Write serialization: every processor should see the writes to a location in the same order (if I see w1 before w2, you should not see w2 before w1)
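
The formal definition can be read as a search problem: given what each processor did to one location, does some total order exist that satisfies both properties? The brute-force sketch below tries every permutation. It is meant only for tiny hand-written histories; the function name `is_coherent`, the history format and the initial value 0 are assumptions made for this illustration.

```python
from itertools import permutations

def satisfies(order, history, initial):
    """Check one candidate total order against properties 1 and 2."""
    next_index = {p: 0 for p in history}
    current = initial  # value left by the last write so far in this order
    for proc, i in order:
        # Property 1: each processor's operations appear in issue order.
        if i != next_index[proc]:
            return False
        next_index[proc] += 1
        kind, value = history[proc][i]
        if kind == "W":
            current = value
        elif value != current:
            # Property 2: a read must return the value of the last write.
            return False
    return True

def is_coherent(history, initial=0):
    """history maps a processor name to its list of ("R" | "W", value)
    operations on ONE memory location, in issue order."""
    ops = [(p, i) for p, seq in history.items() for i in range(len(seq))]
    return any(satisfies(order, history, initial) for order in permutations(ops))

# P2 and P3 disagree on the order of the two writes: write serialization fails.
bad = {"P0": [("W", 1)], "P1": [("W", 2)],
       "P2": [("R", 1), ("R", 2)], "P3": [("R", 2), ("R", 1)]}
# Both readers see w1 before w2, so a single total order exists.
good = {"P0": [("W", 1)], "P1": [("W", 2)],
        "P2": [("R", 1), ("R", 2)], "P3": [("R", 1), ("R", 2)]}

print(is_coherent(bad), is_coherent(good))
```

A finite history like this can expose a write serialization violation, but it cannot demonstrate write propagation, since "eventually" is a statement about all future reads.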

Bus-based SMP

  • Extend the philosophy of uniprocessor bus transactions
    • Three phases: arbitrate for bus, launch command (often called request) and address, transfer data
    • Every device connected to the bus can observe the transaction
    • Appropriate device responds to the request
    • In SMP, processors also observe the transactions and may take appropriate actions to guarantee coherence
    • The other device on the bus that will be of interest to us is the memory controller (north bridge in standard motherboards)
    • Depending on the bus transaction, each cache block steps through a finite state machine that implements the coherence protocol
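
To make the per-block state machine concrete, here is a minimal sketch of one possible protocol: a two-state (Valid/Invalid) invalidation protocol for write-through, write-no-allocate caches. This lecture does not name a specific protocol, so the choice of protocol, the event names (PrRd, PrWr, BusRd, BusWr) and the helpers `step` and `access` are all assumptions made for illustration.

```python
from enum import Enum

class State(Enum):
    INVALID = "I"
    VALID = "V"

# (current state, observed event) -> (next state, bus transaction to launch or None)
# PrRd / PrWr come from the local processor; BusRd / BusWr are snooped from the bus.
TRANSITIONS = {
    (State.INVALID, "PrRd"):  (State.VALID,   "BusRd"),  # read miss: fetch the block
    (State.INVALID, "PrWr"):  (State.INVALID, "BusWr"),  # write-no-allocate
    (State.INVALID, "BusRd"): (State.INVALID, None),
    (State.INVALID, "BusWr"): (State.INVALID, None),
    (State.VALID,   "PrRd"):  (State.VALID,   None),     # read hit: no bus traffic
    (State.VALID,   "PrWr"):  (State.VALID,   "BusWr"),  # write-through to memory
    (State.VALID,   "BusRd"): (State.VALID,   None),
    (State.VALID,   "BusWr"): (State.INVALID, None),     # another cache wrote: invalidate
}

def step(state, event):
    """Advance one cache block's state machine by one event."""
    return TRANSITIONS[(state, event)]

def access(caches, who, event):
    """Apply a processor event at cache `who`; all other caches snoop the bus."""
    caches[who], bus = step(caches[who], event)
    if bus is not None:
        for i in range(len(caches)):
            if i != who:
                caches[i], _ = step(caches[i], bus)
    return bus

# The state of the same block in two caches.
caches = [State.INVALID, State.INVALID]
print(access(caches, 0, "PrRd"), caches)  # P0 reads: its copy becomes VALID
print(access(caches, 1, "PrWr"), caches)  # P1 writes: P0's copy is invalidated
```

In this sketch the bus serializes the writes, and the snooped BusWr is what propagates a write to the other caches by invalidating their stale copies.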