MOESI protocol
- Some SMPs implement MOESI today, e.g., the AMD Athlon MP and IBM servers
- Why is the O state needed?
- O state is very similar to E state with four differences: 1. If a cache line is in O state in some cache, that cache is responsible for sourcing the line to the next requester; 2. The memory may not have the most up-to-date copy of the line (this implies 1); 3. Eviction of a line in O state generates a BusWB; 4. Write to a line in O state must generate a bus transaction
- In MESI, when a line transitions from M to S, it must be written back to memory
- For a migratory sharing pattern (frequent in database workloads) this leads to a series of writebacks to memory
- These writebacks just keep the memory banks busy and consume memory bandwidth
- Take the following example
- P0 reads x, P0 writes x, P1 reads x, P1 writes x, P2 reads x, P2 writes x, …
- Thus, on every BusRd response the line is written back to memory: one writeback per processor handover
- The O state aims to eliminate all these writebacks by transitioning from M to O instead of M to S on a BusRd/Flush
- Subsequent BusRd requests are answered by the owner holding the line in O state
- The line is written back only when the owner evicts it: one single writeback
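The writeback counts for the migratory example above can be checked with a small simulation. This is only a sketch under simplifying assumptions: a single line x, atomic bus transactions, and every cache evicting the line at the end. The function name `count_writebacks` and its parameters are made up for illustration.

```python
def count_writebacks(num_procs, use_o_state):
    """Count memory writebacks for the pattern
    P0 reads x, P0 writes x, P1 reads x, P1 writes x, ...
    followed by every cache evicting x (an assumption of this sketch)."""
    state = ["I"] * num_procs  # state of line x in each cache
    writebacks = 0

    def bus_rd(requester):
        nonlocal writebacks
        others_have_copy = False
        for q in range(num_procs):
            if q == requester or state[q] == "I":
                continue
            others_have_copy = True
            if state[q] == "M":
                if use_o_state:
                    state[q] = "O"   # M to O: flush to requester, no memory writeback
                else:
                    state[q] = "S"   # M to S: memory must be updated
                    writebacks += 1
            elif state[q] == "E":
                state[q] = "S"
        state[requester] = "S" if others_have_copy else "E"

    def pr_wr(requester):
        # E to M is silent; S to M uses BusUpgr and invalidates all other copies.
        # An invalidated O copy needs no writeback here because the writer
        # already holds the same data and becomes the new dirty holder.
        for q in range(num_procs):
            if q != requester:
                state[q] = "I"
        state[requester] = "M"

    def evict(p):
        nonlocal writebacks
        if state[p] in ("M", "O"):   # dirty line: BusWB
            writebacks += 1
        state[p] = "I"

    for p in range(num_procs):
        bus_rd(p)
        pr_wr(p)
    for p in range(num_procs):
        evict(p)
    return writebacks


mesi_total = count_writebacks(4, use_o_state=False)
moesi_total = count_writebacks(4, use_o_state=True)
print(mesi_total, moesi_total)
```

With four processors the MESI run performs one writeback at each of the three handovers plus one at the final eviction, while the run with the O state performs only the final one.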
- State transitions pertaining to O state
- I to O: not possible (or maybe; see below)
- E to O or S to O: not possible
- M to O: on a BusRd/Flush (but no memory writeback)
- O to I: on CacheEvict/BusWB or {BusRdX,BusUpgr}/Flush
- O to S: not possible (or maybe; see below)
- O to E: not possible (or maybe, if silent eviction is not allowed)
- O to M: on PrWr/BusUpgr
- At most one cache can have a line in O state at any point in time
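The O-state transitions listed above can be written down as a lookup table, together with a check of the single-owner invariant. This is a sketch: the names `O_TRANSITIONS`, `o_transition`, and `at_most_one_owner` are hypothetical, and the `("O", "BusRd")` entry assumes the design in which the owner keeps ownership while sourcing the line.

```python
# (current state, observed event) -> (next state, generated action)
O_TRANSITIONS = {
    ("M", "BusRd"): ("O", "Flush"),       # no memory writeback
    ("O", "CacheEvict"): ("I", "BusWB"),  # the only point where memory is updated
    ("O", "BusRdX"): ("I", "Flush"),
    ("O", "BusUpgr"): ("I", "Flush"),
    ("O", "PrWr"): ("M", "BusUpgr"),
    # Assumption: the owner stays in O when it answers a later BusRd.
    ("O", "BusRd"): ("O", "Flush"),
}


def o_transition(state, event):
    """Return (next_state, action) for an O-related transition, or None
    if the pair is not covered by this table."""
    return O_TRANSITIONS.get((state, event))


def at_most_one_owner(states):
    """Check the invariant that at most one cache holds the line in O."""
    return sum(s == "O" for s in states) <= 1


print(o_transition("M", "BusRd"))
print(at_most_one_owner(["O", "S", "S", "I"]))
```

The table deliberately covers only transitions that involve O; the remaining MESI transitions are unchanged and are left out.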
- Two main design choices for MOESI
- Consider the example P0 reads x, P0 writes x, P1 reads x, P2 reads x, P3 reads x, …
- When P1 launches BusRd, P0 sources the line and now the protocol has two options: 1. The line in P0 goes to O and the line in P1 is filled in state S; 2. The line in P0 goes to S and the line in P1 is filled in state O i.e. P1 inherits ownership from P0
- For bus-based SMPs the two choices will yield roughly the same performance
- For DSM multiprocessors we will revisit this issue if time permits
- According to the second choice, when P2 generates a BusRd request, P1 sources the line and transitions from O to S; P2 becomes the new owner
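The two design choices can be compared by replaying the read sequence and recording which cache sources each BusRd. This is a minimal sketch; `serve_reads` and `ownership_migrates` are illustrative names, and the starting point assumes P0 already holds x in M after its read and write.

```python
def serve_reads(num_readers, ownership_migrates):
    """P0 holds x in M; then P1..Pn each issue a BusRd for x.
    Returns (final state per processor, supplier of each BusRd)."""
    states = {0: "M"}
    owner = 0          # cache responsible for sourcing the line
    suppliers = []
    for p in range(1, num_readers + 1):
        suppliers.append(owner)
        if ownership_migrates:
            # Choice 2: the old holder drops to S, the requester fills in O
            states[owner] = "S"
            states[p] = "O"
            owner = p
        else:
            # Choice 1: the old holder becomes (or stays) O, the requester fills in S
            states[owner] = "O"
            states[p] = "S"
    return states, suppliers


print(serve_reads(3, ownership_migrates=False))
print(serve_reads(3, ownership_migrates=True))
```

Under the first choice P0 sources every request; under the second the most recent reader does. In both cases exactly one cache ends up in O.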
- Some SMPs do not support the E state
- In many cases it is not helpful and only complicates the protocol
- MOSI allows a compact state encoding in 2 bits
- Sun WildFire uses the MOSI protocol
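The encoding argument is just a count of states: four states fit in 2 bits, five need 3. The sketch below shows this; the particular bit patterns in `MOSI` are an arbitrary assumption for illustration, not the encoding of any real machine, and `state_bits` is a hypothetical helper.

```python
from enum import IntEnum


class MOSI(IntEnum):
    # Hypothetical 2-bit encoding; real hardware may assign the bits differently
    I = 0b00
    S = 0b01
    O = 0b10
    M = 0b11


def state_bits(num_states):
    """Minimum number of bits needed to encode num_states distinct states."""
    return max(1, (num_states - 1).bit_length())


print(state_bits(len(MOSI)))  # MOSI: four states
print(state_bits(5))          # MOESI: five states
```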
Dragon protocol
- An update-based protocol for writeback caches
- Four states: two of them are the standard E and M
- Shared clean (Sc): The standard S state
- Shared modified (Sm): This is really the O state
- In fact, five states, because you always have I, i.e., not in cache
- So really a MOESI update-based protocol
- New bus transaction: BusUpd
- Used to update part of a cache line
- Distinguish between cache hits and misses:
- PrRd and PrWr are hits, PrRdMiss and PrWrMiss are misses
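The hit/miss distinction above can be sketched as a small classifier. Because I simply means "not in cache", a lookup that finds no entry is a miss. The cache is modelled here as a plain dictionary from address to Dragon state, which is an assumption of this sketch, as is the name `classify`.

```python
DRAGON_STATES = ("E", "Sc", "Sm", "M")  # I is implicit: the line is absent


def classify(cache, addr, is_write):
    """Map a processor access to one of the four Dragon processor events.
    `cache` maps address -> state for lines currently present."""
    hit = addr in cache
    if is_write:
        return "PrWr" if hit else "PrWrMiss"
    return "PrRd" if hit else "PrRdMiss"


cache = {"x": "Sc"}
print(classify(cache, "x", is_write=False))
print(classify(cache, "y", is_write=True))
```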