
MOESI protocol

Some SMPs implement MOESI today, e.g., the AMD Athlon MP and the IBM servers.

Why is the O state needed?

The O state is very similar to the E state, with four differences:
1. If a cache line is in O state in some cache, that cache is responsible for sourcing the line to the next requester;
2. The memory may not have the most up-to-date copy of the line (this implies 1);
3. Eviction of a line in O state generates a BusWB;
4. A write to a line in O state must generate a bus transaction.

When a line transitions from M to S, it must be written back to memory. For a migratory sharing pattern (frequent in database workloads) this leads to a series of writebacks to memory. These writebacks just keep the memory banks busy and consume memory bandwidth.

Take the following example: P0 reads x, P0 writes x, P1 reads x, P1 writes x, P2 reads x, P2 writes x, … Each BusRd finds the line in M state in the previous writer's cache, so at the time of each BusRd response the line is also written back to memory: one writeback per processor handover.

The O state aims at eliminating all these writebacks by transitioning from M to O instead of M to S on a BusRd/Flush. Subsequent BusRd requests are answered by the owner holding the line in O state. The line is written back only when the owner evicts it: one single writeback.

State transitions pertaining to O state

I to O: not possible (or maybe; see below)
E to O or S to O: not possible
M to O: on a BusRd/Flush (but no memory writeback)
O to I: on CacheEvict/BusWB or {BusRdX,BusUpgr}/Flush
O to S: not possible (or maybe; see below)
O to E: not possible (or maybe if silent eviction not allowed)
O to M: on PrWr/BusUpgr

At most one cache can have a line in O state at any point in time.

Two main design choices for MOESI

Consider the example: P0 reads x, P0 writes x, P1 reads x, P2 reads x, P3 reads x, … When P1 launches a BusRd, P0 sources the line, and the protocol now has two options:
1. The line in P0 goes to O and the line in P1 is filled in state S;
2. The line in P0 goes to S and the line in P1 is filled in state O, i.e., P1 inherits ownership from P0.

The second option is what makes the I to O and O to S transitions possible. Under this choice, when P2 generates a BusRd request, P1 sources the line and transitions from O to S; P2 becomes the new owner.

For bus-based SMPs the two choices will yield roughly the same performance. For DSM multiprocessors we will revisit this issue if time permits.

Some SMPs do not support the E state. In many cases it is not helpful and only complicates the protocol. MOSI has four states, which allows a compact state encoding in 2 bits. Sun WildFire uses the MOSI protocol.

Dragon protocol

An update-based protocol for writeback caches.

Four states:
Two of them are the standard E and M
Shared clean (Sc): the standard S state
Shared modified (Sm): this is really the O state

In fact there are five states, because you always have I, i.e., not in cache. So this is really a MOESI update-based protocol.

New bus transaction: BusUpd, used to update part of a cache line.

Processor events distinguish between cache hits and misses: PrRd and PrWr are hits, PrRdMiss and PrWrMiss are misses.
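Returning to the MOESI writeback argument earlier in this section, the following is a minimal Python sketch that counts memory writebacks for a single cache line under MESI and under MOESI. It is an illustration under stated assumptions, not a full protocol model: the function name simulate, the trace format, and the "e" (evict) operation are hypothetical; MOESI follows the first design choice (the old owner goes from M to O, the requester fills in S); and a BusRdX that hits an M or O line is treated as a cache-to-cache transfer with no memory writeback in both protocols.

```python
def simulate(trace, num_procs, use_o_state):
    """Return (final_states, memory_writebacks) for one cache line.

    trace: list of (processor, op), op in {"r", "w", "e"} for read,
    write, evict. use_o_state=False models MESI, True models MOESI.
    All names here are illustrative assumptions, not from the lecture.
    """
    state = ["I"] * num_procs
    writebacks = 0

    for p, op in trace:
        others = [q for q in range(num_procs) if q != p]

        if op == "r":
            if state[p] != "I":
                continue  # read hit: no bus transaction
            shared = False
            for q in others:  # read miss: BusRd is snooped by the others
                if state[q] == "M":
                    if use_o_state:
                        state[q] = "O"   # M to O: flush, memory not updated
                    else:
                        state[q] = "S"   # M to S: flush also updates memory
                        writebacks += 1
                    shared = True
                elif state[q] == "E":
                    state[q] = "S"
                    shared = True
                elif state[q] in ("O", "S"):
                    shared = True        # an O copy stays O and sources the line
            state[p] = "S" if shared else "E"

        elif op == "w":
            if state[p] == "M":
                continue  # write hit in M: no bus transaction
            if state[p] != "E":
                # BusUpgr (from S or O) or BusRdX (from I) invalidates the rest.
                # Assumption: no memory writeback is charged for this step.
                for q in others:
                    state[q] = "I"
            state[p] = "M"  # E to M is silent

        elif op == "e":
            if state[p] in ("M", "O"):
                writebacks += 1  # dirty or owned line: BusWB
            state[p] = "I"

    return state, writebacks


# Migratory sharing: each processor reads, then writes, then hands over.
migratory = [(0, "r"), (0, "w"), (1, "r"), (1, "w"), (2, "r"), (2, "w"), (2, "e")]
print("MESI writebacks: ", simulate(migratory, 3, use_o_state=False)[1])  # 3
print("MOESI writebacks:", simulate(migratory, 3, use_o_state=True)[1])   # 1
```

Under these assumptions MESI pays one writeback at each of the two handovers plus one at the final eviction, while MOESI pays only the eviction writeback. Modeling the second design choice (the requester inherits ownership) would only require changing the read-miss branch.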