OOO and SC
- Consider a simple example (all variables are zero initially)
P0: x=w+1; r=y+1;
P1: y=2; w=y+1;
- What went wrong?
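One way to explore the question is to enumerate every interleaving an SC machine could produce and compare it with a reordered execution. The sketch below does that for the example above. The out-of-order schedule is an assumption made for illustration: it supposes P0 executes r=y+1 before x=w+1, which a single processor may consider safe because the two statements are independent. The helper names (`interleavings`, `run`) are hypothetical.

```python
from itertools import combinations

def interleavings(p0, p1):
    """Yield every merge of p0 and p1 that preserves each program's own order."""
    n = len(p0) + len(p1)
    for slots in combinations(range(n), len(p0)):
        a, b = iter(p0), iter(p1)
        yield [next(a) if i in slots else next(b) for i in range(n)]

# Each statement of the example as a function on shared memory m.
def x_eq_w_plus_1(m): m["x"] = m["w"] + 1
def r_eq_y_plus_1(m): m["r"] = m["y"] + 1
def y_eq_2(m):        m["y"] = 2
def w_eq_y_plus_1(m): m["w"] = m["y"] + 1

def run(order):
    """Execute the statements in the given global order; return final (x, r)."""
    m = {"x": 0, "r": 0, "w": 0, "y": 0}
    for op in order:
        op(m)
    return (m["x"], m["r"])

P0 = [x_eq_w_plus_1, r_eq_y_plus_1]
P1 = [y_eq_2, w_eq_y_plus_1]

# All results an SC machine can produce.
sc_outcomes = {run(order) for order in interleavings(P0, P1)}

# Assumed out-of-order schedule: P0's second statement runs first.
ooo_outcome = run([r_eq_y_plus_1, y_eq_2, w_eq_y_plus_1, x_eq_w_plus_1])

print(sorted(sc_outcomes))         # [(1, 1), (1, 3), (4, 3)]
print(ooo_outcome)                 # (4, 1)
print(ooo_outcome in sc_outcomes)  # False
```

Under this assumed reordering, x=4 shows that P0 saw w=3 (so y=2 was already written), yet r=1 shows that P0 read y as 0. No SC interleaving gives both.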
SC example
- Consider the following example
P0: A=1; print B;
P1: B=1; print A;
- Possible outcomes for an SC machine
- (A, B) = (0,1); interleaving: B=1; print A; A=1; print B
- (A, B) = (1,0); interleaving: A=1; print B; B=1; print A
- (A, B) = (1,1); interleavings include: A=1; B=1; print A; print B and A=1; B=1; print B; print A
- (A, B) = (0,0) is impossible: printing A=0 requires the read of A to occur before the write of A, and printing B=0 requires the read of B to occur before the write of B, i.e. print A < A=1 and print B < B=1. Program order gives A=1 < print B and B=1 < print A. Chaining these, print B < B=1 < print A < A=1 < print B, which implies print B < print B, a contradiction
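The case analysis above can be checked by brute force. The following is a minimal sketch that enumerates all interleavings of the two programs that respect program order and collects the printed (A, B) pairs. The operation encoding and the helper names are assumptions made for illustration.

```python
from itertools import combinations

def interleavings(p0, p1):
    """Yield every merge of p0 and p1 that preserves each program's own order."""
    n = len(p0) + len(p1)
    for slots in combinations(range(n), len(p0)):
        a, b = iter(p0), iter(p1)
        yield [next(a) if i in slots else next(b) for i in range(n)]

def run(order):
    """Execute one interleaving; return the printed values as (A, B)."""
    mem = {"A": 0, "B": 0}
    printed = {}
    for kind, var in order:
        if kind == "write":
            mem[var] = 1
        else:  # "print": record the value seen at this point
            printed[var] = mem[var]
    return (printed["A"], printed["B"])

P0 = [("write", "A"), ("print", "B")]   # P0: A=1; print B;
P1 = [("write", "B"), ("print", "A")]   # P1: B=1; print A;

outcomes = {run(order) for order in interleavings(P0, P1)}
print(sorted(outcomes))  # [(0, 1), (1, 0), (1, 1)]
```

Six interleavings exist; four of them give (1,1), and none gives (0,0).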
Implementing SC
- Two basic requirements
- Memory operations issued by a processor must become visible to others in program order
- Need to make sure that all processors see the same total order of memory operations: in the previous example for the (0,1) case both P0 and P1 should see the same interleaving: B=1; print A; A=1; print B
- The tricky part is to make sure that writes become visible in the same order to all processors
- Write atomicity: each write must appear to take effect for all processors at once, as if it were a single atomic operation
- Otherwise, two processors may end up using different values (which may still be correct from the viewpoint of cache coherence, but will violate SC)
Write atomicity
- Example (A=0, B=0 initially)
P0: A=1;
P1: while (!A); B=1;
P2: while (!B); print A;
- A correct execution on an SC machine should print A=1
- A=0 will be printed only if the write to A is not yet visible to P2, even though it is clearly visible to P1, since P1 came out of its loop
- Thus A=0 is possible if P1 sees the order A=1 < B=1 and P2 sees the order B=1 < A=1 i.e. from the viewpoint of the whole system the write A=1 was not “atomic”
- Without write atomicity P2 may proceed to print 0 with a stale value from its cache
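The scenario can be replayed in a toy model. In this sketch each processor has its own view of memory, and a write reaches other processors through separate update messages. That model, the fixed schedule, and the names `run`, `view` and `in_flight` are assumptions made for illustration, not a description of any real coherence protocol.

```python
def run(atomic_writes):
    """Replay one fixed schedule of the P0/P1/P2 example; return what P2 prints."""
    # One private view of memory per processor (think: a per-processor cache).
    view = {p: {"A": 0, "B": 0} for p in ("P0", "P1", "P2")}
    in_flight = []  # updates that have been sent but not yet applied

    def deliver(var, value, targets):
        for p in targets:
            view[p][var] = value

    # P0: A = 1. The update reaches P1 immediately.
    deliver("A", 1, ["P0", "P1"])
    if atomic_writes:
        deliver("A", 1, ["P2"])             # visible to everyone at once
    else:
        in_flight.append(("A", 1, ["P2"]))  # update to P2 is delayed

    # P1: while (!A); B = 1;  P1 has seen A == 1, so it leaves the loop.
    assert view["P1"]["A"] == 1
    deliver("B", 1, ["P0", "P1", "P2"])

    # P2: while (!B); print A;  P2 has seen B == 1, so it leaves the loop.
    assert view["P2"]["B"] == 1
    printed = view["P2"]["A"]

    # The delayed update to P2 arrives too late to matter.
    for update in in_flight:
        deliver(*update)
    return printed

print(run(atomic_writes=True))   # 1
print(run(atomic_writes=False))  # 0
```

In the non-atomic run, P1 observes A=1 before B=1 while P2 observes B=1 before A=1, which is exactly the disagreement described above.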
Summary of SC
- Program order from each processor creates a partial order among memory operations
- Interleaving of these partial orders defines a total order
- Sequential consistency: the result of an execution must match the result of one of these many total orders
- A multiprocessor is said to be SC if every execution on this machine is SC compliant
- Sufficient but not necessary conditions for SC
- Issue memory operations in program order
- Every processor waits for a write to complete before issuing the next operation
- Every processor waits for a read to complete, and for the write that affects the returned value to complete, before issuing the next operation (important for write atomicity)