Why SDSM?
- Hardware DSM is hard to design
- Must have tightly integrated communication assist and NI
- The CA should probably be custom designed for performance
- Expensive in terms of time to market and the amount of custom design in the memory system
- But still want to retain shared memory programming
- Software DSM
- Provides shared virtual memory (SVM) layered over message passing
- Just take the commodity nodes, connect them over a commodity high-speed network, augment commodity OS with an SVM kernel, and port your shared memory programs to SVM
- Coherence granularity is a page
SVM for dummies
- Embed a coherence protocol in the page fault handler
- On a page fault, figure out if the page is mapped on some other node
- If yes, get a copy of the page and map it in local memory in some free page frame and return from interrupt
- If no, swap it in from disk and map it as usual
- If the page fault was generated by a load, set only read permission in the PTE; a subsequent write will generate another access fault, at which point you invalidate all other copies in the system
- Multiple nodes are allowed to have a virtual page mapped at different physical frames locally; thus the sharing really happens in the virtual address space and physical address space is private
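The fault-handler protocol above can be sketched as a toy simulation. This is a minimal sketch, not a real SVM kernel: the names `SVMSystem`, `Node`, and `Perm` are hypothetical, each page holds a single value instead of a page of bytes, and messages are modelled as direct method calls. Downgrading a remote writer to read-only when another node fetches the page is an assumption added so that later writes fault again; the slides do not spell this step out.

```python
from enum import Enum


class Perm(Enum):
    READ = 1
    WRITE = 2


class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        # Private page table: virtual page -> (permission, local copy of data).
        self.page_table = {}


class SVMSystem:
    """Toy model of page-granularity coherence (all names are hypothetical)."""

    def __init__(self, num_nodes):
        self.nodes = [Node(i) for i in range(num_nodes)]
        self.disk = {}  # backing store: virtual page -> data

    def _holders(self, vpage, exclude):
        return [n for n in self.nodes
                if n is not exclude and vpage in n.page_table]

    def read(self, node_id, vpage):
        node = self.nodes[node_id]
        if vpage not in node.page_table:          # load to an unmapped page
            self._fault(node, vpage, is_write=False)
        return node.page_table[vpage][1]

    def write(self, node_id, vpage, data):
        node = self.nodes[node_id]
        entry = node.page_table.get(vpage)
        if entry is None or entry[0] is not Perm.WRITE:
            self._fault(node, vpage, is_write=True)
        node.page_table[vpage] = (Perm.WRITE, data)

    def _fault(self, node, vpage, is_write):
        holders = self._holders(vpage, exclude=node)
        if vpage not in node.page_table:
            if holders:
                # Page is mapped on another node: fetch a copy from it.
                src = holders[0]
                data = src.page_table[vpage][1]
                # Assumption: downgrade the source so its next write faults.
                src.page_table[vpage] = (Perm.READ, data)
            else:
                # Not mapped anywhere: swap in from disk as usual.
                data = self.disk.get(vpage, 0)
            node.page_table[vpage] = (Perm.READ, data)
        if is_write:
            # Write fault: invalidate every other copy, then upgrade.
            for h in holders:
                del h.page_table[vpage]
            data = node.page_table[vpage][1]
            node.page_table[vpage] = (Perm.WRITE, data)
```

Each node keeps its own page table, which mirrors the point that sharing happens in the virtual address space while physical frames stay private to a node.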
SVM overheads
- Performance factors
- Every protocol invocation requires an interrupt and context switch
- Messages are sent through message passing libraries as opposed to specialized NI
- The entire protocol runs in software; there is no hardware support
- Even remote requests interrupt local processes and pollute local caches due to protocol processing
- The granularity of coherence is too big; causes unnecessary communication and false sharing
- This last point was the major problem when such systems took off; attempts to limit false sharing and communication volume led to numerous innovations in SDSM coherence protocols
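The cost of a large coherence granularity can be illustrated with a small counting sketch. It assumes a 4096-byte page and, for comparison, a 64-byte block (roughly a hardware cache line); both sizes and the function name `count_ownership_transfers` are illustrative assumptions, not values from the slides. Two nodes write to distinct variables that happen to lie on the same page, which is the false sharing case.

```python
def count_ownership_transfers(trace, block_size):
    """Count how often a coherence block changes writer.

    trace: list of (node_id, byte_address) write events.
    Each change of writer stands for one round of invalidation
    plus a block transfer in an invalidation-based protocol.
    """
    owner = {}       # block number -> node that last wrote it
    transfers = 0
    for node, addr in trace:
        block = addr // block_size
        prev = owner.get(block)
        if prev is not None and prev != node:
            transfers += 1
        owner[block] = node
    return transfers


# Node 0 writes address 0, node 1 writes address 2048, alternating.
# The two variables are never truly shared.
trace = [(i % 2, 0 if i % 2 == 0 else 2048) for i in range(8)]

page_transfers = count_ownership_transfers(trace, block_size=4096)
line_transfers = count_ownership_transfers(trace, block_size=64)
```

With page granularity the block ping-pongs on every write after the first (7 transfers for 8 writes), while at 64-byte granularity the two variables fall in different blocks and no transfer occurs.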
Use of RC
- A good place to make use of relaxed models
- With SC there is no choice but to invalidate all sharers and wait for all acknowledgments on every write to a page; the invalidated readers may immediately bring the page back, and performance degrades sharply
- SDSM systems invariably advertise RC or WO or some other relaxed model, but not SC
- Under WO, since all accesses between synchronization points can be re-ordered arbitrarily, the writer can hold back all write notices (i.e. invalidations) until the next synchronization point
- For RC this needs to be done only at release boundaries
- Note how different the use of RC is from hardware DSM; there RC is used to hide write latency and invalidations are sent immediately; here RC is used to limit communication (close to delayed consistency)
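The saving from holding back write notices can be sketched by counting invalidation messages. This is a simplified model under stated assumptions: the function name `invalidations_sent` and the event format are hypothetical, acknowledgments are not modelled, and in the eager case every invalidated reader is assumed to re-fetch the page before the next write (the worst case described above).

```python
def invalidations_sent(events, sharers, delayed):
    """Count invalidation messages sent by a single writer.

    events:  sequence of ("write", page) or ("release",) tuples.
    sharers: dict mapping page -> set of other nodes holding a copy.
    delayed: if True, buffer write notices until a release (RC style);
             if False, invalidate on every write (SC style).
    """
    sent = 0
    pending = set()   # pages dirtied since the last release
    for ev in events:
        if ev[0] == "write":
            page = ev[1]
            if delayed:
                pending.add(page)
            else:
                sent += len(sharers.get(page, ()))
        elif ev[0] == "release":
            # One round of write notices per dirty page.
            for page in pending:
                sent += len(sharers.get(page, ()))
            pending.clear()
    return sent


events = [("write", 0)] * 4 + [("write", 1)] * 2 + [("release",)]
sharers = {0: {1, 2}, 1: {2}}

eager = invalidations_sent(events, sharers, delayed=False)
lazy = invalidations_sent(events, sharers, delayed=True)
```

In this trace the eager policy sends 10 invalidations (4 writes to a page with 2 sharers plus 2 writes to a page with 1 sharer), while the delayed policy sends 3 at the release, one per sharer of each dirty page.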