
Why SDSM?
- Hardware DSM is hard to design
  - Must have a tightly integrated communication assist (CA) and network interface (NI)
  - The CA should probably be custom designed for performance
- Expensive in terms of time to market and the amount of custom design in the memory system
- But we still want to retain shared memory programming
- Software DSM
  - Provides shared virtual memory (SVM) on top of message passing
  - Just take commodity nodes, connect them over a commodity high-speed network, augment the commodity OS with an SVM kernel, and port your shared memory programs to SVM
  - Coherence granularity is a page

SVM for dummies
- Embed a coherence protocol in the page fault handler
- On a page fault, figure out if the page is mapped on some other node
  - If yes, get a copy of the page, map it in local memory in some free page frame, and return from the interrupt
  - If no, swap it in from disk and map it as usual
- If the page fault was generated by a load, set only read permission in the PTE; a subsequent write will generate another access fault, and at that point you invalidate all other copies in the system
- Multiple nodes are allowed to have a virtual page mapped at different physical frames locally; thus the sharing really happens in the virtual address space, and the physical address space is private

SVM overheads
- Performance factors
  - Every protocol invocation requires an interrupt and a context switch
  - Messages are sent through message passing libraries as opposed to a specialized NI
  - The entire protocol runs in software; there is no hardware support
  - Even remote requests interrupt local processes and pollute local caches due to protocol processing
  - The granularity of coherence is too big; it causes unnecessary communication and false sharing
- This last point was the major problem when such systems took off; attempts to limit false sharing and communication volume led to numerous innovations in SDSM coherence protocols

Use of RC
- SDSM is a good place to make use of relaxed models
- With SC there is no other choice but to invalidate all sharers and wait for all acknowledgments on every write to a page; the invalidated readers may immediately proceed to bring the page back, and performance will degrade sharply
- SDSM systems invariably advertise RC or WO or some other relaxed model, but not SC
- Under WO, since all accesses between synchronization points can be re-ordered arbitrarily, the writer can hold back all write notices (i.e. invalidations) until the next synchronization point
- For RC this needs to be done only at release boundaries
- Note how different the use of RC is from hardware DSM: there RC is used to hide write latency and invalidations are sent immediately; here RC is used to limit communication (close to delayed consistency)
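
The contrast between invalidating on every write (SC) and holding write notices until a release (RC) can be illustrated with a toy model of the page fault handler. This is only a sketch under stated assumptions, not the protocol of any real SDSM system: the names `Node`, `SVM`, and `run` are hypothetical, a remote page fetch and an invalidation are each assumed to cost two messages (request plus reply or acknowledgment), and diff merging for multiple concurrent writers is not modelled.

```python
from collections import defaultdict


class Node:
    def __init__(self, nid):
        self.nid = nid
        self.perm = {}      # page -> "R" or "RW"; a missing key means unmapped
        self.dirty = set()  # pages written since the last release (RC only)


class SVM:
    """Toy page-granularity coherence model that only counts messages."""

    def __init__(self, n_nodes, model):
        assert model in ("SC", "RC")
        self.model = model
        self.nodes = [Node(i) for i in range(n_nodes)]
        self.copies = defaultdict(set)  # page -> ids of nodes holding a copy
        self.messages = 0

    def _fetch(self, node, page):
        holders = self.copies[page] - {node.nid}
        if holders:
            self.messages += 2  # assumed cost: page request + page reply
            if self.model == "SC":
                # Under SC a writer loses write permission once the page is shared.
                for h in holders:
                    if self.nodes[h].perm.get(page) == "RW":
                        self.nodes[h].perm[page] = "R"
        # Otherwise the page is swapped in from disk: no network messages.
        self.copies[page].add(node.nid)

    def _invalidate_others(self, node, page):
        for other in list(self.copies[page] - {node.nid}):
            self.nodes[other].perm.pop(page, None)
            # Simplification: pending writes of the other node are dropped.
            self.nodes[other].dirty.discard(page)
            self.copies[page].discard(other)
            self.messages += 2  # assumed cost: invalidation + acknowledgment

    def read(self, nid, page):
        node = self.nodes[nid]
        if page not in node.perm:       # read fault
            self._fetch(node, page)
            node.perm[page] = "R"       # read-only mapping in the PTE

    def write(self, nid, page):
        node = self.nodes[nid]
        if node.perm.get(page) == "RW":
            return                      # already writable: no fault
        if page not in node.perm:       # write fault on an unmapped page
            self._fetch(node, page)
        node.perm[page] = "RW"
        if self.model == "SC":
            self._invalidate_others(node, page)  # eager invalidation
        else:
            node.dirty.add(page)        # hold the write notice back

    def release(self, nid):
        node = self.nodes[nid]
        for page in sorted(node.dirty):
            self._invalidate_others(node, page)  # send held-back notices
            if page in node.perm:
                node.perm[page] = "R"   # next write faults and is tracked again
        node.dirty.clear()


def run(model, rounds=10):
    """Node 0 writes page 0 and node 1 reads it, `rounds` times, then node 0 releases."""
    svm = SVM(2, model)
    svm.read(1, 0)                      # node 1 maps the page first (from disk)
    for _ in range(rounds):
        svm.write(0, 0)
        svm.read(1, 0)
    svm.release(0)
    return svm.messages


if __name__ == "__main__":
    print("SC messages:", run("SC"))
    print("RC messages:", run("RC"))
```

With these assumed costs, the SC run pays for an invalidation and a re-fetch in every round, while the RC run pays for one initial fetch and one invalidation at the release. The reader in the RC run may see a stale page between synchronization points, which is exactly what the relaxed model permits.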