Page migration
- Page migration changes the existing VA to PA mapping of the migrated page
- Requires notifying all TLBs caching the old mapping
- Introduces a TLB coherence problem
- Origin 2000 uses a smart page migration algorithm that lets the page copy and the TLB shootdown proceed in parallel
- An array of 64 page reference counters per directory entry is used to decide whether to migrate a page: the requester’s counter is compared against the home’s, and an interrupt is sent to the home if migration is required
- What does the interrupt handler do?
- Access the directory entries of all the cache lines belonging to the to-be-migrated page
- Send invalidations to sharers or interventions to owners; at the end all cache lines of that page must be in memory
- Set the poison bits in the directory entries of all the cache lines of the page
- Start a block transfer of the page from home to requester at this point (30 μs to copy 16 KB)
- An access to a poisoned cache line from a node results in a bus error which invalidates the TLB entry for that page in the requesting node (avoids broadcast shootdown)
- Until the page is completely migrated and has been assigned a physical page frame on the target node, all nodes accessing a poisoned line wait in a pending queue
- After the page copy is completed, the waiting nodes are served one by one; however, the directory entries and the page itself are moved to a “poisoned list” and are not yet freed at the home (i.e., that physical page frame still cannot be reused)
- On every scheduler tick the kernel invalidates one TLB entry per processor
- After a time equal to the number of TLB entries per processor multiplied by the scheduling quantum, every TLB entry that could hold the stale mapping has been invalidated; the page frame is then marked free and removed from the poisoned list
- Major advantage: requesting nodes see only the page copy latency (including invalidations and interventions) on the critical path, not the TLB shootdown latency
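The migration steps above can be sketched as a small software model. This is only an illustrative toy, not the Origin 2000 implementation: the names `Frame`, `MigrationModel`, and `should_migrate`, the threshold-based comparison, and the values of 64 TLB entries and a 10 ms quantum used in the usage note are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    node: int               # node whose memory holds this physical frame
    poisoned: bool = False  # stands in for the poison bits of every line in the frame

def should_migrate(counters, home, requester, threshold):
    # Assumed policy: migrate when the requester's reference count
    # leads the home's by more than `threshold`.
    return counters[requester] - counters[home] > threshold

class MigrationModel:
    def __init__(self, tlb_entries, quantum_ms):
        self.page_table = {}     # vpage -> current Frame
        self.tlbs = {}           # node id -> {vpage: Frame} cached translations
        self.pending = []        # nodes stalled on a poisoned line, in arrival order
        self.poisoned_list = []  # (old Frame, time in ms at which it may be freed)
        # One TLB entry is invalidated per tick, so a full sweep takes this long.
        self.free_delay_ms = tlb_entries * quantum_ms

    def start_migration(self, vpage):
        # Interrupt handler at home, after invalidations and interventions.
        self.page_table[vpage].poisoned = True

    def access(self, node, vpage):
        tlb = self.tlbs.setdefault(node, {})
        frame = tlb.get(vpage, self.page_table[vpage])
        if frame.poisoned:
            tlb.pop(vpage, None)            # bus error: drop only this node's entry
            if self.page_table[vpage].poisoned:
                self.pending.append(node)   # copy still in flight: wait
                return "wait"
            frame = self.page_table[vpage]  # copy done: refill from the page table
        tlb[vpage] = frame
        return frame.node

    def finish_migration(self, vpage, target, now_ms):
        old = self.page_table[vpage]        # old frame stays poisoned
        self.page_table[vpage] = Frame(node=target)
        self.poisoned_list.append((old, now_ms + self.free_delay_ms))
        served, self.pending = self.pending, []
        return served                       # waiters are served in arrival order

    def reclaim(self, now_ms):
        # Frames whose delay has elapsed leave the poisoned list and become free.
        ready = [f for f, t in self.poisoned_list if t <= now_ms]
        self.poisoned_list = [(f, t) for f, t in self.poisoned_list if t > now_ms]
        return ready
```

With the assumed 64 TLB entries and 10 ms quantum, `MigrationModel(64, 10)` keeps the old frame on the poisoned list for 640 ms after the copy finishes. A node that still holds a stale translation during that window touches the poisoned frame, drops its own TLB entry, and refills from the page table, so no broadcast shootdown appears in the model.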
Queue lock in hardware
- Stanford DASH
- Memory controller recognizes lock accesses
- Requires changes in compiler and instruction set
- Marks the directory entry with contenders
- On unlock, a contender is chosen and the lock is granted to that node
- Unlock is forced to generate a notification message to home
- Possibly requires special cache state for lock variables or special uncached instructions for unlock if lock variables are not allowed to be cached
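The directory-side bookkeeping for a hardware queue lock can be sketched as follows. This is a toy software model under stated assumptions, not the DASH hardware: the class `QueueLockDirectory` and its methods are hypothetical names, and choosing the next holder in FIFO order is an assumption, since the notes only say that "a contender is chosen".

```python
from collections import deque

class QueueLockDirectory:
    """Toy model of the home directory entry for one lock variable."""

    def __init__(self):
        self.holder = None         # node currently holding the lock, if any
        self.contenders = deque()  # nodes marked as waiting (FIFO is an assumption)

    def lock(self, node):
        """Lock access recognized by the memory controller at the home."""
        if self.holder is None:
            self.holder = node
            return True            # lock granted immediately
        if node != self.holder and node not in self.contenders:
            self.contenders.append(node)  # mark the directory entry with this contender
        return False               # node waits for a grant

    def unlock(self, node):
        """Unlock always notifies the home, which picks the next holder."""
        if node != self.holder:
            raise ValueError("unlock from a node that does not hold the lock")
        self.holder = self.contenders.popleft() if self.contenders else None
        return self.holder         # node that receives the grant, or None
```

For example, if node 0 holds the lock and nodes 1 and 2 request it in that order, `unlock(0)` returns 1 and a later `unlock(1)` returns 2. Waiters in this model do not re-request the lock; they wait for the grant from the home.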