Directory overhead
- Need 6 bits to maintain the head node id
- NUMA-Q scales up to 64 nodes
- Need 2 bits for encoding three states: HOME, FRESH, GONE
- A system with P nodes, M bytes of memory per node, and a cache block size of B bytes has M/B cache blocks per node
- 2 + log2(P) bits needed per directory entry, one entry per cache block
- Total overhead = (M/B)*(log2(P) + O(1))*P bits
- O(P*log(P)) for a fixed M/B
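
The directory overhead formula above can be checked with a few lines of arithmetic. This is a minimal sketch: the function names are hypothetical, and the 1 GiB of memory per node and 64-byte block size are illustrative assumptions, not figures taken from these notes.

```python
def directory_bits_per_entry(num_nodes: int, state_bits: int = 2) -> int:
    """Bits per directory entry: state bits plus a head-node pointer."""
    # (P - 1).bit_length() is ceil(log2(P)) computed with integers only.
    pointer_bits = (num_nodes - 1).bit_length()
    return state_bits + pointer_bits


def total_directory_bits(num_nodes: int, mem_bytes_per_node: int, block_bytes: int) -> int:
    """Total overhead = (M/B) * (log2(P) + 2) * P, in bits."""
    blocks_per_node = mem_bytes_per_node // block_bytes
    return blocks_per_node * directory_bits_per_entry(num_nodes) * num_nodes


# 64 nodes: 6 pointer bits + 2 state bits per entry.
bits_per_entry = directory_bits_per_entry(64)

# Assumed sizes for illustration only: 1 GiB per node, 64-byte blocks.
total_bits = total_directory_bits(64, 1 << 30, 64)
print(bits_per_entry, total_bits)
```

Under these assumed sizes each 64-byte (512-bit) block carries 8 bits of directory state, so the relative overhead per block stays small even though the total grows as P*log(P).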
Cache overhead
- Extended RAC tags for storing upstream and downstream pointers
- 2*log(P) bits per cache block
- Total increased tag DRAM area is O(P*log(P))
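
The same kind of estimate works for the extended RAC tags. A rough sketch, with hypothetical function names; the 32 MiB RAC size and 64-byte block size are assumptions chosen for illustration.

```python
def rac_tag_extension_bits(num_nodes: int) -> int:
    """Extra tag bits per RAC block: one upstream and one downstream pointer."""
    pointer_bits = (num_nodes - 1).bit_length()  # ceil(log2(P))
    return 2 * pointer_bits


def total_rac_extension_bits(num_nodes: int, rac_bytes_per_node: int, block_bytes: int) -> int:
    """Extra tag bits summed over the RACs of all nodes."""
    blocks_per_rac = rac_bytes_per_node // block_bytes
    return blocks_per_rac * rac_tag_extension_bits(num_nodes) * num_nodes


# 64 nodes: two 6-bit pointers per block.
extra_bits_per_block = rac_tag_extension_bits(64)

# Assumed sizes for illustration only: 32 MiB RAC per node, 64-byte blocks.
total_extra_bits = total_rac_extension_bits(64, 32 << 20, 64)
print(extra_bits_per_block, total_extra_bits)
```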
Handling read miss
- On missing in both the RAC and the quad snoop, the requester sends a read request to home
- Requester allocates a block in the RAC and marks its state PENDING
- CASE A: directory state is HOME
- Change directory state to FRESH
- Change head pointer to requester id
- Send reply to requester
- Requester fills cache block in RAC, forwards it to requesting processor, changes RAC block state to ONLY_FRESH
- CASE B: directory state is FRESH
- Home changes head pointer to requester id
- Sends reply with data read from memory and the old head node id
- Requester sends a request to the previous head expressing intention to become the new head
- Old head changes its upstream pointer to point to the requester and the RAC state to MID_VALID or TAIL_VALID; sends an acknowledgment to requester
- Requester changes its downstream pointer to old head and upstream pointer to home; also changes RAC line state to HEAD_FRESH
- Observe the strict request-reply nature of the protocol
- CASE C: directory state is GONE
- Means head node has an exclusive copy of the cache line
- Home replies to the requester with the head node id, but does not change the state of the directory
- Requester sets RAC line state to PENDING and sends a data request to the head node
- Old head changes RAC line state to TAIL_VALID, sets its upstream pointer to the requester, and sends data to requester
- Requester sets RAC line state to HEAD_DIRTY, sets its upstream pointer to home and downstream pointer to old head
- Note that directory remains in GONE state and memory is not updated (similar to an M to O transition)
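
The three read-miss cases can be sketched as a small state model. This is a simplified illustration, not the NUMA-Q implementation: messages are modeled as plain function calls, so no races occur; `Directory`, `RacLine`, `HOME_NODE`, `demote_old_head` and `read_miss` are hypothetical names. Two points are assumptions beyond the notes: in case C the home is assumed to update its head pointer (the notes only say the directory state is unchanged), and the old head is assumed to become MID_VALID rather than TAIL_VALID if it already has a downstream neighbor.

```python
from dataclasses import dataclass
from typing import Dict, Optional

HOME_NODE = "home"  # hypothetical marker for "upstream pointer refers to home"


@dataclass
class Directory:
    state: str = "HOME"          # HOME, FRESH, or GONE
    head: Optional[int] = None   # id of the current head node


@dataclass
class RacLine:
    state: str = "INVALID"
    upstream: Optional[object] = None
    downstream: Optional[object] = None


def demote_old_head(old: RacLine, new_head: int) -> None:
    """Old head points upstream at the new head and stops being the head."""
    old.upstream = new_head
    # Assumption: TAIL_VALID if nothing is downstream, otherwise MID_VALID.
    old.state = "TAIL_VALID" if old.downstream is None else "MID_VALID"


def read_miss(directory: Directory, rac: Dict[int, RacLine], requester: int) -> str:
    """Apply one read miss and return the requester's final RAC line state."""
    line = rac.setdefault(requester, RacLine())
    line.state = "PENDING"
    if directory.state == "HOME":        # CASE A
        directory.state = "FRESH"
        directory.head = requester
        line.state = "ONLY_FRESH"
    elif directory.state == "FRESH":     # CASE B
        old_head = directory.head
        directory.head = requester
        demote_old_head(rac[old_head], requester)
        line.upstream, line.downstream = HOME_NODE, old_head
        line.state = "HEAD_FRESH"
    elif directory.state == "GONE":      # CASE C: directory stays GONE
        old_head = directory.head
        directory.head = requester       # assumption, see lead-in
        demote_old_head(rac[old_head], requester)
        line.upstream, line.downstream = HOME_NODE, old_head
        line.state = "HEAD_DIRTY"
    else:
        raise ValueError(f"unknown directory state: {directory.state}")
    return line.state


directory = Directory()
rac: Dict[int, RacLine] = {}
for node in (1, 2, 3):
    print(node, read_miss(directory, rac, node))
```

After the three misses in the demo, node 3 is the head, node 2 sits in the middle of the sharing list, and node 1 is the tail, which mirrors how each new reader is prepended to the list.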
- Handling races
- Suppose that when the requester's (say B) message reaches the old head (say A), A's RAC line is in PENDING state
- SCI has no pending state in the directory and largely avoids NACKs (it does use them, but only a small number)
- B does become the new head (it has to, because the home has already updated the directory), but inherits the PENDING state from A
- Any subsequent requester will be directed to B and will become the new pending head
- Ultimately the PENDING state is resolved along the chain, starting from A and moving upstream
- FIFO nature of the pending list guarantees fairness
- Also, no problem related to sizing the buffers for holding pending requests (no extra space needed)
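
The pending list described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `PendingChain` is a hypothetical class, each node is represented by a string label, and the single downstream pointer per node stands in for the pointer already held in that node's RAC tag, which is why no separate request buffer is needed.

```python
from typing import Dict, List, Optional


class PendingChain:
    """Distributed pending list: each node stores only one downstream pointer."""

    def __init__(self, first_pending: str) -> None:
        self.head = first_pending
        self.downstream: Dict[str, Optional[str]] = {first_pending: None}

    def add_requester(self, node: str) -> None:
        """A request reaching the pending head makes the requester the new pending head."""
        self.downstream[node] = self.head
        self.head = node

    def resolution_order(self) -> List[str]:
        """PENDING resolves from the oldest node and moves upstream toward the head."""
        order: List[str] = []
        node: Optional[str] = self.head
        while node is not None:
            order.append(node)
            node = self.downstream[node]
        order.reverse()
        return order


chain = PendingChain("old_head")
chain.add_requester("req1")
chain.add_requester("req2")
print(chain.head, chain.resolution_order())
```

The resolution order equals the arrival order, which is the FIFO property the notes rely on for fairness.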