Module 12: "Multiprocessors on a Snoopy Bus"
  Lecture 26: "Case Studies"
 

Multi-level caches

  • Split-transaction bus makes the design of multi-level caches a little more difficult
    • The usual design is to have queues between levels of caches in each direction
    • How do you size the queues? Between the processor and L1 one buffer is sufficient (assume one outstanding processor access); L1-to-L2 needs P+1 buffers (why?); L2-to-L1 needs P buffers (why?); L1-to-processor needs one buffer
    • With smaller buffers there is a possibility of deadlock: suppose the L1-to-L2 and L2-to-L1 queues have one entry each, with a request sitting in the L1-to-L2 queue and an intervention sitting in the L2-to-L1 queue; clearly L1 cannot pick up the intervention because it has no space to put the reply in the L1-to-L2 queue, while L2 cannot pick up the request because it might need space in the L2-to-L1 queue in case of an L2 hit (a short sketch after this list makes the stall concrete)
  • Formalizing the deadlock with dependence graph
    • There are four types of transactions in the cache hierarchy: 1. Processor requests (outbound requests), 2. Responses to processor requests (inbound responses), 3. Interventions (inbound requests), 4. Intervention responses (outbound responses)
    • Processor requests need space in L1-to-L2 queue; responses to processors need space in L2-to-L1 queue; interventions need space in L2-to-L1 queue; intervention responses need space in L1-to-L2 queue
    • Thus a message in the L1-to-L2 queue may need space in the L2-to-L1 queue (e.g., a processor request generating a response due to an L2 hit); also a message in the L2-to-L1 queue may need space in the L1-to-L2 queue (e.g., an intervention response)
    • This creates a cycle in queue space dependence graph
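
  The stall above can be made concrete with a short simulation. The following is a minimal sketch in Python, assuming single-entry queues and the conservative drain rules discussed later in this lecture; all names (l1_to_l2, l2_to_l1, the *_can_drain helpers) are illustrative, not part of any real controller.

    # Minimal sketch of the two-queue deadlock: one-entry queues in each
    # direction, one request and one intervention in flight.
    from collections import deque

    CAPACITY = 1
    l1_to_l2 = deque(["processor request"])  # outbound queue: full
    l2_to_l1 = deque(["intervention"])       # inbound queue: full

    def l2_can_drain():
        # L2 refuses to pop l1_to_l2 unless l2_to_l1 has space, because the
        # request might hit in L2 and generate a response for the processor.
        return len(l2_to_l1) < CAPACITY

    def l1_can_drain():
        # L1 refuses to pop l2_to_l1 unless l1_to_l2 has space, because the
        # intervention must generate a reply headed toward L2.
        return len(l1_to_l2) < CAPACITY

    # Neither controller can make progress: deadlock.
    print(l2_can_drain(), l1_can_drain())  # -> False False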

Dependence graph

  • Represent a queue by a vertex in the graph
    • Number of vertices = number of queues
  • A directed edge from vertex u to vertex v is present if a message at the head of queue u may generate another message which requires space in queue v
  • In our case we have two queues
    • L1-to-L2 and L2-to-L1; each may hold a message at its head that needs space in the other, so the graph has a cycle, i.e., it is not a DAG, hence deadlock is possible
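
  This check is mechanical. Below is a small sketch in Python, assuming the two-queue design above: the graph encodes the "may need space in" relation, and a standard depth-first search detects the cycle. The vertex names are illustrative.

    # Queue-space dependence graph for the single-queue-pair design.
    # An edge u -> v means: a message at the head of queue u may generate
    # a message that needs space in queue v.
    graph = {
        "L1-to-L2": ["L2-to-L1"],  # processor request may generate a response (L2 hit)
        "L2-to-L1": ["L1-to-L2"],  # intervention generates an intervention reply
    }

    def has_cycle(g):
        # Standard DFS; a vertex on the current recursion stack that is
        # reached again closes a cycle.
        state = {}  # vertex -> "on-stack" or "done"
        def dfs(u):
            state[u] = "on-stack"
            for v in g.get(u, ()):
                if state.get(v) == "on-stack" or (v not in state and dfs(v)):
                    return True
            state[u] = "done"
            return False
        return any(v not in state and dfs(v) for v in list(g))

    print(has_cycle(graph))  # -> True: not a DAG, so deadlock is possible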

 

Multi-level caches

  • In summary
    • The L2 cache controller refuses to drain the L1-to-L2 queue if there is no space in the L2-to-L1 queue; this is rather conservative because the message at the head of the L1-to-L2 queue may not need space in the L2-to-L1 queue, e.g., in case of an L2 miss or if it is an intervention reply; but after popping the head of the L1-to-L2 queue it is impossible to backtrack if the message does turn out to need space in the L2-to-L1 queue
    • Similarly, L1 cache controller refuses to drain L2-to-L1 queue if there is no space in L1-to-L2 queue
    • How do we break this cycle?
    • Observe that responses to processor requests are guaranteed not to generate any more messages, and intervention requests do not generate new requests but can only generate replies
  • Solving the queue deadlock
    • Introduce one more queue in each direction i.e. have a pair of queues in each direction
    • L1-to-L2 processor request queue and L1-to-L2 intervention response queue
    • Similarly, L2-to-L1 intervention request queue and L2-to-L1 processor response queue
    • Now the L2 cache controller can serve the L1-to-L2 processor request queue as long as there is space in the L2-to-L1 processor response queue; there is no constraint on the L1 cache controller for draining the L2-to-L1 processor response queue because responses are simply consumed by the processor and generate no further messages
    • Similarly, the L1 cache controller can serve the L2-to-L1 intervention request queue as long as there is space in the L1-to-L2 intervention response queue, and the L1-to-L2 intervention response queue will drain as soon as the bus is granted
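
  Running the same dependence-graph check on the split design confirms the fix. A minimal sketch under the same assumptions as the earlier snippets (the has_cycle DFS is repeated so this runs on its own; queue names are illustrative):

    # Dependence graph after splitting each direction into a request queue
    # and a response queue. Response queues are sinks: processor responses
    # are consumed by the processor, and intervention responses drain onto
    # the bus as soon as it is granted, so neither has outgoing edges.
    graph = {
        "L1-to-L2 proc request":  ["L2-to-L1 proc response"],  # L2 hit -> response
        "L2-to-L1 intervention":  ["L1-to-L2 intv response"],  # intervention -> reply
        "L2-to-L1 proc response": [],                          # sunk by the processor
        "L1-to-L2 intv response": [],                          # sunk by the bus
    }

    def has_cycle(g):  # same DFS as in the earlier sketch
        state = {}
        def dfs(u):
            state[u] = "on-stack"
            for v in g.get(u, ()):
                if state.get(v) == "on-stack" or (v not in state and dfs(v)):
                    return True
            state[u] = "done"
            return False
        return any(v not in state and dfs(v) for v in list(g))

    print(has_cycle(graph))  # -> False: the graph is a DAG, the queue deadlock is gone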