Preparing to issue
- Finally, during the second stage, every instruction is assigned an active list entry
- The active list is a 32-entry FIFO queue that tracks all in-flight instructions (at most 32) in program order
- Each entry records information about its instruction, such as the physical destination register number
- Also, each instruction is assigned to one of the three issue queues depending on its type
  - Integer queue: holds integer ALU instructions
  - Floating-point queue: holds FPU instructions
  - Address queue: holds the memory operations
- Therefore, stage 2 may stall if the processor runs out of active list entries, physical registers, or issue queue entries
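
The stall conditions above can be sketched as a small model. This is only an illustration, not the R10000's actual logic: the names `Machine` and `dispatch` are hypothetical, the free-register pool is a plain list, and the 16-entry size is assumed for all three queues (the notes give it only for the integer queue).

```python
from collections import deque
from dataclasses import dataclass, field

ACTIVE_LIST_SIZE = 32  # from the notes
QUEUE_SIZE = 16        # given for the integer queue; assumed for the other two


@dataclass
class Machine:
    free_regs: list  # pool of free physical register numbers
    active_list: deque = field(default_factory=deque)
    queues: dict = field(
        default_factory=lambda: {"int": [], "fp": [], "addr": []}
    )


def dispatch(m, instr):
    """Try to pass one instruction through stage 2.

    instr is a (kind, has_dest) pair, kind in {"int", "fp", "addr"}.
    Returns None on success, or a string naming the stall reason.
    """
    kind, has_dest = instr
    if len(m.active_list) >= ACTIVE_LIST_SIZE:
        return "active list full"
    if has_dest and not m.free_regs:
        return "no free physical register"
    if len(m.queues[kind]) >= QUEUE_SIZE:
        return f"{kind} queue full"

    # All resources are available: allocate them together.
    phys_dest = m.free_regs.pop() if has_dest else None
    entry = {"kind": kind, "phys_dest": phys_dest, "done": False}
    m.active_list.append(entry)   # FIFO order = program order
    m.queues[kind].append(entry)  # wait here until issued
    return None
```

For example, with only two free physical registers, a third instruction that needs a destination register stalls in stage 2 even though the active list and the integer queue still have room.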
Stage 3: Issue
- The selection logic of the three issue queues works in parallel
- The issue logic of the integer and FP queues is similar
- Integer issue logic
  - Integer queue contains 16 entries (can hold at most 16 instructions)
  - Search for ready-to-issue instructions among these 16
  - Issue at most two instructions to two ALUs
- Address queue
  - Slightly more complicated
  - When a load or a store is issued, its address is not yet known
  - To simplify matters, R10000 issues loads and stores in order (we have seen the problems associated with out-of-order load/store issue)
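
The two selection policies above can be contrasted in a short sketch. The function names and the dictionary layout of a queue entry are hypothetical, and picking the oldest ready instructions first is an assumption made for illustration; the notes do not say how the hardware prioritizes among ready entries.

```python
def select_integer(queue, ready_regs, width=2):
    """Out-of-order select: issue up to `width` ready instructions.

    queue is a list of entries, oldest first; each entry has a "srcs"
    list of physical source registers. Issued entries leave the queue.
    """
    issued = []
    for instr in list(queue):  # iterate over a copy while removing
        if len(issued) == width:
            break
        if all(src in ready_regs for src in instr["srcs"]):
            issued.append(instr)
            queue.remove(instr)
    return issued


def select_address(queue, ready_regs):
    """In-order select: only the oldest load/store may issue."""
    if queue and all(src in ready_regs for src in queue[0]["srcs"]):
        return queue.pop(0)
    return None  # head not ready: younger entries wait behind it
```

In the integer queue a ready instruction can bypass an older one that is still waiting, whereas in this model of the address queue a stalled head blocks every younger memory operation.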
Load-dependents
- Loads take two cycles to execute
  - During the first cycle the address is computed
  - During the second cycle the dTLB and data cache are accessed
- Ideally, I want to issue an instruction that depends on the load early enough that it picks up the load value from the bypass just in time
- Assume that a load issues in cycle 0, computes its address in cycle 1, and looks up the cache in cycle 2
- I want to issue the dependent in cycle 2 so that it can pick up the load value just before executing in cycle 3
- Thus the load looks up the cache in parallel with the issuing of the dependent, so the dependent is issued before it is known whether the load will hit in the cache
- This is called load hit speculation; the dependent is re-executed later if the load misses
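
The cycle arithmetic in this example can be written out as a small helper. It is a sketch of the timing described above only: the function name and the returned field names are hypothetical, and the cycle in which a mis-speculated dependent is eventually re-executed is left open because the notes do not give it.

```python
def load_dependent_timeline(load_issue_cycle=0, hit=True):
    """Cycle numbers for a load and its speculatively issued dependent."""
    addr_cycle = load_issue_cycle + 1   # load computes its address
    cache_cycle = load_issue_cycle + 2  # load accesses dTLB and data cache
    dep_issue = cache_cycle             # dependent issues during the lookup
    dep_execute = dep_issue + 1         # picks the value up from the bypass

    return {
        "load_issue": load_issue_cycle,
        "address_compute": addr_cycle,
        "cache_lookup": cache_cycle,
        "dependent_issue": dep_issue,
        # On a miss there is no value on the bypass, so this execution
        # slot is wasted and the dependent must be re-executed later.
        "dependent_execute": dep_execute if hit else None,
        "replay_needed": not hit,
    }
```

With the default arguments the dependent issues in cycle 2 and executes in cycle 3, matching the example; passing `hit=False` marks the dependent for replay instead.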