|
Long history
- Starting from long cycle/multi-cycle execution
- Big leap: pipelining
- Started with single issue
- Matured into multiple issue
- Next leap: speculative execution
- Out-of-order issue, in-order completion
- Today’s microprocessors feature
- Speculation at various levels during execution
- Deep pipelining
- Sophisticated branch prediction
- And many other performance-boosting hardware features
Single-threaded execution
- Goal of a microprocessor
- Given a sequential set of instructions, it should execute them correctly and as fast as possible
- Correctness is guaranteed as long as the external world sees the execution in order (i.e. sequentially)
- Within the processor it is okay to re-order the instructions as long as the changes to state are applied in order
- Performance equation
- Execution time = average CPI × number of instructions × cycle time
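The performance equation above can be tried out with a few numbers. The following is a minimal sketch; the function name `execution_time_ns` and all the values (CPI of 2.0, one million instructions, 1 ns cycle time) are made up for illustration and do not describe any real processor.

```python
def execution_time_ns(avg_cpi, instruction_count, cycle_time_ns):
    """Execution time = average CPI x number of instructions x cycle time."""
    return avg_cpi * instruction_count * cycle_time_ns


# Hypothetical baseline: CPI = 2.0, one million instructions, 1 ns cycle (1 GHz).
baseline_ns = execution_time_ns(2.0, 1_000_000, 1.0)

# Halving any one of the three terms halves the execution time.
lower_cpi_ns = execution_time_ns(1.0, 1_000_000, 1.0)
fewer_instr_ns = execution_time_ns(2.0, 500_000, 1.0)
faster_clock_ns = execution_time_ns(2.0, 1_000_000, 0.5)

print(baseline_ns, lower_cpi_ns, fewer_instr_ns, faster_clock_ns)
```

Since the three terms are multiplied, an improvement in one only helps if it does not make another term worse by the same factor.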
CPI equation: analysis
- To reduce the execution time we can try to lower one or more of the three terms
- Reducing average CPI (cycles per instruction):
- The starting point could be CPI=1
- But complex arithmetic operations, e.g. multiplication/division, take more than a cycle
- Memory operations take even longer
- So normally average CPI is larger than 1
- How to reduce CPI is the core of this lecture
- Reducing number of instructions
- Better compiler, smart instruction set architecture (ISA)
- Reducing cycle time: faster clock
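To see why average CPI usually ends up above 1, it can be computed as a weighted average over instruction classes. This is only a sketch: the instruction mix and the per-class cycle counts below are assumed values chosen for easy arithmetic, not measurements.

```python
# Hypothetical instruction mix for a run of 100 instructions:
# class name -> (number of instructions, cycles each one takes).
mix = {
    "simple ALU": (60, 1),   # the CPI = 1 starting point
    "mul/div":    (10, 4),   # complex arithmetic takes more than a cycle
    "memory":     (30, 3),   # memory operations take even longer
}

total_instructions = sum(count for count, _ in mix.values())
total_cycles = sum(count * cycles for count, cycles in mix.values())

# Average CPI = total cycles / total instructions.
average_cpi = total_cycles / total_instructions

print(total_cycles, total_instructions, average_cpi)
```

With this assumed mix, the memory operations contribute 90 of the 190 cycles even though they are only 30 of the 100 instructions, which is why reducing their cost matters so much for CPI.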
Life of an instruction
- Fetch from memory
- Decode/read (figure out the opcode, source and dest registers, read source registers)
- Execute (ALUs, address calculation for memory op)
- Memory access (for load/store)
- Writeback or commit (write result to destination reg)
- During execution the instruction may talk to
- Register file (for reading source operands and writing results)
- Cache hierarchy (for instruction fetch and for memory op)
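The five steps above can be sketched as a tiny interpreter that takes one instruction through its whole life before starting the next. Everything here is an illustrative assumption: the three-instruction toy ISA (`add`, `load`, `store`), the `Instr` fields, and the use of a plain list and dict in place of the register file and cache hierarchy.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    op: str        # "add", "load", or "store" (toy ISA, hypothetical)
    dest: int = 0  # destination register number
    src1: int = 0  # first source register number
    src2: int = 0  # second source register number
    imm: int = 0   # address offset for load/store


def run_one(pc, imem, regs, dmem):
    """Take the instruction at pc through all five steps; return the next pc."""
    # 1. Fetch from (instruction) memory.
    instr = imem[pc]

    # 2. Decode/read: the opcode and register numbers are already fields of
    #    instr, so decoding reduces to reading the source registers.
    a = regs[instr.src1]
    b = regs[instr.src2]

    # 3. Execute: ALU operation, or address calculation for a memory op.
    if instr.op == "add":
        alu_out = a + b
    elif instr.op in ("load", "store"):
        alu_out = a + instr.imm
    else:
        raise ValueError(f"unknown opcode: {instr.op}")

    # 4. Memory access (only for load/store).
    loaded = None
    if instr.op == "load":
        loaded = dmem.get(alu_out, 0)
    elif instr.op == "store":
        dmem[alu_out] = b

    # 5. Writeback: write the result to the destination register.
    if instr.op == "add":
        regs[instr.dest] = alu_out
    elif instr.op == "load":
        regs[instr.dest] = loaded

    return pc + 1


# Usage: r1 holds a base address; memory location 8 holds the value 5.
regs = [0, 8, 0, 0]
dmem = {8: 5}
imem = [
    Instr("load", dest=2, src1=1, imm=0),   # r2 = mem[r1 + 0]
    Instr("add", dest=3, src1=2, src2=2),   # r3 = r2 + r2
    Instr("store", src1=1, src2=3, imm=4),  # mem[r1 + 4] = r3
]

pc = 0
while pc < len(imem):
    pc = run_one(pc, imem, regs, dmem)

print(regs, dmem)
```

Because each instruction finishes all five steps before the next one is fetched, this corresponds to the long-cycle starting point rather than a pipelined design.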
|