
Long history
    Starting from long cycle/multi-cycle execution
    Big leap: pipelining
        Started with single issue
        Matured into multiple issue
    Next leap: speculative execution
        Out-of-order issue, in-order completion
    Today's microprocessors feature
        Speculation at various levels during execution
        Deep pipelining
        Sophisticated branch prediction
        And many more performance-boosting hardware features

Single-threaded execution
    Goal of a microprocessor
        Given a sequential set of instructions, it should execute them correctly and as fast as possible
        Correctness is guaranteed as long as the external world sees the execution in order (i.e. sequentially)
        Within the processor it is okay to re-order the instructions as long as the changes to state are applied in order
    Performance equation
        Execution time = average CPI × number of instructions × cycle time

CPI equation: analysis
    To reduce the execution time we can try to lower one or more of the three terms
    Reducing average CPI (cycles per instruction)
        The starting point could be CPI = 1
        But complex arithmetic operations, e.g. multiplication/division, take more than one cycle
        Memory operations take even longer
        So normally the average CPI is larger than 1
        How to reduce CPI is the core of this lecture
    Reducing number of instructions
        Better compiler, smart instruction set architecture (ISA)
    Reducing cycle time
        Faster clock

Life of an instruction
    Fetch from memory
    Decode/read (figure out the opcode and the source and destination registers, read the source registers)
    Execute (ALUs, address calculation for memory operations)
    Memory access (for load/store)
    Writeback or commit (write the result to the destination register)
    During execution the instruction may talk to
        Register file (for reading source operands and writing results)
        Cache hierarchy (for instruction fetch and for memory operations)
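
The stages in the life of an instruction can be illustrated with a toy interpreter. This is only a minimal sketch under stated assumptions: the three-operation instruction set (`add`, `load`, `store`), the `Instr` record, and the `run` function are hypothetical names invented for illustration, the "cache hierarchy" is reduced to a plain dictionary, and each instruction passes through all five stages before the next one starts (no pipelining).

```python
from dataclasses import dataclass


@dataclass
class Instr:
    """Hypothetical instruction format, invented for this sketch."""
    op: str        # "add", "load", or "store"
    dest: int = 0  # destination register number
    src1: int = 0  # first source register number
    src2: int = 0  # second source register number
    imm: int = 0   # immediate offset for load/store addresses


def run(program, regs, data_mem):
    """Execute the program one instruction at a time, stage by stage."""
    pc = 0
    while pc < len(program):
        # Fetch: read the instruction from (instruction) memory.
        instr = program[pc]

        # Decode/read: identify the opcode and read the source registers.
        # Fields an instruction does not use default to register 0.
        a = regs[instr.src1]
        b = regs[instr.src2]

        # Execute: ALU operation, or address calculation for a memory op.
        if instr.op == "add":
            alu_out = a + b
        elif instr.op in ("load", "store"):
            alu_out = a + instr.imm
        else:
            raise ValueError(f"unknown opcode: {instr.op}")

        # Memory access: only loads and stores touch data memory.
        result = alu_out
        if instr.op == "load":
            result = data_mem.get(alu_out, 0)
        elif instr.op == "store":
            data_mem[alu_out] = b

        # Writeback: write the result to the destination register.
        if instr.op != "store":
            regs[instr.dest] = result

        pc += 1
    return regs, data_mem


# Hypothetical program: r2 = mem[r1], r3 = mem[r1 + 4],
# r2 = r2 + r3, mem[r1 + 8] = r2.
program = [
    Instr("load", dest=2, src1=1, imm=0),
    Instr("load", dest=3, src1=1, imm=4),
    Instr("add", dest=2, src1=2, src2=3),
    Instr("store", src1=1, src2=2, imm=8),
]
regs = [0, 100, 0, 0]           # r1 holds the base address 100
data_mem = {100: 7, 104: 5}     # illustrative memory contents
regs, data_mem = run(program, regs, data_mem)
```

Note that the register file is touched in the decode/read and writeback stages, while memory is touched in fetch and in the memory access stage, matching the two structures listed above.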
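
Returning to the performance equation, a small calculation shows why the average CPI normally ends up larger than 1. The instruction mix, per-class cycle counts, instruction count, and 1 ns cycle time below are all made-up numbers chosen for illustration, and the names `average_cpi`, `execution_time`, and `HYPOTHETICAL_MIX` are assumptions of this sketch, not part of the lecture.

```python
def average_cpi(mix):
    """Weighted average CPI for a mix of {class: (fraction, cycles)}."""
    total_fraction = sum(fraction for fraction, _ in mix.values())
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("instruction fractions must sum to 1")
    return sum(fraction * cycles for fraction, cycles in mix.values())


def execution_time(avg_cpi, num_instructions, cycle_time_s):
    """Execution time = average CPI x number of instructions x cycle time."""
    return avg_cpi * num_instructions * cycle_time_s


# Hypothetical instruction mix: (fraction of instructions, cycles each).
HYPOTHETICAL_MIX = {
    "simple_alu": (0.5, 1),       # single-cycle arithmetic
    "multiply_divide": (0.1, 4),  # complex arithmetic takes several cycles
    "load_store": (0.3, 3),       # memory operations take even longer
    "branch": (0.1, 1),
}

cpi = average_cpi(HYPOTHETICAL_MIX)
time_s = execution_time(cpi, num_instructions=1_000_000, cycle_time_s=1e-9)

print(f"average CPI = {cpi:.2f}")
print(f"execution time = {time_s * 1e3:.2f} ms")
```

With these assumed numbers the average CPI is 0.5·1 + 0.1·4 + 0.3·3 + 0.1·1 = 1.9, so one million instructions at a 1 ns cycle time take about 1.9 ms. Lowering any one of the three terms (CPI, instruction count, or cycle time) lowers the product.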