Module 3: "Recap: Single-threaded Execution"
  Lecture 5: "Pipelining and Hazards"
 

 

WAW hazard

  • Write After Write (WAW)
  • The problem: out-of-order completion
  • The final value in r5 will nullify the effect of the add instruction
  • The bigger issue: precise exception is violated
  • Next load instruction raises an exception (may be due to page fault)
  • You handle the exception and start from the load
  • But value in r5 does not reflect precise state
  • Solution: disallow out-of-order completion

Overall CPI

  • CPI = 1.0 + pipeline overhead
  • Pipeline overhead comes from
    • Branch penalty (useless delay slots, mispredictions)
    • True data dependencies
    • Multi-cycle instructions (load/store, mult/div)
    • Other data hazards
  • So to boost CPI further
    • Need to have better branch prediction
    • Need to hide latency of memory ops, mult/div

Multiple issue

  • Thus far we have assumed that at most one instruction gets advanced to EX stage every cycle
  • If we have four ALUs we can issue four independent instructions every cycle
  • This is called superscalar execution
  • Ideally CPI should go down by a factor equal to issue width (more parallelism)
  • Extra hardware needed:
    • Wider fetch to keep the ALUs fed
    • More decode bandwidth, more register file ports; decoded instructions are put in an issue queue
    • Selection of independent instructions for issue
    • In-order completion