Module 3: "Recap: Single-threaded Execution"
  Lecture 6: "Instruction Issue Algorithms"
 

Alternative: VLIW

  • Very Long Instruction Word computers
    • Compiler carries out all dependence analysis
    • Bundles as many independent instructions as allowed by the number of functional units into an instruction packet
    • Hardware is much simpler, since it does no dynamic dependence checking
    • The instructions in the packet issue in parallel
    • Each packet of instructions is pretty long (hence the name)
    • Problem: the compiler may not be able to extract as much ILP as a dynamic out-of-order core; many slots in a packet may go unused and get filled with NOPs
  • Big leap from VLIW: EPIC (Explicitly Parallel Instruction Computing) [Itanium family]
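
The compile-time bundling described above can be sketched as a greedy packer. This is a minimal illustration, not how any particular VLIW compiler works: the `Instr` type, the `depends` and `pack` helpers, the register names, and the simplifying assumptions (unit latency, any functional unit can run any instruction, and every RAW/WAW/WAR hazard forces a later packet) are all hypothetical.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    name: str     # label used only for printing
    dest: str     # register written
    srcs: tuple   # registers read


def depends(later: Instr, earlier: Instr) -> bool:
    """True if `later` must go in a packet after the one holding `earlier`."""
    raw = earlier.dest in later.srcs    # read after write
    waw = earlier.dest == later.dest    # write after write
    war = later.dest in earlier.srcs    # write after read (treated conservatively)
    return raw or waw or war


def pack(instrs, width):
    """Greedy, program-order bundling into packets of at most `width` slots."""
    packets = []
    for ins in instrs:
        # Earliest legal packet: one past the last packet holding a dependence.
        earliest = 0
        for i, pkt in enumerate(packets):
            if any(depends(ins, prev) for prev in pkt):
                earliest = i + 1
        # First packet at or after `earliest` that still has a free slot.
        slot = earliest
        while slot < len(packets) and len(packets[slot]) >= width:
            slot += 1
        if slot == len(packets):
            packets.append([])
        packets[slot].append(ins)
    return packets


# Hypothetical five-instruction program.
prog = [
    Instr("i1", "r1", ("r2", "r3")),
    Instr("i2", "r4", ("r1", "r5")),     # needs i1
    Instr("i3", "r6", ("r7", "r8")),     # independent
    Instr("i4", "r9", ("r4", "r6")),     # needs i2 and i3
    Instr("i5", "r10", ("r11", "r12")),  # independent
]

for width in (2, 4):
    packets = pack(prog, width)
    empty = width * len(packets) - len(prog)
    print(width, [[i.name for i in p] for p in packets], "empty slots:", empty)
```

Under these assumptions the dependence chain i1, i2, i4 fixes the schedule at three packets whatever the width, so widening the packet from 2 to 4 slots only adds empty (NOP) slots. That is the utilization problem noted in the last sub-bullet.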

Current research in μP

  • Micro-architectural techniques to extract more ILP
    • Directly improves IPC (equivalently, reduces CPI)
    • Various speculative techniques to hide cache miss latency: prefetching, load value prediction, etc.
  • Better branch prediction
    • Helps deep pipelines
  • Faster clocking
    • Higher clock frequency raises power dissipation, so the chip must be cooled
    • Various techniques to reduce power consumption: clock gating, dynamic voltage/frequency scaling (DVFS), power-aware resource usage
    • Fighting the long wires: scaling micro-architectures against the complexity wall
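
The items above pull on the same two first-order relations, written out here as a rough guide. The symbols are assumptions introduced for illustration and do not appear in the slides; the power formula is the usual first-order CMOS approximation and ignores leakage, and the scaling step assumes frequency scales roughly linearly with supply voltage and that CPI does not change with frequency.

```latex
% T: execution time, N: dynamic instruction count, f: clock frequency
\[
  T = \frac{N \cdot \mathrm{CPI}}{f}, \qquad \mathrm{IPC} = \frac{1}{\mathrm{CPI}}
\]
% alpha: switching activity factor, C: switched capacitance, V: supply voltage
\[
  P_{\mathrm{dyn}} \approx \alpha \, C \, V^{2} f
\]
% DVFS scaling V and f together by a factor s:
\[
  P_{\mathrm{dyn}}(s) \approx \alpha \, C \, (sV)^{2} (sf) = s^{3} P_{\mathrm{dyn}},
  \qquad T(s) = \frac{N \cdot \mathrm{CPI}}{s f} = \frac{T}{s}
\]
```

Read this way, ILP techniques and better branch prediction lower CPI, faster clocking raises f at the cost of power, clock gating lowers the activity factor, and DVFS trades a roughly linear slowdown for a roughly cubic drop in dynamic power.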