Module 5: "MIPS R10000: A Case Study"
  Lecture 9: "MIPS R10000: A Case Study"
 

Overview

  • Mid 90s: One of the first dynamic out-of-order superscalar RISC microprocessors
  • 6.8 M transistors on a 298 mm² die (0.35 μm CMOS)
  • Of the 6.8 M transistors, 4.4 M are devoted to the L1 instruction and data caches
  • Fetches, decodes, renames 4 instructions every cycle
  • 64-bit registers: the data path width is 64 bits
  • On-chip L1 instruction and data caches, 32 KB each, 2-way set associative
  • Off-chip L2 cache of variable size (512 KB to 16 MB), 2-way set associative, line size 128 bytes
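
The L2 parameters above fix the cache geometry. A minimal sketch of the arithmetic follows; the helper name cache_sets is hypothetical, and the L1 caches are left out because the overview does not state their line sizes.

```python
def cache_sets(size_bytes: int, ways: int, line_bytes: int) -> int:
    """Number of sets = capacity / (associativity * line size)."""
    return size_bytes // (ways * line_bytes)

KB = 1024
MB = 1024 * KB

# L2 figures from the overview: 2-way set associative, 128-byte lines.
smallest_l2 = cache_sets(512 * KB, ways=2, line_bytes=128)
largest_l2 = cache_sets(16 * MB, ways=2, line_bytes=128)

print(smallest_l2, largest_l2)  # prints: 2048 65536
```

With 128-byte lines the block offset takes 7 address bits, and the set index grows from 11 bits (512 KB) to 16 bits (16 MB).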

Stage 1: Fetch

  • The instructions are slightly pre-decoded when the cache line is brought into Icache
    • Simplifies the decode stage
  • Processor fetches four sequential instructions every cycle from the Icache
  • The iTLB has eight entries, fully associative
  • No branch target buffer (BTB)
  • So the fetcher cannot do anything about branches on its own; it simply keeps fetching sequentially
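
A rough sketch of what sequential-only fetch means, assuming fixed 4-byte MIPS instructions. The function names and the redirect parameter are hypothetical; the redirect stands in for a new PC supplied by a later pipeline stage, and cache-line alignment effects are ignored.

```python
from typing import List, Optional

INSTR_BYTES = 4   # MIPS instructions are fixed-size 32-bit words
FETCH_WIDTH = 4   # four instructions per cycle, as stated above

def fetch_group(pc: int) -> List[int]:
    """Addresses of the four sequential instructions fetched starting at pc."""
    return [pc + i * INSTR_BYTES for i in range(FETCH_WIDTH)]

def next_fetch_pc(pc: int, redirect: Optional[int] = None) -> int:
    """Without a BTB the fetcher just falls through to the next group.

    Only a later stage can change the fetch address, modelled here by an
    optional redirect value (an assumption of this sketch).
    """
    if redirect is not None:
        return redirect
    return pc + FETCH_WIDTH * INSTR_BYTES
```

For example, fetch_group(0x1000) covers 0x1000, 0x1004, 0x1008 and 0x100C, and the next group starts at 0x1010 unless a redirect arrives.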

Stage 2: Decode/Rename

  • Decodes and renames four instructions every cycle
  • The targets of branches, unconditional jumps, and subroutine calls (jump and link, or jal) are computed in this stage
  • Unconditional jumps are not fed into the pipeline; the decoder modifies the fetch PC directly
  • Conditional branches look up a simple predictor to predict the branch direction (taken or not taken) and accordingly modify the fetch PC
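
The decode-stage behaviour above can be sketched as follows. The target formulas are the standard 32-bit MIPS encodings. Everything else is an assumption of this sketch: the slides say only "a simple predictor", so a table of 2-bit saturating counters is used for illustration, and the table size, the class and function names, and the omission of branch delay slots are all hypothetical simplifications.

```python
from typing import Optional

def sign_extend16(imm: int) -> int:
    """Interpret the low 16 bits of imm as a signed value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def branch_target(pc: int, imm16: int) -> int:
    """Conditional branch: word offset relative to pc + 4."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF

def jump_target(pc: int, index26: int) -> int:
    """j/jal: upper 4 bits of pc + 4 joined with the 26-bit word index."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x3FFFFFF) << 2)

class TwoBitPredictor:
    """Assumed predictor: one 2-bit saturating counter per table entry."""

    def __init__(self, entries: int = 512) -> None:  # size is an assumption
        self.entries = entries
        self.counters = [1] * entries  # start at "weakly not taken"

    def _index(self, pc: int) -> int:
        return (pc >> 2) % self.entries

    def predict(self, pc: int) -> bool:
        """True means predict taken (counter value 2 or 3)."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        """Move the counter toward the actual outcome, saturating at 0 and 3."""
        i = self._index(pc)
        c = self.counters[i]
        self.counters[i] = min(3, c + 1) if taken else max(0, c - 1)

def decode_redirect(pc: int, kind: str, field: int,
                    predictor: TwoBitPredictor) -> Optional[int]:
    """New fetch PC chosen at decode, or None to keep fetching sequentially."""
    if kind == "jump":
        return jump_target(pc, field)       # decoder redirects fetch directly
    if kind == "branch" and predictor.predict(pc):
        return branch_target(pc, field)     # predicted taken
    return None                             # not a branch, or predicted not taken
```

In this sketch a branch at 0x1000 with offset field 4 redirects fetch to 0x1014 once its counter has been trained to "taken"; before that, decode_redirect returns None and fetch continues sequentially.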