Module 18: "TLP on Chip: HT/SMT and CMP"
  Lecture 40: "Case Studies: IBM Power4 and IBM Power5"
 

POWER4 L3 cache

  • On-chip tags (IBM calls this the directory), off-chip data
    • 32 MB, 8-way set associative, 512-byte lines
    • Contains eight coherence/snoop controllers
    • Does not maintain inclusion with L2, so the L3 must also snoop the fabric interconnect
    • Maintains five coherence states
    • Putting the L3 cache on the other side of the fabric forces every L2 cache miss (even a local one) to cross the fabric, which increases miss latency considerably
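
To get a feel for the size of the on-chip directory, the geometry implied by the figures above (32 MB, 8-way, 512-byte lines) can be worked out as a rough sketch. The function name `cache_geometry` is hypothetical, and the calculation assumes that 1 MB means 2^20 bytes and that all sizes are powers of two; it says nothing about how IBM actually organizes the directory entries.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Derive basic set-associative cache parameters (illustrative helper).

    Assumes size_bytes, ways and line_bytes are all powers of two.
    """
    lines = size_bytes // line_bytes           # total number of cache lines
    sets = lines // ways                       # lines are grouped into sets of `ways`
    offset_bits = line_bytes.bit_length() - 1  # log2(line size): byte offset within a line
    index_bits = sets.bit_length() - 1         # log2(sets): bits used to select a set
    return {
        "lines": lines,
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
    }


# POWER4 L3 figures from the list above; 1 MB = 2**20 bytes is an assumption.
geom = cache_geometry(32 * 1024 * 1024, ways=8, line_bytes=512)
print(geom)
```

Under these assumptions the directory has to track 65,536 lines arranged as 8,192 sets, with 9 offset bits and 13 index bits per address. The large 512-byte line is what keeps the number of tags small enough to hold on chip while the data arrays stay off chip.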

POWER4 die photo