Module 8: Memory Consistency Models and Case Studies of Multi-core
  Lecture 16: Case Studies of Multi-core
 


POWER4 L3 Cache

  • On-chip tag (IBM calls it directory), off-chip data
    • 32 MB, 8-way set associative, 512-byte lines
    • Contains eight coherence/snoop controllers
    • Does not maintain inclusion with L2, so the L3 must also snoop the fabric interconnect
    • Maintains five coherence states
    • Because the L3 cache sits on the other side of the fabric, every L2 cache miss (even a local miss) must cross the fabric, which adds significantly to the miss latency
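
The geometry quoted above (32 MB, 8-way, 512-byte lines) fixes how many entries the on-chip directory has to track. The following is a minimal sketch of that arithmetic in Python. The function names `l3_geometry` and `split_address` are hypothetical, and the plain power-of-two set indexing is an assumption made for illustration; the real POWER4 address-to-set mapping may differ.

```python
# Parameters taken from the lecture notes: 32 MB, 8-way, 512-byte lines.
CAPACITY_BYTES = 32 * 1024 * 1024
WAYS = 8
LINE_BYTES = 512


def l3_geometry(capacity=CAPACITY_BYTES, ways=WAYS, line=LINE_BYTES):
    """Derive line count, set count, and address field widths."""
    lines = capacity // line                # total lines tracked by the directory
    sets = lines // ways                    # each set holds `ways` lines
    offset_bits = line.bit_length() - 1     # log2(line size), line is a power of two
    index_bits = sets.bit_length() - 1      # log2(number of sets)
    return {
        "lines": lines,
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
    }


def split_address(addr, geom):
    """Split a physical address into (tag, set index, byte offset).

    Assumes simple modulo indexing, which is an illustrative choice only.
    """
    offset = addr & ((1 << geom["offset_bits"]) - 1)
    index = (addr >> geom["offset_bits"]) & ((1 << geom["index_bits"]) - 1)
    tag = addr >> (geom["offset_bits"] + geom["index_bits"])
    return tag, index, offset


geom = l3_geometry()
print(geom)
print(split_address(0x12345678, geom))
```

Under these assumptions the directory tracks 65,536 lines arranged as 8,192 sets, so an address carries 9 offset bits and 13 index bits, with the remaining upper bits forming the tag.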