Module 2: Virtual Memory and Caches
  Lecture 4: Cache Hierarchy and Memory-level Parallelism
 


MLP

  • Need memory-level parallelism (MLP)
    • Put simply, several memory operations must be in flight at once so that their latencies overlap
  • Step 1: Non-blocking cache
    • Allow multiple outstanding cache misses
    • Overlap the latencies of multiple cache misses
    • Supported by all microprocessors today (Alpha 21364 supported 16 outstanding cache misses)
  • Step 2: Out-of-order load issue
    • Issue loads out of program order (the address is not yet known at the time of issue)
    • How do you know the load didn't issue before an older store to the same address? Each store must check for this memory-order violation when it issues
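
A rough back-of-the-envelope sketch of why Step 1 matters. The function name `miss_service_cycles`, the 200-cycle miss latency, and the timing model are illustrative assumptions, not figures from the lecture: all misses are taken to be independent, ready at cycle 0, and equally long. Only the count of 16 outstanding misses is borrowed from the Alpha 21364 example above.

```python
import math

def miss_service_cycles(num_misses, miss_latency, max_outstanding):
    """Cycles to service independent misses when at most max_outstanding overlap.

    Simplifying assumptions (not from the lecture): every miss is independent,
    is ready at cycle 0, and takes exactly miss_latency cycles.
    """
    if max_outstanding < 1:
        raise ValueError("need at least one outstanding-miss slot")
    # With equal latencies, misses complete in back-to-back batches
    # of max_outstanding misses each.
    batches = math.ceil(num_misses / max_outstanding)
    return batches * miss_latency

# Blocking cache: one miss at a time, so latencies add up.
blocking = miss_service_cycles(16, 200, 1)
# Non-blocking cache with 16 outstanding misses: latencies overlap.
non_blocking = miss_service_cycles(16, 200, 16)
print(blocking, non_blocking)  # prints: 3200 200
```

Under these assumptions the exposed miss time shrinks by the number of misses that can overlap; real programs fall short of this because many misses depend on one another.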

Out-of-order Loads

sw 0(r7), r6
… /* other instructions */
lw r2, 80(r20)

  • Assume that the load issues before the store because r20 gets ready before r6 or r7
  • The load first checks the store buffer (which holds the values of already executed stores until they are committed to the cache at retirement)
  • If it misses in the store buffer, it looks up the caches and, say, gets the value from somewhere in the memory hierarchy
  • Several cycles later the store issues, and it turns out that 0(r7)==80(r20) or the two accesses overlap; now what?
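
The scenario above can be mimicked with a small software model. This is only a sketch: the class `LoadStoreUnit`, the `load_queue` of executed loads, the sequence numbers, and the 4-byte access size are hypothetical names and assumptions chosen for illustration, and forwarding is modeled only for exact address matches. It shows a load reading a stale value and the later-issuing store detecting the violation; what the pipeline then does about it is deliberately left out.

```python
from dataclasses import dataclass

WORD = 4  # assumed access size in bytes for lw/sw

def overlaps(addr_a, addr_b, size=WORD):
    """True if two size-byte accesses share at least one byte."""
    return addr_a < addr_b + size and addr_b < addr_a + size

@dataclass
class ExecutedLoad:
    seq: int   # position in program order (smaller = older)
    addr: int

class LoadStoreUnit:
    def __init__(self):
        self.store_buffer = []  # (seq, addr, value) of executed, unretired stores
        self.load_queue = []    # loads that have already executed

    def issue_load(self, seq, addr, memory):
        # Remember the load so that older stores issuing later can check it.
        self.load_queue.append(ExecutedLoad(seq, addr))
        # Look in the store buffer first: youngest older store to this address.
        older = [s for s in self.store_buffer if s[0] < seq and s[1] == addr]
        if older:
            return max(older, key=lambda s: s[0])[2]
        # Store-buffer miss: fall back to the caches (modeled as a dict).
        return memory.get(addr, 0)

    def issue_store(self, seq, addr, value):
        self.store_buffer.append((seq, addr, value))
        # Memory-order violation check: younger loads that already executed
        # against an overlapping address have read a stale value.
        return [ld.seq for ld in self.load_queue
                if ld.seq > seq and overlaps(ld.addr, addr)]

memory = {0x1000: 11}   # old contents of the location
lsu = LoadStoreUnit()
# The load (younger, seq 5) issues first because its address is ready earlier.
stale = lsu.issue_load(seq=5, addr=0x1000, memory=memory)
# The store (older, seq 0) issues later to the same address.
violations = lsu.issue_store(seq=0, addr=0x1000, value=42)
print(stale, violations)  # prints: 11 [5]
```

The load returned 11 although program order requires it to see 42, and the store's check reports load 5 as the offender.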