Cache Hierarchy
- Ideally want to hold everything in a fast cache
- Never want to go to the memory
- But access time increases with cache size
- A single large cache would slow down every access
- So, put increasingly bigger and slower caches between the processor and the memory
- Keep the most recently used data in the nearest cache: register file (RF)
- Next level of cache: level 1 or L1 (same speed or slightly slower than RF, but much bigger)
- Then L2: much bigger than L1 and much slower
- Example: Intel Pentium 4 (NetBurst)
  - 128 registers, accessible in 2 cycles
  - L1 data cache: 8 KB, 4-way set associative, 64-byte line size, accessible in 2 cycles for integer loads
  - L2 cache: 256 KB, 8-way set associative, 128-byte line size, accessible in 7 cycles
- Example: Intel Itanium 2 (code name Madison)
  - 128 registers, accessible in 1 cycle
  - L1 instruction and data caches: each 16 KB, 4-way set associative, 64-byte line size, accessible in 1 cycle
  - Unified L2 cache: 256 KB, 8-way set associative, 128-byte line size, accessible in 5 cycles
  - Unified L3 cache: 6 MB, 24-way set associative, 128-byte line size, accessible in 14 cycles
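
The numbers above can be explored with a short Python sketch. It does two things: it derives the number of sets and the offset/index bit widths from each cache's size, associativity, and line size, and it estimates the average access latency of the Itanium 2 hierarchy. The function names are made up for illustration, and the hit rates and the 200-cycle memory latency are purely hypothetical values that do not come from the examples. The sketch also assumes each quoted cycle count is the total latency seen by the processor when the access hits at that level.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Return (sets, offset_bits, index_bits) for a set-associative cache.

    Assumes line_bytes and the resulting set count are powers of two.
    """
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1  # log2 of the line size
    index_bits = sets.bit_length() - 1         # log2 of the number of sets
    return sets, offset_bits, index_bits


def average_access_cycles(levels, memory_cycles):
    """Expected latency of one access through a cache hierarchy.

    levels: list of (latency_cycles, hit_rate), nearest level first.
    Each latency is treated as the total time when the access hits there.
    """
    expected = 0.0
    reach = 1.0  # probability that an access misses in all nearer levels
    for latency, hit_rate in levels:
        expected += reach * hit_rate * latency
        reach *= 1.0 - hit_rate
    return expected + reach * memory_cycles


KB, MB = 1024, 1024 * 1024

# Cache parameters taken from the two examples above.
print(cache_geometry(8 * KB, 4, 64))     # Pentium 4 L1 data: 32 sets
print(cache_geometry(256 * KB, 8, 128))  # Pentium 4 L2: 256 sets
print(cache_geometry(6 * MB, 24, 128))   # Itanium 2 L3: 2048 sets

# Itanium 2 latencies from above; hit rates and memory latency are assumed.
itanium2 = [(1, 0.95), (5, 0.90), (14, 0.80)]
print(average_access_cycles(itanium2, memory_cycles=200))
```

Under these assumed hit rates the average stays close to the L1 latency even though memory is far slower, which is the point of placing small fast caches nearest the processor.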