1. Consider the following memory organization of a processor. The virtual
address is 40 bits, the physical address is 32 bits, and the page size is
8 KB. The processor has a 4-way set associative 128-entry TLB, i.e., the TLB
has 32 sets of 4 entries each. Each page table entry is 32 bits in size. The
processor also has a 2-way set associative 32 KB L1 cache with a line size of
64 bytes. (A sketch of the bit-field arithmetic involved is given after
part (C).)
(A) What is the total size of the page table?
(B) Clearly show (with the help of a diagram) the addressing scheme if the
cache is virtually indexed and physically tagged. Your diagram should show
the widths of the TLB and cache tags.
(C) If the cache were physically indexed and physically tagged, what part of
the addressing scheme would change?
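
The following is not part of the question; it is a minimal sketch, in C, of
how the field widths involved in parts (A) and (B) follow from the sizes
given above, assuming a flat single-level page table. The helper log2i and
all variable names are illustrative.

#include <stdio.h>

/* Illustrative sketch only: derive the field widths from the given sizes. */
static int log2i(unsigned x) { int b = 0; while (x >>= 1) b++; return b; }

int main(void)
{
    int page_offset = log2i(8 * 1024);             /* 8 KB pages -> 13 offset bits        */
    int tlb_index   = log2i(128 / 4);              /* 128 entries, 4-way -> 32 sets -> 5  */
    int line_offset = log2i(64);                   /* 64 B lines -> 6 bits                */
    int cache_index = log2i(32 * 1024 / (2 * 64)); /* 32 KB, 2-way, 64 B -> 256 sets -> 8 */
    long long pt_entries = 1LL << (40 - page_offset); /* one PTE per virtual page (flat)  */

    printf("flat page table size : %lld MB (4-byte PTEs)\n", pt_entries * 4 >> 20);
    printf("TLB tag width        : %d bits\n", 40 - page_offset - tlb_index);
    printf("cache tag width      : %d bits (physical tag)\n", 32 - cache_index - line_offset);
    printf("cache index+offset   : %d bits, page offset only %d bits\n",
           cache_index + line_offset, page_offset);
    return 0;
}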
2. A set associative cache has a longer hit time than an equally sized
direct-mapped cache. Why?
3. The Alpha 21264 has a virtually indexed, virtually tagged instruction
cache. Do you see any security/protection issues with this? If yes, explain
and offer a solution. How would you maintain the correctness of such a cache
in a multi-programmed environment?
4. Consider the following segment of C code, which adds up the elements in
each column of an NxN matrix A and stores the sums in a vector x of size N.
for (j = 0; j < N; j++) {
    for (i = 0; i < N; i++) {
        x[j] += A[i][j];
    }
}
Assume that the C compiler lays out matrix A in row-major order, i.e.,
A[i][j] and A[i][j+1] are adjacent to each other in memory for all i and j in
the legal range, and A[i][N-1] and A[i+1][0] are adjacent to each other for
all i in the legal range. Assume further that each element of A and x is a
double-precision floating-point value, i.e., 8 bytes in size. This code is
executed on a modern speculative out-of-order processor with the following
memory hierarchy: 4 KB page size, a fully associative 128-entry data TLB, a
32 KB 2-way set associative single-level data cache with a 32-byte line size,
and 256 MB of DRAM. You may assume that the cache is virtually indexed and
physically tagged, although this information is not needed to answer this
question. For N=8192, compute the following (please show all the intermediate
steps). Assume that every instruction hits in the instruction cache. Assume
an LRU replacement policy for physical page frames, TLB entries, and cache
sets. (A sketch of the basic sizes and strides involved is given after
part (D).)
(A) Number of page faults.
(B) Number of data TLB misses.
(C) Number of data cache misses. Assume that x and A do not conflict with each
other in the cache.
(D) At most how many memory operations can the processor overlap before coming
to a halt? Assume that the instruction selection logic (associated with the
issue unit) gives priority to older instructions over younger instructions if
both are ready to issue in a cycle.
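
The following is not part of the question; it is a minimal sketch of the raw
sizes and strides that the analysis of parts (A) through (C) typically starts
from, for N = 8192 and the hierarchy given above. The variable names are
illustrative, and the actual fault/miss counts are left to be worked out.

#include <stdio.h>

/* Illustrative sketch only: sizes and strides for the column-sum loop. */
int main(void)
{
    long long N         = 8192;
    long long elem      = 8;                 /* sizeof(double)                      */
    long long page      = 4 * 1024;
    long long line      = 32;
    long long dram      = 256LL * 1024 * 1024;
    long long row_bytes = N * elem;          /* distance from A[i][j] to A[i+1][j]  */
    long long A_bytes   = N * N * elem;      /* footprint of the whole matrix       */
    long long x_bytes   = N * elem;          /* footprint of the vector x           */

    printf("column stride      : %lld bytes = %lld pages\n", row_bytes, row_bytes / page);
    printf("size of A          : %lld MB (DRAM holds %lld MB)\n", A_bytes >> 20, dram >> 20);
    printf("size of x          : %lld KB = %lld pages\n", x_bytes >> 10, x_bytes / page);
    printf("elements per line  : %lld, elements per page: %lld\n", line / elem, page / elem);
    printf("distinct pages of A touched by one pass of the inner loop: %lld\n", N);
    return 0;
}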
5. Suppose you are running a program on two machines, both having a single
level of cache hierarchy (i.e., only an L1 cache). In one machine the cache
is virtually indexed and physically tagged, while in the other it is
physically indexed and physically tagged. Will there be any difference in the
cache miss rates when the program is run on these two machines? (A sketch of
the relevant index-bit check follows.)
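
The following is not part of the question; it is a minimal sketch of the
check that decides whether virtual and physical indexing can even select
different cache sets. The cache and page sizes below are assumptions borrowed
from question 4 purely for illustration; the question itself fixes no sizes.

#include <stdio.h>

/* Illustrative sketch only: do any cache index bits lie above the page offset? */
int main(void)
{
    int page_offset = 12;   /* assumed 4 KB pages                                  */
    int line_offset = 5;    /* assumed 32 B lines                                  */
    int index_bits  = 9;    /* assumed 32 KB, 2-way: 32 KB / (2 * 32 B) = 512 sets */
    int translated  = index_bits + line_offset - page_offset;

    if (translated > 0)
        printf("%d index bit(s) come from translated address bits;\n"
               "virtual and physical indexing can pick different sets.\n", translated);
    else
        printf("the index lies entirely within the page offset;\n"
               "virtual and physical indexing always agree.\n");
    return 0;
}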