
Power Consumption?

	Hey, didn't I just make my power consumption roughly N-fold by putting N cores on the die?
		Yes, if you do not scale down voltage or frequency
		Usually CMPs are clocked at a lower frequency
			Oops! My games run slower!
		Voltage scaling happens due to smaller process technology
		Overall, roughly cubic dependence of power on voltage or frequency
	Need to talk about different metrics
		Performance/Watt (same as reciprocal of energy)
		More general, Performance^(k+1)/Watt (k > 0)
	Need smarter techniques to further improve these metrics
		Online voltage/frequency scaling

ABCs of CMP

	Where to put the interconnect?
	Do not want to access the interconnect too frequently because these wires are slow
	It probably does not make much sense to have the L1 cache shared among the cores: requires very high bandwidth and may necessitate a redesign of the L1 cache and surrounding load/store unit, which we do not want to do; so settle for private L1 caches, one per core
	Makes more sense to share the L2 or L3 caches
	Need a coherence protocol at the L2 interface to keep private L1 caches coherent: may use a high-speed custom-designed snoopy bus connecting the L1 controllers or may use a simple directory protocol
	An entirely different design choice is not to share the cache hierarchy at all (dual-core AMD and Intel): rids you of the on-chip coherence protocol, but no gain in communication latency
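
Returning to the power discussion above, the N-fold and roughly cubic claims can be checked with a small back-of-the-envelope model. This is only a sketch under stated assumptions, not something taken from the slides: it assumes dynamic power per core scales as V^2 * f, that voltage scales linearly with frequency when no separate voltage factor is given, and that the workload parallelizes perfectly across cores. The function names (relative_power, relative_throughput, metric) are hypothetical and chosen for illustration.

```python
def relative_power(n_cores, freq_scale, volt_scale=None):
    """Dynamic power relative to one core at nominal voltage and frequency.

    Assumption: per-core dynamic power is proportional to V^2 * f.
    If volt_scale is omitted, voltage is assumed to scale linearly with
    frequency, which gives the roughly cubic dependence.
    """
    if volt_scale is None:
        volt_scale = freq_scale
    return n_cores * volt_scale ** 2 * freq_scale


def relative_throughput(n_cores, freq_scale):
    """Throughput relative to one nominal core.

    Assumption: the workload is perfectly parallel, so throughput is
    simply cores times clock scale. Real workloads will do worse.
    """
    return n_cores * freq_scale


def metric(perf, power, k=0):
    """Performance^(k+1) / Watt; k = 0 gives plain Performance/Watt."""
    return perf ** (k + 1) / power


if __name__ == "__main__":
    # Four cores at nominal voltage and frequency: power grows N-fold.
    print(relative_power(4, 1.0))
    # Four cores at half frequency and half voltage.
    power = relative_power(4, 0.5)
    perf = relative_throughput(4, 0.5)
    print(power, perf, metric(perf, power), metric(perf, power, k=1))
```

Under these assumptions, four cores at half the clock and half the voltage draw less power than one nominal core while delivering twice its ideal throughput. A single-threaded game still runs at half speed, which is the "Oops!" on the slide. A larger k in Performance^(k+1)/Watt weights performance more heavily relative to power.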