Load/store Ordering
- Out-of-order load issue relies on speculative memory disambiguation
- The processor assumes that there will be no conflicting older store
- If the speculation is correct, the load has issued much earlier, and its dependents have been allowed to execute much earlier as well
- If there is a conflicting store, the processor must squash the load and all the dependents that consumed the load value, and re-execute them
- It turns out that the speculation is correct most of the time
- To further reduce load squashes, microprocessors use simple memory dependence predictors, which predict whether a load will conflict with a pending store based on the past behavior of that load (or of the load/store pair)
MLP and Memory Wall
- Today microprocessors try to hide cache misses by initiating early prefetches:
- Hardware prefetchers try to predict the next several load addresses and initiate cache line prefetches if those lines are not already in the cache
- All processors today also support prefetch instructions, so you can specify in your program when to prefetch what: this gives much finer control than a hardware prefetcher
- Researchers are working on load value prediction
- Even with all these techniques, memory latency remains the biggest bottleneck
- Today microprocessors are trying to overcome one single wall: the memory wall