Module 5: Performance Issues in Shared Memory and Introduction to Coherence
  Lecture 9: Performance Issues in Shared Memory
 


Worse: False Sharing

  • Arises when the algorithm (or its data layout) is designed so poorly that
    • Two processors write to two different words within the same cache line at about the same time
    • The cache line keeps moving back and forth between the two processors
    • The processors are not really accessing or updating the same element, but whatever they are updating happens to fall within one cache line: not true sharing, but false sharing
    • For shared memory programs, false sharing can easily cause a large performance loss, because each write may first have to fetch the line from the other processor's cache
    • Easy to avoid: pad up to the end of the cache line before starting the allocation of the data for the next processor (wastes memory, but improves performance)
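
The padding idea above can be sketched in C11 as follows. This is a minimal illustration, not code from the lecture: the 64-byte line size, the thread count, and the names counter_unpadded, counter_padded and counter_slot are assumptions chosen for the example, and the actual line size should be checked on the target machine.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64  /* assumption: 64-byte cache lines */
#define NUM_THREADS 4       /* hypothetical number of threads */

/* Unpadded layout: several of these fit in one cache line, so writes by
 * different threads to neighbouring elements can falsely share the line. */
struct counter_unpadded {
    long value;
};

/* Padded layout: each element starts on a cache-line boundary and is
 * padded up to the end of that line, so the next thread's data starts
 * on the next line. */
struct counter_padded {
    alignas(CACHE_LINE_SIZE) long value;
    char pad[CACHE_LINE_SIZE - sizeof(long)];
};

/* Compile-time check that one element occupies exactly one line. */
static_assert(sizeof(struct counter_padded) == CACHE_LINE_SIZE,
              "padded counter must fill exactly one cache line");

/* One private slot per thread (hypothetical usage). */
static struct counter_padded counters[NUM_THREADS];

/* Returns the address of the counter owned by thread `tid`. */
long *counter_slot(int tid) {
    return &counters[tid].value;
}
```

Each thread would then update only *counter_slot(tid). The cost is the one named in the text: in this sketch every 8-byte counter consumes a full 64-byte line.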

Contention

  • It is very easy to ignore contention effects when designing algorithms
    • Can severely degrade performance by creating hot-spots
  • Location hot-spot:
    • Consider accumulating into a global variable: the accumulation takes place on a single node, i.e. every node accesses the variable allocated on that particular node whenever it tries to increment it
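
The accumulation pattern described above can be sketched in C11 as shown below. The names global_sum, accumulate_hot, accumulate_local and read_and_reset_sum are hypothetical. The second function illustrates one common way to reduce traffic to the hot location, which is an assumption added for contrast and is not stated in this section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical shared accumulator: it lives in one place, so every
 * update from every thread is directed at the same location. */
static atomic_long global_sum;

/* Hot-spot version: one access to the shared variable per element. */
void accumulate_hot(const long *data, size_t n) {
    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        atomic_fetch_add(&global_sum, data[i]);
}

/* Contrast (assumed mitigation): accumulate into a private partial sum
 * and touch the shared variable only once per call. */
void accumulate_local(const long *data, size_t n) {
    assert(data != NULL || n == 0);
    long partial = 0;
    for (size_t i = 0; i < n; i++)
        partial += data[i];
    atomic_fetch_add(&global_sum, partial);
}

/* Returns the accumulated total and clears it for the next run. */
long read_and_reset_sum(void) {
    return atomic_exchange(&global_sum, 0);
}
```

If P threads each process n elements, the first version issues P × n updates to the single shared location, while the second issues only P. Both produce the same total.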