Module 4: Parallel Programming: Shared Memory and Message Passing
  Lecture 8: Optimizing Shared Memory Performance
 


Comm-to-comp Ratio

  • A given problem usually admits many different domain decompositions
    • For the grid solver we may use a square block, block row, or cyclic row decomposition
    • How do we determine which one is good? Compare their communication-to-computation ratios

Assume P processors and an N×N grid for the grid solver

  • For block row decomposition
    • Each strip has N/P rows
    • Communication (boundary rows): 2N
    • Computation (area): N^2/P (same as square block)
    • Comm-to-comp ratio: 2P/N
  • For cyclic row decomposition
    • Each processor gets N/P isolated rows
    • Communication: 2N^2/P
    • Computation: N^2/P
    • Comm-to-comp ratio: 2
  • Normally N is much larger than P
    • Asymptotically, square block yields the lowest comm-to-comp ratio: its ratio grows as sqrt(P)/N, versus P/N for block row and a constant 2 for cyclic row
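
The ratios above can be checked numerically with a minimal Python sketch. The function name `comm_to_comp` and the sample values N = 1024, P = 16 are illustrative assumptions, not part of the lecture. The square block communication term 4N/sqrt(P) (the perimeter of an N/sqrt(P) × N/sqrt(P) block) is also an assumption here: it is the standard count, but it is not derived on this slide.

```python
import math

def comm_to_comp(n, p):
    """Per-processor comm-to-comp ratios for an n x n grid on p processors."""
    comp = n * n / p  # grid points updated per processor (same in all three cases)
    return {
        # Assumed perimeter count for a square block: 4 sides of n/sqrt(p) points
        "square block": (4 * n / math.sqrt(p)) / comp,  # simplifies to 4*sqrt(p)/n
        # Two boundary rows of n points per strip
        "block row": (2 * n) / comp,                    # simplifies to 2*p/n
        # Each of the n/p isolated rows needs both neighbouring rows
        "cyclic row": (2 * n * n / p) / comp,           # simplifies to 2
    }

ratios = comm_to_comp(1024, 16)
for name, ratio in ratios.items():
    print(f"{name:12s} {ratio:.6f}")
```

With these sample values the square block ratio is half the block row ratio, and the gap widens as P grows because sqrt(P) grows more slowly than P.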