Module 3: Fundamentals of Parallel Computers: ILP vs TLP
  Lecture 6: Preliminaries of Parallel Programming
 


Decomposition

  • Look for concurrency in loop iterations
    • In this case the iterations are truly dependent (loop-carried dependence)
    • Iteration (i, j) depends on iterations (i, j-1) and (i-1, j)

 

    • Each anti-diagonal can be computed in parallel
    • Must synchronize after each anti-diagonal (or use point-to-point synchronization)
    • Alternative: red-black ordering (different update pattern)
  • Can update all red points first, synchronize globally with a barrier and then update all black points
    • May converge faster or slower than the sequential program
    • Converged equilibrium may also be different if there are multiple solutions
    • Ocean simulation uses this decomposition
  • We will ignore the loop-carried dependence and go ahead with a straightforward loop decomposition
    • Allow updates to all points in parallel
    • This is yet another different update order and may affect convergence
    • An update to a point may or may not see the new values of its nearest neighbors, so this parallel algorithm is non-deterministic
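
The three update orders discussed above can be compared with a small serial sketch. It is only an illustration under stated assumptions: the five-point averaging kernel in `update`, the boundary condition in `make_grid`, and all helper names are hypothetical and not taken from the lecture. No threads are used; the end of each phase stands in for a barrier, and replaying a phase in reverse order stands in for the arbitrary scheduling a parallel run could produce.

```python
def make_grid(n):
    """(n+2) x (n+2) grid: top boundary row held at 1.0, everything else 0.0 (assumed setup)."""
    A = [[0.0] * (n + 2) for _ in range(n + 2)]
    A[0] = [1.0] * (n + 2)
    return A


def update(A, i, j):
    # Assumed kernel: in-place average of the point and its four nearest neighbors.
    A[i][j] = 0.2 * (A[i][j] + A[i][j - 1] + A[i - 1][j] + A[i][j + 1] + A[i + 1][j])


def interior(n):
    """Interior points in sequential (row-major) order."""
    return [(i, j) for i in range(1, n + 1) for j in range(1, n + 1)]


def sweep(A, order):
    """Apply in-place updates to the points in the given order."""
    for i, j in order:
        update(A, i, j)
    return A


def antidiagonal_phases(n):
    """One phase per anti-diagonal: all interior points with the same i + j."""
    return [[(i, d - i) for i in range(1, n + 1) if 1 <= d - i <= n]
            for d in range(2, 2 * n + 1)]


def red_black_phases(n):
    """Two phases: red points (i + j even), then black points (i + j odd)."""
    pts = interior(n)
    red = [(i, j) for i, j in pts if (i + j) % 2 == 0]
    black = [(i, j) for i, j in pts if (i + j) % 2 == 1]
    return [red, black]


def sweep_phases(A, phases, reverse_within_phase=False):
    """Run the phases one after another; each phase boundary models a barrier.

    Points inside one phase do not read each other, so the order inside
    a phase (forward or reversed) should not change the result.
    """
    for phase in phases:
        sweep(A, reversed(phase) if reverse_within_phase else phase)
    return A


if __name__ == "__main__":
    n = 4
    seq = sweep(make_grid(n), interior(n))
    wave = sweep_phases(make_grid(n), antidiagonal_phases(n))
    rb = sweep_phases(make_grid(n), red_black_phases(n))
    unordered = sweep(make_grid(n), reversed(interior(n)))
    print("anti-diagonal matches sequential:", wave == seq)
    print("red-black matches sequential:", rb == seq)
    print("reversed order matches sequential:", unordered == seq)
```

On this setup the anti-diagonal order reproduces the sequential sweep exactly, because every point still reads new values from (i, j-1) and (i-1, j) and old values from the other two neighbors. Red-black gives a different but repeatable result, independent of the order inside each color. With no ordering at all, the result depends on which neighbor updates happen to be visible, which is the non-determinism noted in the last bullet.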