
Decomposition of Iterative Equation Solver

Look for concurrency in loop iterations
- In this case the iterations are truly dependent
- Iteration (i, j) depends on iterations (i, j-1) and (i-1, j)
- Each anti-diagonal can be computed in parallel
- Must synchronize after each anti-diagonal (or use point-to-point synchronization)
- Alternative: red-black ordering (different update pattern)
  - Can update all red points first, synchronize globally with a barrier, and then update all black points
- May converge faster or slower than the sequential program
- Converged equilibrium may also be different if there are multiple solutions
- Ocean simulation uses this decomposition
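
The two orderings above can be sketched in plain C. This is a minimal sequential sketch, not the Ocean code: the grid size `N`, the `Grid` type with one layer of fixed boundary cells, and all function names are assumptions made for illustration. The comments mark where a parallel version would distribute work and where it would synchronize.

```c
#include <assert.h>
#include <math.h>

#define N 4                          /* interior points per side (hypothetical) */
typedef double Grid[N + 2][N + 2];   /* interior plus one fixed boundary layer */

/* Five-point update of interior point (i, j); returns the absolute change. */
static double update_point(Grid A, int i, int j)
{
    assert(i >= 1 && i <= N && j >= 1 && j <= N);
    double temp = A[i][j];
    A[i][j] = 0.2 * (A[i][j] + A[i][j + 1] + A[i][j - 1]
                     + A[i - 1][j] + A[i + 1][j]);
    return fabs(A[i][j] - temp);
}

/* Reference: ordinary row-major sequential sweep. */
double sweep_sequential(Grid A)
{
    double diff = 0.0;
    for (int i = 1; i <= N; i++)
        for (int j = 1; j <= N; j++)
            diff += update_point(A, i, j);
    return diff;
}

/* Anti-diagonal (wavefront) sweep: all points with the same d = i + j
   depend only on points of diagonal d - 1, so they are mutually independent. */
double sweep_antidiagonal(Grid A)
{
    double diff = 0.0;
    for (int d = 2; d <= 2 * N; d++) {
        int lo = (d - N > 1) ? d - N : 1;
        int hi = (d - 1 < N) ? d - 1 : N;
        for (int i = lo; i <= hi; i++)      /* iterations could run in parallel */
            diff += update_point(A, i, d - i);
        /* a parallel version would synchronize here, after each anti-diagonal */
    }
    return diff;
}

/* Red-black sweep: red points (i + j even) first, then black points (i + j odd).
   Every neighbor of a red point is black and vice versa. */
double sweep_red_black(Grid A)
{
    double diff = 0.0;
    for (int color = 0; color < 2; color++) {
        for (int i = 1; i <= N; i++)        /* iterations could run in parallel */
            for (int j = 1; j <= N; j++)
                if ((i + j) % 2 == color)
                    diff += update_point(A, i, j);
        /* a parallel version would place a global barrier here */
    }
    return diff;
}
```

The anti-diagonal sweep visits points in a different order but leaves the same grid values as the sequential sweep, because every point still sees updated values from (i-1, j) and (i, j-1) and old values from (i+1, j) and (i, j+1). The red-black sweep is a genuinely different update pattern and generally produces different intermediate values.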

We will ignore the loop-carried dependence and go ahead with a straightforward loop decomposition

Allow updates to all points in parallel
This is yet another update order and may affect convergence
Update to a point may or may not see the new updates to the nearest neighbors (this parallel algorithm is non-deterministic)

while (!done)
   diff = 0.0;
   for_all i = 0 to n-1
      for_all j = 0 to n-1
         temp = A[i, j];
         A[i, j] = 0.2*(A[i, j]+A[i, j+1]+A[i, j-1]+A[i-1, j]+A[i+1, j]);
         diff += fabs (A[i, j] - temp);
      end for_all
   end for_all
   if (diff/(n*n) < TOL) then done = 1;
end while

Offers concurrency across elements: degree of concurrency is n²
Make the j loop sequential to have row-wise decomposition: degree of concurrency is n
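
The row-wise decomposition can be sketched in C by making one row the unit of work. This is a sequential sketch under stated assumptions: `N`, the value of `TOL`, the global array with a fixed boundary layer, and the names `update_row` and `solve` are all hypothetical. The `for_all` over rows is written as an ordinary loop, which is just one of the execution orders the non-deterministic parallel algorithm allows.

```c
#include <assert.h>
#include <math.h>

#define N   8        /* interior points per side (hypothetical) */
#define TOL 1e-3     /* convergence tolerance (hypothetical value) */

static double A[N + 2][N + 2];   /* interior plus one fixed boundary layer */

/* One task of the row-wise decomposition: the j loop runs sequentially
   inside the task, so each sweep consists of N row tasks. */
static double update_row(int i)
{
    assert(i >= 1 && i <= N);
    double row_diff = 0.0;
    for (int j = 1; j <= N; j++) {
        double temp = A[i][j];
        A[i][j] = 0.2 * (A[i][j] + A[i][j + 1] + A[i][j - 1]
                         + A[i - 1][j] + A[i + 1][j]);
        row_diff += fabs(A[i][j] - temp);
    }
    return row_diff;
}

/* Sweeps until the average change per point drops below TOL.
   Returns the number of sweeps performed. */
int solve(void)
{
    int sweeps = 0;
    int done = 0;
    while (!done) {
        double diff = 0.0;
        /* for_all i: in a parallel program each iteration is an independent
           task and the per-row differences are combined by a reduction */
        for (int i = 1; i <= N; i++)
            diff += update_row(i);
        sweeps++;
        if (diff / (N * N) < TOL)
            done = 1;
    }
    return sweeps;
}
```

Returning a per-row difference and summing it in the caller keeps the shared accumulator `diff` out of the row task, which is the shape a parallel reduction would need.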