|
- Values written during one iteration of the outer loop are read during a later iteration of the outer loop
- The dependence disappears if the outer loop index is held fixed and the inner loop runs freely
- Therefore, the dependence is carried by the outer loop and is labeled with level 1
- Parallel code:
for I=1,N
X[I,1..M]=X[I-1,1..M]
endfor
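The claim that the inner loop "runs free" can be checked with a small sketch. It assumes the original nest was `X[I,J]=X[I-1,J]` (inferred from the parallel form above, not stated in these notes). The function names and the grid contents are hypothetical. For each fixed I, the inner iterations are run in a shuffled order to stand in for parallel execution, and the result is compared with the sequential one.

```python
import random

def make_grid(n, m):
    # Hypothetical test data: rows 0..n, columns 0..m, distinct values.
    return [[100 * i + j for j in range(m + 1)] for i in range(n + 1)]

def outer_carried_sequential(x, n, m):
    # Assumed original nest: X[I,J] = X[I-1,J], both loops in order.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            x[i][j] = x[i - 1][j]
    return x

def outer_carried_inner_shuffled(x, n, m, seed=0):
    # Outer loop stays sequential (it carries the dependence).
    # Inner iterations run in arbitrary order, as a parallel loop might.
    rng = random.Random(seed)
    for i in range(1, n + 1):
        cols = list(range(1, m + 1))
        rng.shuffle(cols)
        for j in cols:
            x[i][j] = x[i - 1][j]
    return x
```

Shuffling is only a stand-in for parallelism: it shows that no inner iteration reads a value written by another inner iteration of the same I.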
- Example
for I=1,100
  for J=1,100
    X[I,J]=X[I,J-1]
  endfor
endfor
|
|
- The dependence is carried by the inner loop
- Running the inner loop in parallel requires loop interchange
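A minimal sketch of the interchange for the example `X[I,J]=X[I,J-1]`, with hypothetical function names and test data. After interchange the J loop (which carries the dependence) is outermost, and the I loop becomes the inner loop. Its iterations are run in shuffled order to stand in for parallel execution.

```python
import random

def make_rows(n, m):
    # Hypothetical test data: rows 0..n, columns 0..m, distinct values.
    return [[100 * i + j for j in range(m + 1)] for i in range(n + 1)]

def inner_carried_original(x, n, m):
    # Original nest: the J loop (inner) carries the dependence.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            x[i][j] = x[i][j - 1]
    return x

def inner_carried_interchanged(x, n, m, seed=0):
    # Interchanged nest: J is now outermost and stays sequential.
    # The inner I iterations are independent, so their order is arbitrary.
    rng = random.Random(seed)
    for j in range(1, m + 1):
        rows = list(range(1, n + 1))
        rng.shuffle(rows)
        for i in rows:
            x[i][j] = x[i][j - 1]
    return x
```

Both versions propagate column 0 of each row across that row, so the results should match regardless of the order of the I iterations.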
|
Loop Interchange
- The most important loop restructuring transformation
- It was developed for automatic parallelization of loops
- If one loop carries all the dependences, it is moved to the outermost position
- The remaining loops, which carry no dependence, can then be executed in parallel
- It can be applied if there are no dependence cycles
- If the outer loop iterates many times and the inner loop only a few times, loop startup overhead is high because the inner loop is started once per outer iteration
- Interchanging the loops in that case reduces the overhead and improves performance
- Interchange can also improve spatial locality (recall the matrix multiplication example)
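The matrix multiplication remark can be illustrated with a sketch that assumes row-major storage (as in C). Function names are hypothetical. In the `ijk` order the innermost loop walks down a column of `b`. Interchanging the `j` and `k` loops makes the innermost loop walk along rows of `b` and `c`. Python lists will not show the cache effect itself, so the sketch only shows the access pattern and that the result is unchanged.

```python
def matmul_ijk(a, b, n):
    # Innermost loop varies k: b[k][j] is accessed column-wise (stride n).
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c

def matmul_ikj(a, b, n):
    # j and k loops interchanged: innermost loop varies j,
    # so b[k][j] and c[i][j] are accessed row-wise (stride 1).
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik = a[i][k]
            for j in range(n):
                c[i][j] += aik * b[k][j]
    return c
```

The interchange is safe here because each `c[i][j]` is only a sum of products. Integer inputs give identical results in both orders, while floating-point inputs may differ in the last bits because the summation order changes.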
|