Module 18: Loop Optimizations
  Lecture 35: Amdahl’s Law
 
  • Values written during one iteration of the outer loop are read during a later iteration of the outer loop
  • The dependence disappears if the outer loop is held fixed and the inner loop runs free
  • Therefore the dependence is carried by the outer loop, and it is labeled with 1 (the level of the carrying loop, counting from the outermost)
  • Parallel code (the I loop stays sequential; the inner loop becomes a single vector statement):
    for I=1,N
      X[I,1..M]=X[I-1,1..M]
    endfor
  • Example
    for I=1,100
      for J=1,100
        X[I,J]=X[I,J-1]
      endfor
    endfor
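  • Here the dependence is carried by the inner loop (level 2): each iteration of J reads the value written by the previous iteration of J
  • Running the inner loop in parallel requires loop interchange, so that the dependence-free I loop becomes the inner loop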
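The example above can be checked with a small sketch. This is an illustration, not part of the lecture: the function names, the boundary values in column 0, and the small sizes are all assumptions made for the demonstration. It suggests that interchanging the loops leaves the result unchanged, that the I iterations can then run in any order, and that reordering the J iterations does change the result.

```python
def make_array(n, m):
    # Rows 0..n and columns 0..m, so indices match the 1-based pseudocode.
    # Assumption: column 0 holds boundary values X[i][0] = i; the rest start at -1.
    return [[i if j == 0 else -1 for j in range(m + 1)] for i in range(n + 1)]


def original_order(n, m):
    x = make_array(n, m)
    for i in range(1, n + 1):        # outer loop: carries no dependence
        for j in range(1, m + 1):    # inner loop: carries the dependence on X[I,J-1]
            x[i][j] = x[i][j - 1]
    return x


def interchanged_order(n, m, reverse_inner=False):
    x = make_array(n, m)
    rows = range(n, 0, -1) if reverse_inner else range(1, n + 1)
    for j in range(1, m + 1):        # carrying loop, now outermost and sequential
        for i in rows:               # these iterations do not depend on each other
            x[i][j] = x[i][j - 1]
    return x


def reversed_j_order(n, m):
    x = make_array(n, m)
    for i in range(1, n + 1):
        for j in range(m, 0, -1):    # breaks the dependence: reads not-yet-written values
            x[i][j] = x[i][j - 1]
    return x
```

Running the inner I loop backwards stands in for "any order", which is what parallel or vector execution needs; the J loop must keep its original order in every version.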

Loop Interchange

  • The most important loop restructuring transformation
  • It was originally developed for automatic parallelization of loops
  • If one loop carries all the dependences, it is moved to the outermost position
  • The remaining loops, which carry no dependence, can then be executed in parallel
  • It can be done if there are no dependence cycles
  • If the outer loop iterates many times and the inner loop only a few times, loop startup overhead is high, because the inner loop is started once per outer iteration
  • Interchanging the loops reduces this overhead and can improve performance
  • Interchange can also improve spatial locality (remember matrix multiplication?)
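The spatial locality point can be illustrated with a minimal sketch that lists the memory offsets a loop nest touches. It assumes a row-major layout (as in C; Fortran arrays are column-major, which reverses the conclusion), and the helper names `flat_offsets` and `unit_stride_steps` are hypothetical.

```python
def flat_offsets(rows, cols, row_outer):
    """Return the row-major offsets of X[i][j] in the order the loop nest visits them."""
    offsets = []
    if row_outer:
        for i in range(rows):
            for j in range(cols):
                offsets.append(i * cols + j)   # consecutive addresses
    else:
        for j in range(cols):
            for i in range(rows):
                offsets.append(i * cols + j)   # jumps of `cols` elements
    return offsets


def unit_stride_steps(offsets):
    """Count consecutive accesses that move to the adjacent memory location."""
    return sum(1 for a, b in zip(offsets, offsets[1:]) if b - a == 1)
```

For a 4 x 5 array, the row-outer order moves to the adjacent element on every step, while the column-outer order never does. Interchanging the loops therefore switches between these two access patterns without changing which elements are visited.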