Loop Fusion
- When two adjacent countable loops have the same loop limits, they can sometimes be fused into a single loop
- Fusion reduces the cost of the loop test and branch
- Fusing loops that refer to the same data enhances temporal locality
- This can have a significant impact on cache and virtual memory performance
- Fusion may increase the size of the loop body, which can reduce instruction locality (noticeable with very small cache memories)
- Fusion is legal if all the dependence relations are preserved
- Before fusion, all dependence relations between the two loops flow from body1 to body2 (unless carried by an outer loop); after fusion, none of them may be reversed
Original loops (statements are numbered by line: S2, S5, S8):
for i = 1,n
  A[i]=B[i]+1
endfor
for i = 1,n
  C[i]=A[i]/2
endfor
for i = 1,n
  D[i]=1/C[i+1]
endfor
Dependence relations: S2 -> S5, S5 -> S8
Fusing all three loops:
for i = 1,n
  A[i]=B[i]+1
  C[i]=A[i]/2
  D[i]=1/C[i+1]
endfor
After fusion the second dependence (S5 -> S8) is violated: iteration i reads C[i+1] before iteration i+1 has written it.
Fusing only the first two loops preserves both dependences:
for i = 1,n
  A[i]=B[i]+1
  C[i]=A[i]/2
endfor
for i = 1,n
  D[i]=1/C[i+1]
endfor
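The effect of the illegal fusion can be checked with a small experiment. The following is a minimal Python sketch, not part of the original slides: the function names, the helper `_alloc`, and the initial contents of `C` (all 1.0) are assumptions made for illustration. Arrays are treated as 1-based to mirror the pseudocode.

```python
def _alloc(n, fill):
    # Index 0 is unused; index n+1 exists because D[i] reads C[i+1].
    return [fill] * (n + 2)

def unfused(B, n):
    A, C, D = _alloc(n, 0.0), _alloc(n, 1.0), _alloc(n, 0.0)
    for i in range(1, n + 1):
        A[i] = B[i] + 1
    for i in range(1, n + 1):
        C[i] = A[i] / 2
    for i in range(1, n + 1):
        D[i] = 1 / C[i + 1]
    return D[1:n + 1]

def fully_fused(B, n):
    # Illegal: D[i] reads C[i+1] before iteration i+1 has written it.
    A, C, D = _alloc(n, 0.0), _alloc(n, 1.0), _alloc(n, 0.0)
    for i in range(1, n + 1):
        A[i] = B[i] + 1
        C[i] = A[i] / 2
        D[i] = 1 / C[i + 1]
    return D[1:n + 1]

def partially_fused(B, n):
    # Legal: only the first two loops are fused.
    A, C, D = _alloc(n, 0.0), _alloc(n, 1.0), _alloc(n, 0.0)
    for i in range(1, n + 1):
        A[i] = B[i] + 1
        C[i] = A[i] / 2
    for i in range(1, n + 1):
        D[i] = 1 / C[i + 1]
    return D[1:n + 1]
```

With an input such as `B = [0.0, 1.0, 2.0, 3.0, 4.0]` and `n = 4`, `partially_fused` returns the same values as `unfused`, while `fully_fused` does not, because every `D[i]` is computed from the stale initial value of `C[i+1]`.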
Adjacent loops with different limits:
for i = 1,99
  A[i]=B[i]+1
endfor
for i = 1,98
  C[i]=A[i+1]*2
endfor
Peel the first iteration of the first loop so that both loops execute 98 iterations:
A[1]=B[1]+1
for i = 2,99
  A[i]=B[i]+1
endfor
for i = 1,98
  C[i]=A[i+1]*2
endfor
Rewrite both loops over a common index j and fuse:
A[1]=B[1]+1
for j = 0,97
  A[j+2]=B[j+2]+1
  C[j+1]=A[j+2]*2
endfor
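The peel-and-shift transformation can be checked against the original pair of loops. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, the arrays have 100 slots so that indices 1 to 99 are valid, and index 0 is unused.

```python
def two_loops(B):
    A = [0] * 100
    C = [0] * 100
    for i in range(1, 100):      # i = 1..99
        A[i] = B[i] + 1
    for i in range(1, 99):       # i = 1..98
        C[i] = A[i + 1] * 2
    return A, C

def peeled_and_fused(B):
    A = [0] * 100
    C = [0] * 100
    A[1] = B[1] + 1              # peeled first iteration
    for j in range(0, 98):       # j = 0..97
        A[j + 2] = B[j + 2] + 1  # writes A[2..99]
        C[j + 1] = A[j + 2] * 2  # reads the value just written
    return A, C
```

Each fused iteration produces `A[j+2]` immediately before it is consumed, so the flow dependence from the first statement to the second stays within a single iteration and is preserved.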
Loop Fission
- A single loop may be broken into several smaller loops (the inverse of loop fusion)
- Useful on machines that have a very small instruction cache
- Can improve memory locality
- To apply fission, construct a statement-level dependence graph of the body of the loop
- Dependence relations carried by an outer loop need not be preserved
- Inner loops are treated as single nodes
- If the graph has no cycles, loop fission can divide the loop into separate loops, one around each node
- The resulting loops are ordered in a topological order of the dependence graph
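The ordering step can be sketched with Python's standard `graphlib` module (Python 3.9 or later). This is an illustrative sketch only: the function `fission_order`, the statement labels, and the example dependence edges are assumptions, not taken from the slides. It handles only the acyclic case described above and returns `None` when the graph contains a cycle.

```python
from graphlib import TopologicalSorter, CycleError

def fission_order(statements, deps):
    """Return the order of the fissioned loops, or None if the graph is cyclic.

    statements: labels of the nodes in the loop body (hypothetical names)
    deps: (source, sink) pairs, one per dependence edge
    """
    # TopologicalSorter expects a mapping from each node to its predecessors.
    preds = {s: set() for s in statements}
    for src, dst in deps:
        preds[dst].add(src)
    try:
        return list(TopologicalSorter(preds).static_order())
    except CycleError:
        # The simple one-loop-per-node fission does not apply.
        return None
```

For example, with statements `["S3", "S1", "S2"]` and edges `S1 -> S2`, `S2 -> S3`, `S1 -> S3`, the loops would be emitted in the order S1, S2, S3, regardless of the order in which the statements were listed.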