Tree Barrier

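/* flag[P][MAX] is shared and initially all 0; MAX = log2(P) + 1 is sufficient.
   As written, the code assumes P is a power of two. */
void TreeBarrier (unsigned int pid, unsigned int P) {
    unsigned int i, mask;
    /* Arrival: wait for one child per level, for as long as pid has a 1 bit */
    for (i = 0, mask = 1; (mask & pid) != 0; ++i, mask <<= 1) {
        while (!flag[pid][i]);
        flag[pid][i] = 0;
    }
    if (pid < (P - 1)) {
        flag[pid + mask][i] = 1;        /* signal the parent at level i */
        while (!flag[pid][MAX - 1]);    /* wait to be released */
        flag[pid][MAX - 1] = 0;
    }
    /* Release: wake up the children collected during arrival */
    for (mask >>= 1; mask > 0; mask >>= 1) {
        flag[pid - mask][MAX - 1] = 1;
    }
}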
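
The pseudocode leaves the declaration of flag and the memory-ordering details implicit. The following is a minimal runnable sketch, assuming C11 atomics and POSIX threads; it is one possible rendering, not the only one. The names tree_barrier, worker, run_barrier_check, arrived, violations and ROUNDS are hypothetical, and P and MAX are fixed at 8 and 4 for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

#define P      8      /* number of threads; assumed to be a power of two */
#define MAX    4      /* log2(P) + 1 flags per thread */
#define ROUNDS 1000   /* barrier episodes per check (arbitrary choice) */

static_assert((P & (P - 1)) == 0 && (1 << (MAX - 1)) == P, "need P = 2^(MAX-1)");

static atomic_int flag[P][MAX];     /* static storage, so initially all 0 */
static atomic_int arrived[ROUNDS];  /* threads that reached episode r */
static atomic_int violations;       /* early exits observed */

/* Spin until *f is set, then reset it for the next episode. */
static void wait_and_clear(atomic_int *f) {
    while (!atomic_load(f))
        sched_yield();              /* be polite if threads outnumber cores */
    atomic_store(f, 0);
}

/* Same control flow as the slide's TreeBarrier, with P fixed at compile time. */
void tree_barrier(unsigned int pid) {
    unsigned int i, mask;
    for (i = 0, mask = 1; (mask & pid) != 0; ++i, mask <<= 1)
        wait_and_clear(&flag[pid][i]);
    if (pid < P - 1) {
        atomic_store(&flag[pid + mask][i], 1);
        wait_and_clear(&flag[pid][MAX - 1]);
    }
    for (mask >>= 1; mask > 0; mask >>= 1)
        atomic_store(&flag[pid - mask][MAX - 1], 1);
}

static void *worker(void *arg) {
    unsigned int pid = (unsigned int)(uintptr_t)arg;
    for (int r = 0; r < ROUNDS; ++r) {
        atomic_fetch_add(&arrived[r], 1);
        tree_barrier(pid);
        /* After the barrier, all P threads must have arrived in episode r. */
        if (atomic_load(&arrived[r]) != P)
            atomic_fetch_add(&violations, 1);
    }
    return NULL;
}

/* Runs P threads through ROUNDS episodes; returns the number of early exits. */
int run_barrier_check(void) {
    pthread_t t[P];
    for (int r = 0; r < ROUNDS; ++r) atomic_store(&arrived[r], 0);
    atomic_store(&violations, 0);
    for (unsigned int p = 0; p < P; ++p)
        pthread_create(&t[p], NULL, worker, (void *)(uintptr_t)p);
    for (unsigned int p = 0; p < P; ++p)
        pthread_join(t[p], NULL);
    return atomic_load(&violations);
}
```

Calling run_barrier_check() from a main function and building with something like cc -std=c11 -pthread should return 0 if no thread ever leaves an episode before all 8 have arrived. The sketch uses the default sequentially consistent atomics for simplicity; weaker orderings are not explored here.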

Convince yourself that this works
Take 8 processors and arrange them on leaves of a tree of depth 3
You will find that only the odd nodes at each level move up during the arrival (acquire) phase, which is implemented in the first for loop
The even nodes just set a flag of their parent (the first statement inside the if block): they bail out of the first loop immediately, with mask = 1
The release is initiated by the last processor in the last for loop; only odd nodes execute this loop (7 wakes up 3, 5, 6; 5 wakes up 4; 3 wakes up 1, 2; 1 wakes up 0)
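
The signal and wake-up pattern described above can be checked without any threads by replaying the index arithmetic of the two for loops. This is a small sketch under the assumption that P is a power of two; the helper names arrival_level, arrival_parent, release_targets and print_tree are hypothetical and do not appear in the original code.

```c
#include <assert.h>
#include <stdio.h>

/* Level at which pid stops climbing: the number of trailing 1 bits in pid.
   This is the value of i when the first for loop of TreeBarrier exits. */
unsigned int arrival_level(unsigned int pid) {
    unsigned int i = 0;
    for (unsigned int mask = 1; (mask & pid) != 0; mask <<= 1) ++i;
    return i;
}

/* Node whose flag pid sets during arrival, or -1 for the root (pid == P - 1). */
int arrival_parent(unsigned int pid, unsigned int P) {
    assert(P != 0 && (P & (P - 1)) == 0);   /* assumption: P is a power of two */
    if (pid == P - 1) return -1;
    return (int)(pid + (1u << arrival_level(pid)));
}

/* Nodes that pid wakes during release, in the order of the last for loop.
   Writes them to out[] and returns how many there are. */
unsigned int release_targets(unsigned int pid, unsigned int out[]) {
    unsigned int n = 0;
    for (unsigned int mask = (1u << arrival_level(pid)) >> 1; mask > 0; mask >>= 1)
        out[n++] = pid - mask;
    return n;
}

/* Print, for every pid, the node it signals and the nodes it wakes. */
void print_tree(unsigned int P) {
    for (unsigned int pid = 0; pid < P; ++pid) {
        unsigned int kids[32];
        unsigned int n = release_targets(pid, kids);
        printf("%u: signals %d, wakes", pid, arrival_parent(pid, P));
        for (unsigned int k = 0; k < n; ++k) printf(" %u", kids[k]);
        printf("\n");
    }
}
```

Calling print_tree(8) lists one line per processor, which can be compared against the wake-up chain given above for processors 7, 5, 3 and 1.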

Each processor needs at most log2(P) + 1 flags: up to log2(P) arrival flags (one per tree level) plus one release flag at index MAX-1
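
As a quick check of the flag count, the sketch below computes the per-processor array width for a given P. It assumes P is a power of two; flags_per_processor and total_flags are hypothetical helper names.

```c
#include <assert.h>

/* Width of each processor's flag array: log2(P) arrival flags + 1 release flag. */
unsigned int flags_per_processor(unsigned int P) {
    assert(P != 0 && (P & (P - 1)) == 0);   /* assumption: P is a power of two */
    unsigned int levels = 0;
    while ((1u << levels) < P) ++levels;    /* levels = log2(P) */
    return levels + 1;
}

/* Total number of flags in the shared flag[P][MAX] array. */
unsigned int total_flags(unsigned int P) {
    return P * flags_per_processor(P);
}
```

For the 8-processor example this gives MAX = 4, so the shared array holds 32 flags in total.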
Avoid false sharing: allocate each processor's flags on a separate chunk of cache lines
At the cost of some wasted memory (possibly worth it), allocate each processor's flags on a separate page and map that page to that processor's local physical memory
- Avoid remote misses in DSM multiprocessor
- Does not matter in bus-based SMPs
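
The two layout options above can be sketched in C11 as follows. The line size (64 bytes) and page size (4096 bytes) are assumptions that should be checked on the target machine, and all names here (proc_flags_t, alloc_flags_per_line, alloc_flags_per_page, page_flags, init_my_flags, LINE_BYTES, FLAG_PAGE_BYTES) are hypothetical. The code only controls virtual layout; where a page lands physically is up to the operating system. On systems with a first-touch placement policy, having each processor initialize its own page is a common way to get it mapped locally, which is why initialization is left to init_my_flags rather than done by the allocator.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX             4      /* log2(P) + 1 for P = 8 */
#define LINE_BYTES      64     /* assumed cache line size */
#define FLAG_PAGE_BYTES 4096   /* assumed page size */

/* One processor's flags, aligned and padded to a whole cache line. */
typedef struct {
    alignas(LINE_BYTES) atomic_int flag[MAX];
} proc_flags_t;

/* Cache-line variant: element p of the returned array belongs to processor p. */
proc_flags_t *alloc_flags_per_line(size_t P) {
    proc_flags_t *f = aligned_alloc(LINE_BYTES, P * sizeof *f);
    if (f != NULL)
        for (size_t p = 0; p < P; ++p)
            for (int i = 0; i < MAX; ++i)
                atomic_init(&f[p].flag[i], 0);
    return f;
}

/* Page variant: reserves one page per processor, left untouched on purpose. */
void *alloc_flags_per_page(size_t P) {
    return aligned_alloc(FLAG_PAGE_BYTES, P * FLAG_PAGE_BYTES);
}

/* Processor pid's flags live at the start of its own page. */
atomic_int *page_flags(void *base, size_t pid) {
    return (atomic_int *)((char *)base + pid * FLAG_PAGE_BYTES);
}

/* Meant to be called by processor pid itself, before the first barrier. */
void init_my_flags(void *base, size_t pid) {
    atomic_int *f = page_flags(base, pid);
    for (int i = 0; i < MAX; ++i)
        atomic_init(&f[i], 0);
}
```

With P = 8 the cache-line variant uses 512 bytes for 32 flags, while the page variant uses 32 KB, which is the memory wastage mentioned above.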