Module 7: "Parallel Programming"
  Lecture 13: "Parallelizing a Sequential Program"
 

Message passing

  • This algorithm is deterministic
  • May converge to a different solution than the shared memory version if there are multiple solutions: why?
    • There is a fixed, well-defined point in the program (the beginning of each iteration) at which the neighboring rows are communicated, so every run computes with the neighbors' previous-iteration values
    • This is not true for shared memory, where a process may read a neighbor's row before or after it is updated in the current iteration; the sketch below illustrates the contrast
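
A sketch of the contrast (not from the lecture; the array, macros, and names are reused from the solver code later in this section):

    /* Shared memory: a process reads its neighbors' boundary rows
       directly from the shared array.  Whether A[i-1][j] or A[i+1][j]
       holds a current-sweep or a previous-sweep value depends on how
       the processes happen to be scheduled, so runs can differ. */
    A[i][j] = 0.2 * (A[i][j] + A[i][j-1] + A[i][j+1]
                     + A[i-1][j]     /* new or old: timing-dependent */
                     + A[i+1][j]);   /* new or old: timing-dependent */

    /* Message passing: rows 0 and N/P+1 are private ghost copies,
       filled by explicit RECVs at a fixed point (the top of each
       iteration), so on every run they hold exactly the neighbors'
       previous-iteration values. */
    RECV(&A[0][1], N*sizeof(float), MPI_CHAR, pid-1, ROW);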

Message Passing Grid Solver

MPI-like environment

  • MPI stands for Message Passing Interface
    • A C library that provides a set of message passing primitives (e.g., send, receive, broadcast, etc.) to the user
  • PVM (Parallel Virtual Machine) is another well-known platform for message passing programming
  • Background in MPI is not necessary for understanding this lecture
  • Only need to know
    • When you start an MPI program, every thread runs the same main function
    • We will assume that we pin one thread to one processor just as we did in shared memory
  • Instead of using the exact MPI syntax we will use some macros that call the MPI functions
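
The macro definitions themselves are not part of the lecture; one plausible expansion onto real MPI calls is sketched here (a sketch only: it assumes MPI_COMM_WORLD throughout and ignores error codes):

    #include <mpi.h>

    /* A possible expansion of the macros -- illustrative, not the
       lecture's actual definitions */
    #define MAIN_ENV              /* nothing extra needed for MPI */
    #define MAIN_INITENV          MPI_Init(&argc, &argv)
    #define MAIN_END              MPI_Finalize()
    #define GET_PID(pid)          MPI_Comm_rank(MPI_COMM_WORLD, &(pid))
    #define GET_NUMPROCS(P)       MPI_Comm_size(MPI_COMM_WORLD, &(P))
    #define SEND(buf, n, type, dst, tag) \
            MPI_Send((buf), (n), (type), (dst), (tag), MPI_COMM_WORLD)
    #define RECV(buf, n, type, src, tag) \
            MPI_Recv((buf), (n), (type), (src), (tag), MPI_COMM_WORLD, \
                     MPI_STATUS_IGNORE)

The full grid solver, written with these macros, follows.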

    #include <stdlib.h>   /* malloc, atoi */
    #include <math.h>     /* fabs */

    MAIN_ENV;
    /* define message tags */
    #define ROW  99
    #define DIFF 98
    #define DONE 97
    /* TOL (convergence threshold) and initialize() are defined elsewhere */

    int main(int argc, char **argv)
    {
       int pid, P, done, i, j, N;
       float tempdiff, local_diff, temp, **A;

       MAIN_INITENV;
       GET_PID(pid);
       GET_NUMPROCS(P);
       N = atoi(argv[1]);
       tempdiff = 0.0;
       done = 0;
       /* Each process holds N/P interior rows plus two ghost rows:
          row 0 mirrors the upper neighbor's last row and row N/P+1
          mirrors the lower neighbor's first row */
       A = (float **) malloc((N/P+2) * sizeof(float *));
       for (i = 0; i < N/P+2; i++) {
          A[i] = (float *) malloc(sizeof(float) * (N+2));
       }
       initialize(A);
       while (!done) {
          local_diff = 0.0;
          /* MPI_CHAR means raw byte format */
          if (pid) {  /* send my first row up */
             SEND(&A[1][1], N*sizeof(float), MPI_CHAR, pid-1, ROW);
          }
          if (pid != P-1) {  /* recv into my bottom ghost row from below */
             RECV(&A[N/P+1][1], N*sizeof(float), MPI_CHAR, pid+1, ROW);
          }
          if (pid != P-1) {  /* send my last row down */
             SEND(&A[N/P][1], N*sizeof(float), MPI_CHAR, pid+1, ROW);
          }
          if (pid) {  /* recv into my top ghost row from above */
             RECV(&A[0][1], N*sizeof(float), MPI_CHAR, pid-1, ROW);
          }
          /* five-point stencil sweep over my N/P rows */
          for (i = 1; i <= N/P; i++) {
             for (j = 1; j <= N; j++) {
                temp = A[i][j];
                A[i][j] = 0.2 * (A[i][j] + A[i][j-1] + A[i-1][j]
                                 + A[i][j+1] + A[i+1][j]);
                local_diff += fabs(A[i][j] - temp);
             }
          }
          if (pid) {  /* tell P0 my diff */
             SEND(&local_diff, sizeof(float), MPI_CHAR, 0, DIFF);
             RECV(&done, sizeof(int), MPI_CHAR, 0, DONE);
          }
          else {  /* recv from all and add up */
             for (i = 1; i < P; i++) {
                RECV(&tempdiff, sizeof(float), MPI_CHAR, MPI_ANY_SOURCE, DIFF);
                local_diff += tempdiff;
             }
             if (local_diff/(N*N) < TOL) done = 1;
             for (i = 1; i < P; i++) {
                /* tell all if done */
                SEND(&done, sizeof(int), MPI_CHAR, i, DONE);
             }
          }
       }  /* end while */
       MAIN_END;
    }  /* end main */

  • Note the matching tags in SEND and RECV
  • Macros used in this program
    • GET_PID
    • GET_NUMPROCS
    • SEND
    • RECV
  • These will get expanded into specific MPI library calls (one possible expansion is sketched above)
  • Syntax of SEND/RECV
    • Starting address, number of elements, type of each element (we have used bytes only), source/destination process, message tag
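
Because the macros ship raw bytes, the element count in SEND/RECV above is a byte count paired with MPI_CHAR. In real MPI one would more likely send typed elements; for example, the row exchange could be written as follows (illustrative, not the lecture's code):

    /* N floats, typed: equivalent to N*sizeof(float) bytes as MPI_CHAR */
    MPI_Send(&A[1][1], N, MPI_FLOAT, pid-1, ROW, MPI_COMM_WORLD);

With the macros expanded as sketched earlier, the program would be built and launched with the usual MPI tooling, e.g., mpicc -o solver solver.c and then mpirun -np 4 ./solver 1024 (the file name, binary name, and sizes here are hypothetical).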