Module 3: Fundamentals of Parallel Computers: ILP vs TLP
  Lecture 6: Preliminaries of Parallel Programming
 


Orchestration

  • Involves structuring communication and synchronization among processes, organizing data structures to improve locality, and scheduling tasks
    • This step normally depends on the programming model and the underlying architecture
  • Goal is to
    • Reduce communication and synchronization costs
    • Maximize locality of data reference
    • Schedule tasks to maximize concurrency: do not schedule dependent tasks in parallel
    • Reduce the overhead of parallelization and concurrency management (e.g., management of the task queue, overhead of initiating a task, etc.)
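
The orchestration goals above can be illustrated with a small shared task queue. This is a hypothetical sketch, not from the notes: the worker count, the task functions, and the use of `queue.Queue` plus a lock are illustrative assumptions. Note how every dequeue and every result append requires synchronization, which is exactly the concurrency-management overhead the last bullet refers to.

```python
import queue
import threading

def run_tasks(tasks, num_workers=4):
    """Run independent tasks on worker threads via a shared queue."""
    q = queue.Queue()
    for t in tasks:
        q.put(t)                       # task-queue management overhead
    results = []
    results_lock = threading.Lock()    # synchronization cost per result

    def worker():
        while True:
            try:
                task = q.get_nowait()  # overhead of initiating a task
            except queue.Empty:
                return
            r = task()
            with results_lock:         # structured synchronization
                results.append(r)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results

if __name__ == "__main__":
    # Only independent tasks are enqueued together, per the goal of not
    # scheduling dependent tasks in parallel; order is nondeterministic.
    print(sorted(run_tasks([lambda i=i: i * i for i in range(8)])))
```

Because completion order is nondeterministic, any consumer of `results` must not assume task order; that is a direct consequence of maximizing concurrency.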

Mapping

  • At this point you have a parallel program
    • Just need to decide which and how many processes go to each processor of the parallel machine
  • Could be specified by the program
    • Pin each process to a particular processor for the whole life of the program; the processes cannot migrate to other processors
  • Could be controlled entirely by the OS
    • Schedule processes on idle processors
    • Various scheduling algorithms are possible, e.g., round-robin: process #k goes to processor #(k mod P) on a P-processor machine
    • A NUMA-aware OS normally takes multiprocessor-specific metrics (e.g., memory locality) into account when scheduling
  • How many processes per processor? The most common choice is one-to-one
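
A program-specified mapping can be sketched as follows. This is a minimal, assumption-laden example: it uses Python's `os.sched_setaffinity`, which exists only on Linux (hence the `hasattr` guard), and the round-robin pinning rule mirrors the scheduling bullet above; the worker's computation is a stand-in for real work.

```python
import os
import multiprocessing as mp

def pinned_worker(k, out_q):
    """Hypothetical worker: pin process #k to processor #(k mod P)."""
    if hasattr(os, "sched_setaffinity"):      # Linux-specific system call
        ncpus = os.cpu_count() or 1
        # Program-specified mapping: the process stays on this processor
        # for its whole lifetime and cannot migrate.
        os.sched_setaffinity(0, {k % ncpus})
    out_q.put((k, sum(range(1000))))          # stand-in for real work

def run(nprocs=4):
    """One-to-one mapping: one process per (logical) processor."""
    out_q = mp.Queue()
    procs = [mp.Process(target=pinned_worker, args=(k, out_q))
             for k in range(nprocs)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return sorted(out_q.get() for _ in range(nprocs))

if __name__ == "__main__":
    print(run())
```

If the affinity call is omitted, the mapping is instead controlled entirely by the OS scheduler, which is the second alternative described above.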