Dynamic task queues
- Introduced in the last lecture
- Normally implemented as part of the parallel program
- Two possible designs
- Centralized task queue: a single queue of tasks; may lead to heavy contention because every insertion into and deletion from the queue must be a critical section
- Distributed task queues: one queue per processor
- Issue with distributed task queues
- When a processor’s own queue is empty, what does it do? It steals tasks from other queues
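The two designs can be contrasted with a minimal sketch. It assumes Python threads stand in for processors and that all tasks are inserted before the workers start, so an empty queue simply means "done". The names CentralQueue, DistributedQueues and run_workers are hypothetical, chosen only for illustration.

```python
import threading
from collections import deque


class CentralQueue:
    """Single shared queue: every insert and delete is a critical section."""

    def __init__(self):
        self._tasks = deque()
        self._lock = threading.Lock()  # one lock, contended by all workers

    def put(self, task):
        with self._lock:
            self._tasks.append(task)

    def get(self):
        with self._lock:
            return self._tasks.popleft() if self._tasks else None


class DistributedQueues:
    """One queue per worker, each protected by its own lock."""

    def __init__(self, num_workers):
        self._queues = [deque() for _ in range(num_workers)]
        self._locks = [threading.Lock() for _ in range(num_workers)]

    def put(self, worker, task):
        with self._locks[worker]:
            self._queues[worker].append(task)

    def get(self, worker):
        with self._locks[worker]:
            q = self._queues[worker]
            return q.popleft() if q else None


def run_workers(num_workers, get_task):
    """Each worker calls get_task(worker_id) until it returns None."""
    totals = [0] * num_workers

    def work(wid):
        while True:
            task = get_task(wid)
            if task is None:
                break  # no stealing here: an empty queue means this worker stops
            totals[wid] += task()

    threads = [threading.Thread(target=work, args=(w,)) for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals
```

With the centralized queue, workers would call run_workers(4, lambda wid: cq.get()) and all compete for one lock. With distributed queues, run_workers(4, dq.get) makes each worker touch only its own queue, but a worker whose queue empties early sits idle while others still have work.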
Task stealing
- A processor whose own queue is empty may choose to steal tasks from another processor’s queue
- How many tasks to steal? Whom to steal from?
- The biggest question: how to detect termination? This is really a distributed consensus problem!
- Task stealing, in general, may increase overhead and communication, but a smart design may lead to excellent load balance (normally hard to design efficiently)
- This is a form of a more general technique called Receiver Initiated Diffusion (RID) where the receiver of the task initiates the task transfer
- In Sender Initiated Diffusion (SID), a processor may choose to insert tasks into another processor’s queue if its own task queue has grown beyond a threshold
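A minimal sketch of receiver-initiated stealing is given below, under several assumptions that are not from the lecture: Python threads stand in for processors, an idle worker steals one task at a time from a randomly chosen victim, and termination is detected with a shared counter of unfinished tasks. The shared counter is a shortcut that only works with shared memory; it is not a distributed consensus protocol. The class name StealingPool and its methods are hypothetical.

```python
import random
import threading
import time
from collections import deque


class StealingPool:
    def __init__(self, num_workers):
        self.n = num_workers
        self.queues = [deque() for _ in range(num_workers)]
        self.locks = [threading.Lock() for _ in range(num_workers)]
        self.pending = 0  # tasks created but not yet finished
        self.pending_lock = threading.Lock()
        self.steals = [0] * num_workers  # steals performed by each worker

    def submit(self, worker, task):
        # Count the task before it becomes visible, so pending == 0
        # can never be observed while work still exists.
        with self.pending_lock:
            self.pending += 1
        with self.locks[worker]:
            self.queues[worker].append(task)

    def _pop_local(self, wid):
        with self.locks[wid]:
            q = self.queues[wid]
            return q.pop() if q else None  # owner takes the newest task

    def _steal(self, wid):
        # Receiver-initiated: the idle worker picks the victims.
        victims = [v for v in range(self.n) if v != wid]
        random.shuffle(victims)
        for v in victims:
            with self.locks[v]:
                q = self.queues[v]
                if q:
                    self.steals[wid] += 1
                    return q.popleft()  # thief takes the oldest task
        return None

    def _worker(self, wid):
        while True:
            task = self._pop_local(wid)
            if task is None:
                task = self._steal(wid)
            if task is None:
                with self.pending_lock:
                    if self.pending == 0:
                        return  # nothing queued, nothing running
                time.sleep(0)  # yield, then retry
                continue
            task(self, wid)  # a task may submit new tasks
            with self.pending_lock:
                self.pending -= 1

    def run(self):
        threads = [threading.Thread(target=self._worker, args=(w,))
                   for w in range(self.n)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
```

A task here is any callable taking (pool, worker_id); it may call pool.submit(worker_id, child) to create more work. The counter is decremented only after a task returns, so children it spawned are already counted. A sender-initiated variant would instead have submit push the task to another worker's queue once the local queue length crosses a threshold.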
Architect’s job
- Normally load balancing is a responsibility of the programmer
- However, an architecture may provide efficient primitives to implement task queues and task stealing
- For example, the task queue may be allocated in a special shared memory segment, accesses to which may be optimized by special hardware in the memory controller
- But this may expose some of the architectural features to the programmer
- There are multiprocessors that provide efficient implementations for certain synchronization primitives; this may improve load balance
- Sophisticated hardware tricks are possible: dynamic load monitoring and dynamically favoring slow threads
Partitioning and communication
- Need to reduce inherent communication
- This is the part of communication determined by the assignment of tasks to processes
- There may be other communication traffic also (more later)
- Goal is to assign tasks such that accessed data are mostly local to a process
- Ideally I do not want any communication
- But in life sometimes you need to talk to people to get some work done!
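To see how assignment alone determines inherent communication, here is a small worked example under assumed conditions that are not from the lecture: a one-dimensional array of n elements, where task i reads its neighbors i-1 and i+1, and each element is owned by the process that runs its task. The helper remote_reads is hypothetical; it counts neighbor reads that cross a process boundary.

```python
def remote_reads(n, owner):
    """Count neighbor reads whose data is owned by a different process."""
    count = 0
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n and owner(j) != owner(i):
                count += 1
    return count


n, p = 16, 4

# Block assignment: process k owns elements 4k .. 4k+3.
block = remote_reads(n, lambda i: i // (n // p))

# Cyclic (interleaved) assignment: element i goes to process i mod p.
cyclic = remote_reads(n, lambda i: i % p)

print(block, cyclic)  # prints "6 30"
```

With block assignment only the three internal boundaries generate remote reads (two each), while with cyclic assignment every neighbor lives on another process. The work per process is identical in both cases; only the assignment differs.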