Module 6: "Fundamentals of Parallel Computers"
  Lecture 11: "Design Issues in Parallel Computers"
 

Dataflow architecture

  • Express the program as a dataflow graph
  • Logical processor at each node is activated when both operands are available
    • Mapping of logical nodes to processing elements (PEs) is specified by the program
  • On finishing an operation, a message or token is sent to the destination processor
  • Arriving tokens are matched against a token store and a match triggers the operation
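
The token-matching rule above can be illustrated with a minimal sketch. It assumes two-operand nodes and a hypothetical graph for (a + b) * (a - b); the names GRAPH, run_dataflow and example are invented for illustration. A single queue stands in for the network, and the mapping of logical nodes to PEs is not modeled.

```python
import operator
from collections import deque

# Hypothetical dataflow graph for (a + b) * (a - b).
# node name -> (operation, list of (destination node, operand port)).
# A destination node of None marks the program output.
GRAPH = {
    "add": (operator.add, [("mul", 0)]),
    "sub": (operator.sub, [("mul", 1)]),
    "mul": (operator.mul, [(None, 0)]),
}


def run_dataflow(graph, initial_tokens):
    """Fire each node once both of its operand tokens have arrived."""
    token_store = {}                # node -> {port: value} awaiting a partner
    in_flight = deque(initial_tokens)  # tokens: (node, port, value)
    outputs = []
    while in_flight:
        node, port, value = in_flight.popleft()
        if node is None:            # token addressed to the program output
            outputs.append(value)
            continue
        waiting = token_store.setdefault(node, {})
        waiting[port] = value
        if len(waiting) == 2:       # match found: both operands are present
            del token_store[node]
            op, destinations = graph[node]
            result = op(waiting[0], waiting[1])
            # On finishing, send a token to every destination node.
            for dest_node, dest_port in destinations:
                in_flight.append((dest_node, dest_port, result))
    return outputs


def example(a, b):
    tokens = [("add", 0, a), ("add", 1, b), ("sub", 0, a), ("sub", 1, b)]
    return run_dataflow(GRAPH, tokens)
```

Calling example(5, 3) injects four input tokens; "add" and "sub" fire as soon as their operands match, and their result tokens in turn trigger "mul".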

Systolic arrays

  • Replace the pipeline within a sequential processor by an array of PEs
  • Each PE may have a small instruction and data memory and may carry out a different operation
  • Data proceeds through the array at regular “heartbeats” (hence the name)
  • The dataflow may be multi-directional or optimized for specific algorithms
    • Optimize the interconnect for a specific application (not necessarily a linear topology)
  • Practical implementation in iWARP
    • Uses general purpose processors as PEs
    • Dedicated channels between PEs for direct register to register communication
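
As one illustration of data moving through a PE array in lockstep, the sketch below simulates a two-dimensional systolic array computing a matrix product. This is an assumed example, not a description of iWARP: every PE performs the same multiply-accumulate (the lecture notes that PEs may in general differ), rows of A enter from the left, columns of B enter from the top, and each input stream is delayed by one heartbeat per row or column. The function name systolic_matmul is hypothetical.

```python
def systolic_matmul(A, B):
    """Simulate an n x m grid of multiply-accumulate PEs computing A @ B.

    A is n x k and B is k x m (lists of lists). PE (i, j) keeps the
    running sum for C[i][j]; operands move one PE per heartbeat.
    """
    n, k, m = len(A), len(B), len(B[0])
    a_reg = [[0] * m for _ in range(n)]  # A operand currently held by each PE
    b_reg = [[0] * m for _ in range(n)]  # B operand currently held by each PE
    acc = [[0] * m for _ in range(n)]    # stationary partial results

    for t in range(n + m + k - 2):       # heartbeats until the last operands meet
        new_a = [[0] * m for _ in range(n)]
        new_b = [[0] * m for _ in range(n)]
        for i in range(n):
            for j in range(m):
                if j == 0:
                    s = t - i            # row i of A is delayed by i beats
                    new_a[i][j] = A[i][s] if 0 <= s < k else 0
                else:
                    new_a[i][j] = a_reg[i][j - 1]  # pass from the left neighbor
                if i == 0:
                    s = t - j            # column j of B is delayed by j beats
                    new_b[i][j] = B[s][j] if 0 <= s < k else 0
                else:
                    new_b[i][j] = b_reg[i - 1][j]  # pass from the upper neighbor
        a_reg, b_reg = new_a, new_b
        # Every PE performs one multiply-accumulate per heartbeat.
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc
```

The input skew ensures that at heartbeat t, PE (i, j) holds A[i][t-i-j] and B[t-i-j][j] together, so matching operands always meet at the correct PE without any global control.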

A generic architecture

  • In all the architectures we have discussed thus far, a node essentially contains processor(s) + caches, memory, and a communication assist (CA)
    • CA = network interface (NI) + communication controller
  • The nodes are connected over a scalable network
  • The main difference remains in the architecture of the CA
    • And even under a particular programming model (e.g., shared memory) there are many choices in the design of the CA
    • Most innovations in parallel architecture take place in the communication assist (also called communication controller or node controller)
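
The generic node organization can be sketched as a toy model. All class and method names here (CommunicationAssist, Node, Network, send, receive, deliver) are hypothetical, the network is reduced to a direct hand-off between nodes, and processors and caches are omitted; the point is only that all inter-node traffic passes through the CA.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CommunicationAssist:
    """Toy CA: a receive queue stands in for the network interface."""
    node_id: int
    inbox: deque = field(default_factory=deque)

    def send(self, network, dest, payload):
        # Hand the message to the network, tagged with the sender's id.
        network.deliver(self.node_id, dest, payload)

    def receive(self):
        # Return the oldest pending (source, payload) pair, or None.
        return self.inbox.popleft() if self.inbox else None


@dataclass
class Node:
    """Toy node: local memory plus a CA (processors and caches omitted)."""
    node_id: int
    memory: dict = field(default_factory=dict)
    ca: CommunicationAssist = field(init=False)

    def __post_init__(self):
        self.ca = CommunicationAssist(self.node_id)


class Network:
    """Toy scalable network: delivers directly into the destination CA."""

    def __init__(self, num_nodes):
        self.nodes = [Node(i) for i in range(num_nodes)]

    def deliver(self, src, dest, payload):
        self.nodes[dest].ca.inbox.append((src, payload))
```

For example, after net = Network(2) and net.nodes[0].ca.send(net, 1, "hello"), calling net.nodes[1].ca.receive() returns the pair (0, "hello"). Different architectures would change what the CA does on send and receive, not the overall node layout.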