Why parallel architecture?
- Parallelism improves performance
- Many applications can be parallelized easily
- There are important applications that require enormous amounts of computation (10 GFLOPS to 1 TFLOPS)
  - Example: NASA taps SGI, Intel for supercomputers: twenty 512-processor SGI Altix systems using Itanium 2 (http://zdnet.com.com/2100-1103_2-5286156.html) [27 July 2004]
- There are important applications that need to deliver high throughput
Why study it?
- Parallelism is ubiquitous
- Need to understand the design trade-offs
- Microprocessors are now multiprocessors (more later)
- Today a computer architect’s primary job is to figure out how to extract parallelism efficiently
- Get involved in interesting research projects
- Make an impact
- Help shape future developments
- Have fun
Performance metrics
- Need benchmark applications, for example:
  - SPLASH (Stanford ParalleL Applications for SHared memory)
  - SPEC (Standard Performance Evaluation Corp.) OMP
  - ScaLAPACK (Scalable Linear Algebra PACKage) for message-passing machines
  - TPC (Transaction Processing Performance Council) for database/transaction processing performance
  - NAS (Numerical Aerodynamic Simulation) for aerophysics applications
    - NPB2 (the MPI port) for message-passing machines only
  - PARKBENCH (PARallel Kernels and BENCHmarks) for message-passing machines only
- Comparing two different parallel computers
  - Execution time is the most reliable metric
  - Rates such as MFLOPS, GFLOPS, or TFLOPS are sometimes used, but they can be misleading (see the worked example after this list)
- Evaluating a particular machine
  - Use speedup to gauge the scalability of the machine (provided the application itself scales)
  - Speedup(P) = uniprocessor execution time / execution time on P processors
  - Normally the input data set is kept constant when measuring speedup (a measurement sketch follows this list)
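
Why a rate metric can mislead, as a hypothetical worked example (the machines and numbers are invented for illustration): suppose machine A sustains 800 MFLOPS on a solver that performs 2 x 10^9 floating-point operations, finishing in 2.5 s, while machine B sustains only 500 MFLOPS on a better algorithm that needs just 1 x 10^9 operations, finishing in 2.0 s. Machine B reports the lower FLOP rate yet delivers the shorter execution time, which is what the user actually experiences.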
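
A minimal sketch of measuring speedup, assuming an OpenMP-capable compiler; the dot-product kernel, the problem size N, and the thread counts 2/4/8 are illustrative choices, not from the notes. Note that the input stays fixed while the processor count varies, matching the constant-data-set convention above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N 20000000L  /* fixed input size: the data set does not grow with P */

/* Illustrative kernel: a parallel dot product over two arrays. */
static double dot(const double *a, const double *b, int nthreads)
{
    double sum = 0.0;
    omp_set_num_threads(nthreads);
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < N; i++)
        sum += a[i] * b[i];
    return sum;
}

int main(void)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    for (long i = 0; i < N; i++) { a[i] = 1.0; b[i] = 0.5; }

    /* Uniprocessor baseline, T(1). */
    double t0  = omp_get_wtime();
    double ref = dot(a, b, 1);
    double t1  = omp_get_wtime() - t0;
    printf("P=1  time=%.3fs  result=%.1f\n", t1, ref);

    /* Same input, more processors: Speedup(P) = T(1) / T(P). */
    for (int p = 2; p <= 8; p *= 2) {
        t0 = omp_get_wtime();
        double r  = dot(a, b, p);
        double tp = omp_get_wtime() - t0;
        printf("P=%d  time=%.3fs  speedup=%.2f  result=%.1f\n",
               p, tp, t1 / tp, r);
    }
    free(a);
    free(b);
    return 0;
}
```

Compile with something like gcc -O2 -fopenmp speedup.c; ideally the reported speedup approaches P, and a flattening curve signals that the machine (or the application) stops scaling.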