Module 2: "Parallel Computer Architecture: Today and Tomorrow"
  Lecture 3: "Evaluating Performance"
 

Why parallel architecture?

  • Parallelism helps performance
  • There are applications that can be parallelized easily
  • There are important applications that require an enormous amount of computation (sustained rates of 10 GFLOPS to 1 TFLOPS)
    • NASA taps SGI, Intel for supercomputers: twenty 512-processor SGI Altix systems using Itanium 2 (http://zdnet.com.com/2100-1103_2-5286156.html) [27 July 2004]
  • There are important applications that need to deliver high throughput

Why study it?

  • Parallelism is ubiquitous
    • Need to understand the design trade-offs
    • Microprocessors are now multiprocessors (more later)
    • Today a computer architect’s primary job is to figure out how to extract parallelism efficiently
  • Get involved in interesting research projects
    • Make an impact
    • Shape the future development
    • Have fun

Performance metrics

  • Need benchmark applications
    • SPLASH (Stanford ParalleL Applications for SHared memory)
    • SPEC (Standard Performance Evaluation Corp.) OMP
    • ScaLAPACK (Scalable Linear Algebra PACKage) for message-passing machines
    • TPC (Transaction Processing Performance Council) for database/transaction processing performance
    • NAS (Numerical Aerodynamic Simulation) for aerophysics applications
      • NPB 2 is a port to MPI, for message-passing machines only
    • PARKBENCH (PARallel Kernels and BENCHmarks) for message-passing only 
  • Comparing two different parallel computers
    • Execution time is the most reliable metric
    • Sometimes MFLOPS, GFLOPS, or TFLOPS ratings are used, but these can be misleading because they measure operation rate rather than time to solution (see the hypothetical example below)
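      • A hypothetical illustration (numbers invented for this note): program A brute-forces a problem with 100 MFLOP of work in 10 s (10 MFLOPS), while program B solves the same problem with a smarter algorithm using only 20 MFLOP in 4 s (5 MFLOPS); B reports half the FLOP rate of A yet finishes 2.5x sooner, so execution time ranks the two correctly while MFLOPS does not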
  • Evaluating a particular machine
    • Use speedup to gauge scalability of the machine (provided the application itself scales)
    • Speedup(P) = (uniprocessor execution time) / (execution time on P processors)
    • Normally the input data set is kept constant when measuring speedup (fixed problem size); a measurement sketch follows below
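
A minimal sketch of measuring speedup as defined above, assuming a C compiler with OpenMP support; the file name, kernel, and problem size are illustrative choices for this note, not from the lecture. It times the same fixed-size reduction on 1 thread and then on P threads, and reports Speedup(P):

    /* speedup.c -- illustrative sketch: measure
       Speedup(P) = uniprocessor time / time on P processors
       for a trivially parallel reduction over a fixed-size input.
       Build: gcc -O2 -fopenmp speedup.c -o speedup                    */
    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    #define N 20000000L   /* input size held constant across all runs */

    static double run(const double *a, int nthreads)
    {
        double sum = 0.0;
        double t0 = omp_get_wtime();
        #pragma omp parallel for reduction(+:sum) num_threads(nthreads)
        for (long i = 0; i < N; i++)
            sum += a[i] * a[i];
        double t1 = omp_get_wtime();
        printf("threads=%2d  sum=%.6f  time=%.3f s\n",
               nthreads, sum, t1 - t0);
        return t1 - t0;
    }

    int main(void)
    {
        double *a = malloc(N * sizeof *a);
        if (!a) return 1;
        for (long i = 0; i < N; i++)
            a[i] = 1.0 / (double)(i + 1);

        double t1 = run(a, 1);                    /* uniprocessor time    */
        for (int p = 2; p <= omp_get_max_threads(); p *= 2) {
            double tp = run(a, p);                /* time on P processors */
            printf("Speedup(%d) = %.2f\n", p, t1 / tp);
        }
        free(a);
        return 0;
    }

Because the input size N stays constant while the thread count varies, this measures fixed-problem-size speedup, matching the convention stated above.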