Performance

Why aren’t parallel applications scalable?

  • sequential performance
  • critical paths (dependencies between computations spread across processors)
  • bottlenecks (one processor holds things up)
  • algorithmic overhead (some things just take more effort to do in parallel)
  • communication overhead (an increasing proportion of time spent on communication)
  • load imbalance (makes all processors wait for the slowest one; dynamic behaviour)
  • speculative loss (do A and B in parallel, but B turns out not to be needed: wasted work from speculative execution)

What is the maximum parallelism possible?

  • depends on the application, algorithm, and program: the data dependencies in the execution
  • MaxPar:
    • analyzes the earliest possible “time” any data can be computed
    • assumes a simple model for the time it takes to execute an instruction or access memory
    • result is the maximum parallelism available

Embarrassingly parallel computations

  • no or very little communication between processes
  • each process can do its tasks without any interaction with other processes

Calculating π with Monte Carlo

  • place a circle of radius 1 cm inside a square box with sides of 2 cm
  • the ratio of the circle area to the square area is π · 1² / (2 · 2) = π/4
  • randomly choose a number of points in the square; for each point (x, y), determine if it is inside the circle (x² + y² ≤ 1)
  • the ratio of points in the circle to all points in the square approximates π/4, so multiplying it by 4 gives an estimate of π (see the sketch below)
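
A minimal sequential sketch in C of the method above (the sample count and the rand()-based sampling are illustrative choices, not from the notes); by symmetry it samples the unit square and quarter circle, and since every point is independent, the loop parallelizes trivially, which is what makes this embarrassingly parallel:

    #include <stdio.h>
    #include <stdlib.h>

    /* Estimate pi by sampling random points in the unit square
     * [0,1) x [0,1) and counting how many land inside the quarter
     * circle x^2 + y^2 <= 1; the hit ratio approximates pi/4. */
    int main(void)
    {
        const long n = 10000000;   /* number of sample points (illustrative) */
        long hits = 0;

        srand(42);                 /* fixed seed for reproducibility */
        for (long i = 0; i < n; i++) {
            double x = (double)rand() / RAND_MAX;
            double y = (double)rand() / RAND_MAX;
            if (x * x + y * y <= 1.0)
                hits++;
        }
        printf("pi ~ %f\n", 4.0 * hits / n);
        return 0;
    }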

Analytical / theoretical techniques

  • involves simple algebraic formulas and ratios
    • typical variables: data size (n), number of processors (p), machine constants
    • to model performance of individual operations, components, algorithms in terms of the above
      • be careful to characterize variations across processors
      • model them with max operators
    • constants are important in practice
      • use asymptotic analysis carefully
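
As a concrete instance of such a formula (a standard textbook-style model, not taken verbatim from these notes): the cost of sending an m-word message is often modeled with two machine constants, a startup latency t_s and a per-word transfer time t_w, and variation across processors in a step is captured with a max operator:

    t_comm = t_s + m · t_w
    T_step = max over processors i of (t_compute,i + t_comm,i)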

Isoefficiency

  • goal is to quantify scalability
  • how much increase in problem size is needed to retain the same efficiency on a larger machine?
  • efficiency: E = S/p = T_s/(p · T_p), where S is the speedup
  • isoefficiency: E(W, p) = constant
    • equation for equal-efficiency curves
    • if it has no solution, the problem is not scalable in the sense defined by isoefficiency
    • specifies how the problem size (W) must increase with the number of processors (p) to keep the efficiency (E) constant
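
A quick numeric check of the efficiency definition (numbers invented for illustration): with T_s = 100 and T_p = 30 on p = 4 processors, S = 100/30 ≈ 3.3 and E ≈ 0.83. If moving to p = 8 only brings T_p down to 20, then E = (100/20)/8 ≈ 0.63: efficiency drops unless the problem size grows, which is exactly what isoefficiency quantifies.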

Scalability of adding numbers

  • the scalability of a parallel system is a measure of its capacity to increase speedup with more processors
  • example: adding n numbers (n − 1 sums) on p processors with a strip partition (see the MPI sketch below)
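
A minimal MPI sketch of this computation, assuming n is divisible by the number of ranks and using MPI_Reduce (typically implemented as a logarithmic combining tree) for the final combine; the data values and sizes are illustrative:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        const long n = 1 << 20;       /* problem size; assume n divisible by p */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* strip partition: rank k owns elements [k*n/p, (k+1)*n/p) */
        long strip = n / size;
        double local = 0.0;
        for (long i = rank * strip; i < (rank + 1) * strip; i++)
            local += 1.0;             /* stand-in for the actual data values */

        /* combine the p partial sums in O(log p) steps */
        double total = 0.0;
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum = %.0f\n", total);
        MPI_Finalize();
        return 0;
    }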

Problem size and overhead

  • informally, problem size is expressed as a parameter of the input size
  • a consistent definition of the size of the problem is the total number of basic operations (W); we also refer to the problem size as work (W)
  • the overhead of a parallel system (T_o) is defined as the part of the cost not present in the best serial algorithm
  • denoted by T_o(W, p), it is a function of W and p
    • T_o(W, p) = p · T_p − W (the total parallel cost p · T_p includes the overhead)
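
Spelling this out (a standard formulation, consistent with the definitions above), and rewriting efficiency in terms of the overhead:

    p · T_p = W + T_o(W, p)        (total parallel cost = useful work + overhead)
    E = W / (p · T_p) = 1 / (1 + T_o(W, p) / W)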

Isoefficiency function

  • with fixed efficiency E, the required problem size W expressed as a function of p (derivation below)
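
Rearranging the efficiency expression above gives the isoefficiency relation explicitly (a standard derivation):

    E = 1 / (1 + T_o/W)   =>   T_o/W = (1 − E)/E
    W = K · T_o(W, p),    where K = E/(1 − E) is constant for a fixed E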

Isoefficiency function of adding numbers

  • overhead function: T_o(W, p) = p · T_p − n = 2p log p
  • isoefficiency function: W = K · 2p log p = Θ(p log p) (derivation below)
  • if p doubles, n also needs to be doubled to roughly maintain the same efficiency
  • isoefficiency functions can be more difficult to express for more complex algorithms
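
The derivation behind the adding example, assuming the usual cost model in which one addition and one communication hop each cost one unit of time:

    T_p = n/p + 2 log p        (local sums, then log p reduce steps, each with one hop and one add)
    T_o = p · T_p − n = 2p log p
    W = K · T_o   =>   n = 2K · p log p = Θ(p log p)

Doubling p therefore raises the required n by a factor of 2 log(2p)/log p, which is roughly 2 for large p.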

More complex isoefficiency functions

  • a typical overhead function can have several distinct terms of different orders of magnitude with respect to both W and p
  • we can balance W against each term of T_o and compute the respective isoefficiency function for each term (worked example below)
    • keep only the term that requires the highest growth rate with respect to p
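
A worked example with a hypothetical two-term overhead function (the specific form is illustrative, not from these notes), T_o = p^(3/2) + p^(3/4) · W^(3/4):

    balance W against the first term:   W = K · p^(3/2)                =>  Θ(p^(3/2))
    balance W against the second term:  W = K · p^(3/4) · W^(3/4)
                                        =>  W^(1/4) = K · p^(3/4)
                                        =>  W = K^4 · p^3              =>  Θ(p^3)

The second term dictates the higher growth rate, so the overall isoefficiency function is Θ(p^3).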
