Dependency Testing

Recap
  • Iteration vector
  • Distance vector
  • Direction vector
  • Loop carried dependencies
  • Loop independent dependencies

General Condition for Loop Dependency

Let α and β be iteration vectors within the iteration space of the following loop nest:

    for (int i1 = L1; i1 <= U1; i1 += S1) {
      for (int i2 = L2; i2 <= U2; i2 += S2) {
        ...
          for (int in = Ln; in <= Un; in += Sn) {
    S1:     A[f1(i1, ..., in)]...[fm(i1, ..., in)] = ...;
    S2:     ... = A[g1(i1, ..., in)]...[gm(i1, ..., in)];
          }
        ...
      }
    }

A dependence exists from S1 to S2 if and only if there exist values of α and β such that
  (1) α is lexicographically less than or equal to β, and
  (2) the following system of dependence equations is satisfied:
      fi(α) = gi(β) for all i, 1 ≤ i ≤ m
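Condition (1) can be made concrete with a small helper. The following is a minimal sketch, not part of the slides: the name lex_le is hypothetical, and it assumes positive loop steps, so that a numerically smaller index means an earlier iteration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when iteration vector alpha is lexicographically
 * less than or equal to beta. Both vectors have n components,
 * outermost loop index first. Assumes positive loop steps. */
bool lex_le(const int *alpha, const int *beta, size_t n)
{
    assert(alpha != NULL && beta != NULL);
    for (size_t k = 0; k < n; k++) {
        if (alpha[k] < beta[k]) return true;   /* first difference decides */
        if (alpha[k] > beta[k]) return false;
    }
    return true;                               /* all components are equal */
}
```

For example, (1, 5, 2) precedes (1, 6, 0) because the vectors first differ at the second component, where 5 < 6.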

Problem: Dependence Testing

Goal: prove that no dependence exists between given pairs of subscripted references to the same array variable, that is, show that the system
    fi(α) = gi(β) for all i, 1 ≤ i ≤ m
has no solution.

Assumption: array subscripts are linear expressions of the loop index variables. All subscript expressions have the form
    a1*i1 + a2*i2 + ... + an*in + e
where
  • ik is the index of the loop at nesting level k;
  • every ak, 1 ≤ k ≤ n, is an integer constant;
  • e is a loop-invariant (possibly symbolic) expression.

With linear subscripts, dependence testing amounts to deciding whether a system of linear equations has an integer solution within the loop bounds. This problem is NP-complete in general.

Dependency Testing
  - There are several dependence-based testing techniques.
  - The techniques address particular types of dependency patterns.
  - Examples: Lamport, GCD, Banerjee, I-test, power test, omega test, delta test

GCD Test

    for (int i1 = L1; i1 <= U1; i1++)
      for (int i2 = L2; i2 <= U2; i2++)
        ...
          for (int in = Ln; in <= Un; in++) {
    S1:     A[f(i1, ..., in)] = ...;
    S2:     ... = A[g(i1, ..., in)];
          }

Assume
    f(x1, x2, ..., xn) = a0 + a1*x1 + ... + an*xn
    g(y1, y2, ..., yn) = b0 + b1*y1 + ... + bn*yn

A dependence requires f(x1, ..., xn) = g(y1, ..., yn):
    a0 + a1*x1 + ... + an*xn = b0 + b1*y1 + ... + bn*yn
    a0 - b0 + a1*x1 - b1*y1 + ... + an*xn - bn*yn = 0
    a1*x1 - b1*y1 + ... + an*xn - bn*yn = b0 - a0

GCD test: the last equation has an integer solution if and only if gcd(a1, ..., an, b1, ..., bn) divides b0 - a0.
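The divisibility check can be sketched in a few lines of C. This is an illustrative sketch under stated assumptions, not the slides' own code: the names gcd_long and gcd_test are hypothetical, and the coefficients are passed as arrays a[0..n] and b[0..n] with the constant term first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Greatest common divisor of two integers; the result is non-negative. */
static long gcd_long(long x, long y)
{
    x = labs(x);
    y = labs(y);
    while (y != 0) {
        long r = x % y;
        x = y;
        y = r;
    }
    return x;
}

/* GCD test for f = a[0] + a[1]*x1 + ... + a[n]*xn and
 *              g = b[0] + b[1]*y1 + ... + b[n]*yn.
 * Returns true when a dependence is possible (an integer solution
 * exists when loop bounds are ignored), false when it is ruled out. */
bool gcd_test(const long *a, const long *b, size_t n)
{
    assert(a != NULL && b != NULL);
    long g = 0;                       /* gcd(0, x) == |x| */
    for (size_t k = 1; k <= n; k++) {
        g = gcd_long(g, a[k]);
        g = gcd_long(g, b[k]);
    }
    long rhs = b[0] - a[0];
    if (g == 0)                       /* every coefficient is zero */
        return rhs == 0;
    return rhs % g == 0;
}
```

With a = {0, 2} and b = {-1, 2} (subscripts 2*i and 2*i - 1) the gcd is 2 and b0 - a0 = -1, so the function reports that no dependence is possible.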

Example of GCD Test

If a loop-carried dependency exists between X[a*i + b] and X[c*i + d], then gcd(c, a) must divide (d - b).

    for (int i = 1; i <= n; i++) {
    S1:   a[2*i] = b[i] + c[i];
    S2:   d[i] = a[2*i-1];
    }

Are there i1 and i2 such that 1 <= i1 < i2 <= n and 2*i1 = 2*i2 - 1?
Equivalently: 2*i2 + (-2)*i1 = 1.
There is an integer solution if and only if gcd(2, -2) divides 1.
Since gcd(2, -2) = 2 does not divide 1, there is no dependence.

False Positive of GCD Test

    for (int i = L; i <= L+10; i++) {
    S1:   a[i] = b[i] + c[i];
    S2:   d[i] = a[i-100];
    }

Are there i1 and i2 such that L <= i1 < i2 <= L+10 and i1 = i2 - 100?
Equivalently: i2 - i1 = 100.
There is an integer solution if and only if gcd(1, -1) divides 100.
Answer: gcd(1, -1) = 1 divides 100, so the test reports a dependence.
But there is no actual dependence: the loop runs only 11 iterations, so i2 - i1 is at most 10. This is a false positive.
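To see the role of the loop bounds, the question "are there i1 and i2 in range" can be answered by exhaustive search for small loops. The sketch below is only an illustration of that question, not a practical dependence test; the function name carried_flow_dep_exists is hypothetical, and it handles a single loop with a write to X[a*i + b] and a read of X[c*i + d].

```c
#include <assert.h>
#include <stdbool.h>

/* Exhaustive check for  for (i = lo; i <= hi; i++)  with a write to
 * X[a*i + b] and a read of X[c*i + d]. Returns true when some pair
 * lo <= i1 < i2 <= hi satisfies a*i1 + b == c*i2 + d, i.e. a value
 * written in an earlier iteration is read in a later one. */
bool carried_flow_dep_exists(long lo, long hi, long a, long b, long c, long d)
{
    assert(lo <= hi);
    for (long i1 = lo; i1 <= hi; i1++)
        for (long i2 = i1 + 1; i2 <= hi; i2++)
            if (a * i1 + b == c * i2 + d)
                return true;
    return false;
}
```

Taking L = 0 as an assumed lower bound, carried_flow_dep_exists(0, 10, 1, 0, 1, -100) is false, whereas widening the loop to 0..200 makes it true (i1 = 0, i2 = 100). The GCD test gives the same answer in both cases because it never looks at the bounds.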

Limitations of GCD Test
  - Ignores loop bounds
  - Does not provide distance or direction information
  - If the GCD is 1, it divides every constant, so the test always reports a possible dependence (very conservative)

Delta Test

Source iteration of dependency: I
Sink iteration of dependency: I + d

    for (int i = 1; i <= N; i++)
      A[i + 1] = A[i] + B;

The element written in iteration I is A[I+1]; the element read in iteration I + d is A[I+d]. A valid dependency implies I + 1 = I + d, so d = 1.
Result: loop-carried dependence with distance vector (1) and direction vector (<).

Another Example

    for (int i = 1; i <= 100; i++)
      for (int j = 1; j <= 100; j++)
        for (int k = 1; k <= 100; k++)
          A[i+1][j][k] = A[i][j][k+1] + B;

Dependence equations, one per subscript:
    I + 1 = I + di        so di = 1
    J     = J + dj        so dj = 0
    K     = K + dk + 1    so dk = -1
  • Distance vector: (1, 0, -1)
  • Corresponding direction vector: (<, =, >)
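The per-subscript reasoning in these two examples can be sketched as code. This is a simplified illustration, assuming every subscript has the form "loop index of that level plus a constant" (as in both examples); the name delta_distance and its parameters are hypothetical, and the check that each distance fits within the loop bounds is left out.

```c
#include <assert.h>
#include <stddef.h>

/* For loop level k the written subscript is i_k + write_off[k] and the
 * read subscript is i_k + read_off[k]. Solving
 *     I + write_off[k] = (I + d) + read_off[k]
 * gives the distance d = write_off[k] - read_off[k]. The direction is
 * '<' for d > 0, '=' for d == 0 and '>' for d < 0. */
void delta_distance(const int *write_off, const int *read_off, size_t n,
                    int *dist, char *dir)
{
    assert(write_off != NULL && read_off != NULL);
    assert(dist != NULL && dir != NULL);
    for (size_t k = 0; k < n; k++) {
        dist[k] = write_off[k] - read_off[k];
        dir[k] = dist[k] > 0 ? '<' : (dist[k] < 0 ? '>' : '=');
    }
}
```

For A[i+1][j][k] = A[i][j][k+1] the write offsets are {1, 0, 0} and the read offsets are {0, 0, 1}, which yields distances (1, 0, -1) and directions (<, =, >).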

Dependency and Parallelism

It is valid to convert a sequential loop to a parallel loop if the loop carries no dependence.

Example: loop independent dependence
  - Scalar expansion
  - Privatization

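Scalar Expansion

Before:

    int a[N], b[N], t;
    for (int i = 0; i < N; i++) {
      t = a[i];
      a[i] = b[i];
      b[i] = t;
    }

After:

    int a[N], b[N], t[N];
    for (int i = 0; i < N; i++) {
      t[i] = a[i];
      a[i] = b[i];
      b[i] = t[i];
    }

  - Handles loop independent dependency: each iteration uses its own element t[i], so the dependences through the temporary stay within a single iteration.
  - Performance tradeoff: the temporary variable is converted to an array, which results in additional memory accesses.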

Restrictions
  - The number of iterations of the loop must be countable; the loop step size must not be changed in the loop body.
  - The expanded scalar must have no upward exposed uses in the loop. In the following loop, t is read before it is written within the iteration, so it cannot be expanded:

        for (int i = 0; i < N; i++) {
          print(t);
          t = a[i];
          a[i] = b[i];
          b[i] = t;
        }

  - When the scalar is live after the loop, we must move the correct array value into the scalar. A variable is live if it is used later:

        for (int i = 0; i < N; i++) {
          t = a[i];
          a[i] = b[i];
          b[i] = t;
        }
        print(t); // original value of a[N-1]
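The last restriction (copying the final value back when the scalar is live after the loop) might look as follows. This is a minimal sketch, not the slides' code: the function name swap_with_expansion is hypothetical, the expanded array is heap-allocated for simplicity, and the function returns the value the scalar t would hold after the original sequential loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Swaps a and b element-wise using an expanded temporary, then returns
 * the value the scalar t would hold after the original loop.
 * Requires n > 0. */
int swap_with_expansion(int *a, int *b, size_t n)
{
    assert(a != NULL && b != NULL && n > 0);
    int *t = malloc(n * sizeof *t);    /* expanded scalar: one slot per iteration */
    if (t == NULL)
        abort();
    for (size_t i = 0; i < n; i++) {   /* iterations no longer share a temporary */
        t[i] = a[i];
        a[i] = b[i];
        b[i] = t[i];
    }
    int last = t[n - 1];               /* copy-out: the scalar's final value */
    free(t);
    return last;
}
```

With a = {1, 2, 3} and b = {4, 5, 6}, the arrays are swapped and the returned value is 3, the original a[N-1].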

Privatization

Before:

    for (int i = 0; i < N; i++) {
      t = a[i];
      a[i] = b[i];
      b[i] = t;
    }

After:

    #pragma omp parallel for …
    for (int i = 0; i < N; i++) {
      int t = a[i];
      a[i] = b[i];
      b[i] = t;
    }

Privatization creates local copies of the variable t.

Restrictions
  - The privatized scalar must have no upward exposed uses in the loop:

        for (int i = 0; i < N; i++) {
          print(t);
          t = a[i];
          a[i] = b[i];
          b[i] = t;
        }

  - When the scalar is live after the loop, we must move the correct value (the one assigned in the last iteration) into the scalar:

        for (int i = 0; i < N; i++) {
          t = a[i];
          a[i] = b[i];
          b[i] = t;
        }
        print(t);
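One way to satisfy the live-after-loop restriction in OpenMP is the lastprivate clause, which gives each thread a private copy and copies the value from the sequentially last iteration back to the original variable. The sketch below assumes an OpenMP-capable compiler; the function name swap_with_privatization is hypothetical, and the slides do not say which clause they use. Without OpenMP enabled the pragma is ignored and the loop runs sequentially with the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Swaps a and b element-wise with a privatized temporary and returns
 * the value t holds after the loop. Requires n > 0. */
int swap_with_privatization(int *a, int *b, int n)
{
    assert(a != NULL && b != NULL && n > 0);
    int t = 0;
    /* lastprivate(t): private copy per thread; after the loop, t takes
     * the value assigned in iteration n - 1. */
    #pragma omp parallel for lastprivate(t)
    for (int i = 0; i < n; i++) {
        t = a[i];
        a[i] = b[i];
        b[i] = t;
    }
    return t;
}
```

With a = {1, 2, 3} and b = {4, 5, 6}, the function returns 3, matching what the sequential loop leaves in t.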

Summary: Dependency and Parallelism
  - It is valid to convert a sequential loop to a parallel loop if the loop carries no dependence.
  - Automatic parallelization is possible to some extent; compilers perform it in certain cases.
  - Additional transformations may be required to get parallelizable loops.

References
  § Chapter 3 (relevant parts for dependency testing techniques)
  § Chapter 6.2 and 6.3 (privatization and scalar expansion)
Randy Allen and Ken Kennedy, Optimizing Compilers for Modern Architectures: A Dependence-Based Approach