Open64 Loop Induction Variable Canonicalization

Outline
• Motivation
• Background: Open64 Compilation Scheme
• Loop Induction Variable Canonicalization
• Project
• Tracing and WHIRL Specification
• Loops
• References

3/27/2008 Open64 Copyright © 2008 - Juergen Ributzka. All rights reserved.

Motivation
How to copy one array to another array?

Motivation
One simple problem, many different solutions:

    for (int i = 0; i < SIZE; i++) {
        p[i] = q[i];
    }

    int i = 0;
    while (i < SIZE) {
        p[i] = q[i];
        i = i + 1;
    }

    int *end = p + SIZE;    /* fixed end pointer; p itself moves */
    while (p < end) {
        *p++ = *q++;
    }

    int i = 1;
    if (i <= SIZE) {
        do {
            p[i-1] = q[i-1];
        } while (i++ < SIZE);
    }

Motivation
Compilers prefer code that is easy to analyze:

    int i = 0;
    while (i < SIZE) {
        p[i] = q[i];
        i = i + 1;
    }

Users want high-performance code:

    int *end = p + SIZE;    /* fixed end pointer; p itself moves */
    while (p < end) {
        *p++ = *q++;
    }

[Diagram labels: Compiler Transformation, Compiler Optimization]

Motivation
• Just one induction variable
  – starting at 0
  – stride of 1
• Unified loop representation

    int iv = 0;
    while (iv <= SIZE-1) {
        p[iv] = q[iv];
        iv = iv + 1;
    }

Background: Open64 Compilation Scheme
    Front End:                       C/C++ FE, Fortran FE, Java FE
    Loop Nest Optimizer (optional):  Preopt, LNO
    Global Optimizer:                Preopt, WOPT
    Code Generation:                 CG
[Diagram: IVR is labeled at the Preopt phase.]

Loop Induction Variable Canonicalization
• Step 1: Induction Variable Injection
• Step 2: Inserting φ's and Identity Assignments
• Step 3: Renaming
• Step 4: Induction Variable Analysis and Processing
• Step 5: Copy Propagation and Expression Simplification
• Step 6: Dead Store Elimination

Step 1: Induction Variable Injection
• At this point we only have DO and WHILE loops
  – GOTO statements have been transformed to WHILE loops
• Loops are annotated with details of the high-level loop construct
• Inject a unit-stride induction variable into
  – Non-unit-stride DO loops
  – All WHILE loops

Before:
    p = &a[0];
    while (p <= &a[99]) {
        *p = 0;
        p = p + 1;
    }

After:
    p = &a[0];
    iv = 0;
    while (p <= &a[99]) {
        *p = 0;
        p = p + 1;
        iv = iv + 1;
    }

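The effect of the injection can be checked with a small C sketch. It is only an illustration of the "After" loop above, not Open64 code: the function names and the driver are hypothetical, and the array length is passed in rather than fixed at 100. The injected counter starts at 0, advances by 1 per iteration, and therefore ends up equal to the trip count.

```c
#include <assert.h>
#include <stddef.h>

/* Zero n ints with the pointer loop from the slide, plus an injected
 * unit-stride counter iv. Returns the final value of iv. */
size_t zero_with_injected_iv(int *a, size_t n)
{
    assert(a != NULL && n > 0);
    int *p = &a[0];
    int *last = &a[n - 1];
    size_t iv = 0;              /* injected: starts at 0 */
    while (p <= last) {
        *p = 0;
        p = p + 1;
        iv = iv + 1;            /* injected: stride of 1 */
    }
    return iv;
}

/* Hypothetical driver for the 100-element array used on the slide. */
size_t injected_iv_demo(void)
{
    int a[100];
    for (size_t k = 0; k < 100; k++)
        a[k] = 1;
    size_t iv = zero_with_injected_iv(a, 100);
    for (size_t k = 0; k < 100; k++)
        assert(a[k] == 0);      /* the original loop's effect is unchanged */
    return iv;
}
```

Calling injected_iv_demo() returns 100, the number of iterations of the original pointer loop.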
Step 2: Inserting φ's and Identity Assignments
Before: the loop produced by Step 1.
After (insert φ's):
    entry:   p ← &a[0]
             iv ← 0
    header:  iv ← φ(iv, iv)
             p ← φ(p, p)
             p ≤ &a[99] ?
    body:    *p ← 0
             p ← p + 4
             iv ← iv + 1
    …
    exit:    iv ← iv
             p ← p

Step 3: Renaming
Before:
    entry:   p ← &a[0]
             iv ← 0
    header:  iv ← φ(iv, iv)
             p ← φ(p, p)
             p ≤ &a[99] ?
    body:    *p ← 0
             p ← p + 4
             iv ← iv + 1
    exit:    iv ← iv
             p ← p
After (rename variables):
    entry:   p1 ← &a[0]
             iv1 ← 0
    header:  iv2 ← φ(iv1, iv3)
             p2 ← φ(p1, p3)
             p2 ≤ &a[99] ?
    body:    *p2 ← 0
             p3 ← p2 + 4
             iv3 ← iv2 + 1
    exit:    iv4 ← iv2
             p4 ← p2

Step 4: Induction Variable Analysis and Processing
• Process the φ list at the beginning of the loop
• One operand must correspond to the initial value
• The other must be defined in the loop
• Initialize the symbolic expression tree with this operand
• Recursively resolve the variables in the expression tree that are not defined by a φ node (a φ-defined variable is resolved as well if both φ operands are the same)
• All variables in the symbolic expression tree must now be loop invariant or the result of a φ
• i2 is an induction variable if the expression tree is of the form i2 ± <expr>, where i2 is a φ result

Step 4: Induction Variable Analysis and Processing
Example:
    header:  i2 ← φ(i1, i3)
             j2 ← φ(j1, j3)
             i2 ≤ 100 ?
    body:    j3 ← i2 + 3
             …
             i3 ← j3 + 2
             …
• i1 and j1 are initial values
Expression tree:
    i2 ← i3
    i2 ← j3 + 2
    i2 ← i2 + 5    (found IV)
    j2 ← j3
    j2 ← i2 + 3    (can't resolve i2)

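The resolution of the example above can be sketched in C. This is a simplified illustration under stated assumptions, not the Open64 implementation: the struct def table, the enum names and find_iv_step are hypothetical, every in-loop definition is restricted to the form x ← y + constant, and the case of a φ whose two operands are the same is not handled.

```c
#include <assert.h>

/* Hypothetical miniature IR: a variable is either an initial value
 * (DEF_NONE), a phi at the loop header, or "src + k". */
enum def_kind { DEF_NONE, DEF_PHI, DEF_ADD };

struct def {
    enum def_kind kind;
    int init;   /* DEF_PHI: operand holding the initial value */
    int back;   /* DEF_PHI: operand defined inside the loop   */
    int src;    /* DEF_ADD: variable operand                  */
    int k;      /* DEF_ADD: constant operand                  */
};

enum { I1, I2, I3, J1, J2, J3, NVARS };

static const struct def example_defs[NVARS] = {
    [I1] = { DEF_NONE, 0,  0,  0,  0 },
    [I2] = { DEF_PHI,  I1, I3, 0,  0 },   /* i2 <- phi(i1, i3) */
    [I3] = { DEF_ADD,  0,  0,  J3, 2 },   /* i3 <- j3 + 2      */
    [J1] = { DEF_NONE, 0,  0,  0,  0 },
    [J2] = { DEF_PHI,  J1, J3, 0,  0 },   /* j2 <- phi(j1, j3) */
    [J3] = { DEF_ADD,  0,  0,  I2, 3 },   /* j3 <- i2 + 3      */
};

/* Start from the in-loop phi operand and substitute definitions until a
 * phi result is reached. Returns 1 and stores the step if the chain ends
 * at phi_var itself (form phi_var + step), otherwise returns 0. */
int find_iv_step(const struct def *defs, int nvars, int phi_var, int *step)
{
    assert(defs[phi_var].kind == DEF_PHI);
    int cur = defs[phi_var].back;
    int sum = 0;
    for (int hops = 0; hops < nvars; hops++) {
        if (defs[cur].kind == DEF_PHI) {
            if (cur != phi_var)
                return 0;           /* depends on another phi result */
            *step = sum;
            return 1;
        }
        if (defs[cur].kind != DEF_ADD)
            return 0;               /* reached an initial value */
        sum += defs[cur].k;
        cur = defs[cur].src;
    }
    return 0;
}

/* Step of the example variable, or 0 if it is not recognized here. */
int example_step(int phi_var)
{
    int step = 0;
    return find_iv_step(example_defs, NVARS, phi_var, &step) ? step : 0;
}
```

For the example, example_step(I2) yields 5 (i2 ← i2 + 5), while example_step(J2) yields 0 because the chain for j2 stops at the unresolved φ result i2.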
Step 4: Induction Variable Analysis and Processing
Example:
    header:  i2 ← φ(i1, i5)
             i2 ≤ 100 ?
    branch:  i2 < x ?
    then:    …
             i3 ← i2 + 1
    else:    …
             i4 ← i2 + 1
    join:    i5 ← φ(i3, i4)    (i3 = i4 ?)
• i1 is the initial value
Expression tree:
    i2 ← i5
    i2 ← φ(i3, i4)
    i2 ← i3
    i2 ← i2 + 1    (found IV)
    …

Step 4: Induction Variable Analysis and Processing
• Select the primary induction variable
• Compute the trip count
• Exit values
    s_exit ← s_init + <tripcount> × s_step
• Define secondary induction variables (s) in terms of the primary induction variable (p)
    s ← s_init + (p – p_init) × s_step

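The two formulas above can be checked against a running loop. The following C sketch is an illustration only; the function names are hypothetical and plain long arithmetic stands in for addresses. With s_init = 0, s_step = 4 and a trip count of 100 it reproduces the byte offset of &a[100] used as the exit value of p in the running example.

```c
#include <assert.h>

/* s <- s_init + (p - p_init) x s_step */
long secondary_from_primary(long s_init, long s_step, long p, long p_init)
{
    return s_init + (p - p_init) * s_step;
}

/* s_exit <- s_init + <tripcount> x s_step */
long exit_value(long s_init, long s_step, long trip_count)
{
    return s_init + trip_count * s_step;
}

/* Run the loop with a separately updated secondary variable s and a
 * primary variable p that starts at 0 with stride 1. Returns 1 if the
 * closed forms agree with s in every iteration and at loop exit. */
int closed_forms_match(long s_init, long s_step, long trip_count)
{
    assert(trip_count >= 0);
    long s = s_init;
    long p = 0;
    while (p < trip_count) {
        if (s != secondary_from_primary(s_init, s_step, p, 0))
            return 0;
        s += s_step;
        p += 1;
    }
    return s == exit_value(s_init, s_step, trip_count);
}
```

Once the closed form is available, the separate update of s inside the loop is no longer needed, which is what lets the later steps remove it.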
Step 4: Induction Variable Analysis and Processing
Before:
    entry:   p1 ← &a[0]
             iv1 ← 0
    header:  iv2 ← φ(iv1, iv3)
             p2 ← φ(p1, p3)
             p2 ≤ &a[99] ?
    body:    *p2 ← 0
             p3 ← p2 + 4
             iv3 ← iv2 + 1
    exit:    iv4 ← iv2
             p4 ← p2
After (add exit values and replace φ's):
    entry:   p1 ← &a[0]
             iv1 ← 0
    header:  iv2 ← φ(iv1, iv3)
             p2 ← &a[0] + (iv2 - 0) × 4
             p2 ≤ &a[99] ?
    body:    *p2 ← 0
             p3 ← p2 + 4
             iv3 ← iv2 + 1
    exit:    iv4 ← 100
             p4 ← &a[100]

Step 5: Copy Propagation and Expression Simplification
• Preorder traversal of the dominator tree
• If a use of x1 is defined by an assignment of the form x1 ← <expr>, substitute <expr> for that use
• Example:
Before:
    x1 ← i1 + j1
    y2 ← x1 – y1
    x2 ← y2 + z3
After:
    x1 ← i1 + j1
    y2 ← i1 + j1 – y1
    x2 ← i1 + j1 – y1 + z3

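The substitution in the example above can be sketched on strings. This is a toy illustration, not the Open64 implementation: the struct stmt representation and all function names are hypothetical, the code is straight-line (so program order stands in for the dominator-tree preorder), tokens are separated by single spaces, and every substituted expression is parenthesized to keep operator grouping explicit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct stmt {
    const char *dest;   /* SSA name being defined                     */
    const char *expr;   /* right-hand side, tokens separated by ' '   */
};

/* Append len bytes of s to the NUL-terminated buffer out. */
static void append(char *out, size_t outsz, const char *s, size_t len)
{
    size_t used = strlen(out);
    assert(used + len < outsz);
    memcpy(out + used, s, len);
    out[used + len] = '\0';
}

/* Write the right-hand side of stmts[idx] to out, replacing each use of
 * a name defined by an earlier statement with that statement's
 * (recursively expanded) expression. */
static void expand(const struct stmt *stmts, int idx, char *out, size_t outsz)
{
    const char *s = stmts[idx].expr;
    while (*s != '\0') {
        size_t len = strcspn(s, " ");           /* next token */
        int def = -1;
        for (int j = 0; j < idx; j++) {
            if (strlen(stmts[j].dest) == len &&
                strncmp(stmts[j].dest, s, len) == 0)
                def = j;                        /* unique in SSA form */
        }
        if (def >= 0) {
            append(out, outsz, "(", 1);
            expand(stmts, def, out, outsz);
            append(out, outsz, ")", 1);
        } else {
            append(out, outsz, s, len);
        }
        s += len;
        size_t gap = strspn(s, " ");            /* copy separators */
        append(out, outsz, s, gap);
        s += gap;
    }
}

/* Propagate into x2 for the three statements of the slide example. */
const char *propagate_example(void)
{
    static const struct stmt stmts[] = {
        { "x1", "i1 + j1" },
        { "y2", "x1 - y1" },
        { "x2", "y2 + z3" },
    };
    static char out[128];
    out[0] = '\0';
    expand(stmts, 2, out, sizeof out);
    return out;
}
```

propagate_example() produces "((i1 + j1) - y1) + z3", which is the slide's result for x2 with the grouping written out.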
Step 5: Copy Propagation and Expression Simplification
Before:
    entry:   p1 ← &a[0]
             iv1 ← 0
    header:  iv2 ← φ(iv1, iv3)
             p2 ← &a[0] + (iv2 - 0) × 4
             p2 ≤ &a[99] ?
    body:    *p2 ← 0
             p3 ← p2 + 4
             iv3 ← iv2 + 1
    exit:    iv4 ← 100
             p4 ← &a[100]
After (copy propagation):
    entry:   p1 ← &a[0]
             iv1 ← 0
    header:  iv2 ← φ(iv1, iv3)
             p2 ← &a[0] + (iv2 - 0) × 4
             (&a[0] + (iv2 - 0) × 4) ≤ &a[99] ?
    body:    *(&a[0] + (iv2 - 0) × 4) ← 0
             p3 ← &a[0] + (iv2 - 0) × 4 + 4
             iv3 ← iv2 + 1
    exit:    iv4 ← 100
             p4 ← &a[100]

Step 5: Copy Propagation and Expression Simplification
Before:
    entry:   p1 ← &a[0]
             iv1 ← 0
    header:  iv2 ← φ(iv1, iv3)
             p2 ← &a[0] + (iv2 - 0) × 4
             (&a[0] + (iv2 - 0) × 4) ≤ &a[99] ?
    body:    *(&a[0] + (iv2 - 0) × 4) ← 0
             p3 ← &a[0] + (iv2 - 0) × 4 + 4
             iv3 ← iv2 + 1
    exit:    iv4 ← 100
             p4 ← &a[100]
After (simplification):
    entry:   p1 ← &a[0]
             iv1 ← 0
    header:  iv2 ← φ(iv1, iv3)
             p2 ← &a[iv2]
             iv2 ≤ 99 ?
    body:    a[iv2] ← 0
             p3 ← &a[iv2] + 4
             iv3 ← iv2 + 1
    exit:    iv4 ← 100
             p4 ← &a[100]

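The two simplifications above rest on address identities that can be checked directly in C. The sketch below is an illustration with hypothetical function names; it uses byte pointers and sizeof(int) where the slides assume a 4-byte int.

```c
#include <assert.h>
#include <stddef.h>

static int a[100];

/* &a[0] + (iv - 0) x sizeof(int), in bytes, is the address &a[iv]. */
int address_forms_agree(size_t iv)
{
    assert(iv < 100);
    char *base = (char *)&a[0];
    char *expanded = base + (iv - 0) * sizeof(int);
    return expanded == (char *)&a[iv];
}

/* (&a[0] + iv x sizeof(int)) <= &a[99] holds exactly when iv <= 99.
 * iv == 100 is allowed: it forms the one-past-the-end address. */
int bound_forms_agree(size_t iv)
{
    assert(iv <= 100);
    char *base = (char *)&a[0];
    int pointer_test = (base + iv * sizeof(int)) <= (char *)&a[99];
    int index_test = iv <= 99;
    return pointer_test == index_test;
}
```

Both functions return 1 for every valid iv, which is why the loop test can be rewritten on the induction variable alone.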
Step 6: Dead Store Elimination
• Mark all statements dead, except
  – I/O statements
  – return statements
  – procedure calls
  – statements with side effects (e.g. changes to memory)
• Propagate liveness to the rest of the program
  – for each variable used in a live statement, mark its defining statement alive
  – mark alive the conditional branch on which the statement depends
• Remove the statements that have not been marked alive

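The marking phase above can be sketched as a fixed-point iteration over a statement table. This is a simplified illustration, not the Open64 implementation: struct stmt, mark_live and the example table are hypothetical, each statement lists at most two defining statements, and the loop's conditional branch is simply treated as a root instead of being marked through control dependence.

```c
#include <assert.h>

#define MAX_USES 2

struct stmt {
    const char *text;       /* for readability only                    */
    int is_root;            /* I/O, return, call, store, or branch     */
    int uses[MAX_USES];     /* indices of defining statements, or -1   */
    int live;
};

/* Mark roots alive, then propagate liveness to the defining statements
 * of every variable used in a live statement until nothing changes. */
void mark_live(struct stmt *s, int n)
{
    assert(s != 0 && n >= 0);
    for (int i = 0; i < n; i++)
        s[i].live = s[i].is_root;
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < n; i++) {
            if (!s[i].live)
                continue;
            for (int u = 0; u < MAX_USES; u++) {
                int d = s[i].uses[u];
                if (d >= 0 && !s[d].live) {
                    s[d].live = 1;
                    changed = 1;
                }
            }
        }
    }
}

/* The running example after simplification; returns the live count. */
int count_live_in_example(void)
{
    struct stmt s[] = {
        /* 0 */ { "p1 <- &a[0]",          0, { -1, -1 }, 0 },
        /* 1 */ { "iv1 <- 0",             0, { -1, -1 }, 0 },
        /* 2 */ { "iv2 <- phi(iv1, iv3)", 0, {  1,  6 }, 0 },
        /* 3 */ { "p2 <- &a[iv2]",        0, {  2, -1 }, 0 },
        /* 4 */ { "iv2 <= 99 ?",          1, {  2, -1 }, 0 },
        /* 5 */ { "a[iv2] <- 0",          1, {  2, -1 }, 0 },
        /* 6 */ { "iv3 <- iv2 + 1",       0, {  2, -1 }, 0 },
        /* 7 */ { "p3 <- &a[iv2] + 4",    0, {  2, -1 }, 0 },
        /* 8 */ { "iv4 <- 100",           0, { -1, -1 }, 0 },
        /* 9 */ { "p4 <- &a[100]",        0, { -1, -1 }, 0 },
    };
    int n = (int)(sizeof s / sizeof s[0]);
    mark_live(s, n);
    int live = 0;
    for (int i = 0; i < n; i++)
        live += s[i].live;
    return live;
}
```

Under these assumptions five statements stay alive: the store, the branch, and the three definitions of iv that feed them. The statements defining p1, p2, p3, p4 and iv4 are never reached by the propagation and would be removed.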
Step 6: Dead Store Elimination
Before:
    entry:   p1 ← &a[0]
             iv1 ← 0
    header:  iv2 ← φ(iv1, iv3)
             p2 ← &a[iv2]
             iv2 ≤ 99 ?
    body:    a[iv2] ← 0
             p3 ← &a[iv2] + 4
             iv3 ← iv2 + 1
    exit:    iv4 ← 100
             p4 ← &a[100]
After (dead store elimination, assuming p and iv are not used after the loop):
    entry:   iv1 ← 0
    header:  iv2 ← φ(iv1, iv3)
             iv2 ≤ 99 ?
    body:    a[iv2] ← 0
             iv3 ← iv2 + 1

Project/Homework
• Given a loop, trace the intermediate representation (WHIRL) of the Open64 compiler as explained in the next slides. Create a CFG for each trace and explain what changed between consecutive traces. The behavior exposed by your traces will differ in certain aspects from the one shown in this presentation, since Open64 has evolved over time.
• Is the result optimal?
• What could be improved?
• Extra credit: Explain how the behavior has changed.

Tracing and WHIRL Specification
• After the Front End
    opencc -c -O3 -show -keep loop1.c
    ir_b2a loop1.B > loop1.t
• After HSSA creation
    opencc -c -O3 -Wb,-tt25:0x0100 PHASE:w=off filename.c
    (this will give you the trace before and after IVR)
• After Induction Variable Recognition
    opencc -c -O3 -Wb,-tt25:0x0100 PHASE:w=off filename.c
    (this will give you the trace before and after IVR)

Tracing and WHIRL Specification
• After Copy Propagation
    opencc -c -O3 -Wb,-tt25:0x0020 PHASE:w=off filename.c
• After Boolean Simplification
    opencc -c -O3 -Wb,-tt26:0x0004 PHASE:w=off filename.c
• After Dead Code Elimination
    opencc -c -O3 -Wb,-tt25:0x0080 PHASE:w=off filename.c
• After each step you will find the trace in filename.t

Tracing and WHIRL Specification
Example C code:

    int foo (int *p, int size)
    {
        int sum = 0;
        int i;

        for (i = 0; i < size; i++) {
            sum += p[i];
        }

        return sum;
    }

Tracing and WHIRL Specification
WHIRL:
    FUNC_ENTRY <1,20,foo>
     IDNAME 0 <2,1,p>
     IDNAME 0 <2,2,size>
    BODY
     BLOCK
     END_BLOCK
     PRAGMA 0 120 <null-st> 0 (0x0) # PREAMBLE_END

Tracing and WHIRL Specification
    LOC 1 4 int sum = 0;
      I4INTCONST 0 (0x0)
     I4STID 0 <2,3,sum> T<4,.predef_I4,4>
    LOC 1 5 int i;
    LOC 1 6
    LOC 1 7 for (i=0; i<size; i++) {
      I4INTCONST 0 (0x0)
     I4STID 0 <2,4,i> T<4,.predef_I4,4>
     WHILE_DO
       I4I4LDID 0 <2,2,size> T<4,.predef_I4,4>
       I4I4LDID 0 <2,4,i> T<4,.predef_I4,4>
      I4I4GT

Tracing and WHIRL Specification
     BODY
      BLOCK
      LOC 1 8 sum += p[i];
          U8U8LDID 0 <2,1,p> T<28,anon_ptr.,8>
            I8I4LDID 0 <2,4,i> T<9,.predef_U8,8>
           U8I8CVT
           U8INTCONST 4 (0x4)
          U8MPY
         U8ADD
        I4I4ILOAD 0 T<4,.predef_I4,4> T<28,anon_ptr.,8>
        I4I4LDID 0 <2,3,sum> T<4,.predef_I4,4>
       I4ADD
      I4STID 0 <2,3,sum> T<4,.predef_I4,4>
      LOC 1 7
      LABEL L1 0
        I4I4LDID 0 <2,4,i> T<4,.predef_I4,4>
        I4INTCONST 1 (0x1)
       I4ADD
      I4STID 0 <2,4,i> T<4,.predef_I4,4>
      END_BLOCK
[Expression tree for the statement: sum = load(p + 4 * convert(i)) + sum]

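Read bottom-up, the WHIRL operators for sum += p[i] spell out the address arithmetic that the C source leaves implicit. The following C sketch mirrors that tree one operator per line. It is an illustration only: the function names are hypothetical, fixed-width types stand in for the WHIRL I4/I8/U8 types, and the mapping in the comments is a reading of the trace above rather than compiler output.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One evaluation of the tree for "sum += p[i]"; returns the new sum. */
int32_t whirl_style_add(const int32_t *p, int32_t i, int32_t sum)
{
    assert(p != 0 && i >= 0);
    int64_t  i_wide = i;                          /* I8I4LDID i          */
    uint64_t index  = (uint64_t)i_wide;           /* U8I8CVT             */
    uint64_t offset = index * 4u;                 /* U8MPY by INTCONST 4 */
    const unsigned char *addr =
        (const unsigned char *)p + offset;        /* U8ADD with LDID p   */
    int32_t loaded;
    memcpy(&loaded, addr, sizeof loaded);         /* I4I4ILOAD           */
    return loaded + sum;                          /* I4ADD, then STID    */
}

/* Hypothetical driver: adds element 2 of a small array to 5. */
int32_t whirl_demo(void)
{
    int32_t v[3] = { 10, 20, 30 };
    return whirl_style_add(v, 2, 5);
}
```

whirl_demo() returns 35, the same value as 5 + v[2] in plain C.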
Tracing and WHIRL Specification
    LOC 1 9 }
    LOC 1 10
    LOC 1 11 return sum;
      I4I4LDID 0 <2,3,sum> T<4,.predef_I4,4>
     I4RETURN_VAL
     END_BLOCK

Loop 1

    int loop1 (int *p, int size)
    {
        int i = 0;

        while (i < size) {
            i = i + 3;
            p[i] = 0;
            i = i + 1;
        }
        return 0;
    }

Loop 2

    int loop2 (int *p, int *q, int size)
    {
        int i;

        for (i=0; i != size; i++) {
            *p = *q;
            p = p + 2;
            q = q + 3;
        }
        return 0;
    }

Loop 3

    int loop3 (int *p, int *q, int size)
    {
        int i = 0;

        while (i < size) {
            int j = i + 1;
            p[j] = 0;
            i = j + 3;
            q[i] = 1;
        }
        return 0;
    }

Loop 4

    int loop4 (int *a, int size)
    {
        int *p = a;
        int *q = &a[size];

        while (p != q) {
            *(++p) = 0;
        }
        return 0;
    }

Loop 5

    int loop5 (int *a, int size)
    {
        int i = 0;

        while (i++ < size) {
            a[i] = 0;
        }
        return 0;
    }

Loop 6

    int loop6 (int *a, int size, int t)
    {
        int i = 0;
        int sum = 0;

        while (i < size) {
            if (a[i] < t) {
                i = i + 1;
                continue;
            }
            sum += a[i];
            i = i + 1;
        }
        return sum;
    }

Loop 7

    int loop7 (int *a, int size)
    {
        int i, j;
        int sum = 0;
        int k = 0;

        for (i = 0; i < size; i++) {
            for (j = 0; j < size; j++) {
                sum += a[k];
                k = k + 1;
            }
        }
        return sum;
    }

Acknowledgments
• Dr. Fred Chow (PathScale, LLC)
• Dr. Handong Ye (CAPSL)

References
• Shin-Ming Liu, Raymond Lo and Fred Chow, "Loop Induction Variable Canonicalization in Parallelizing Compilers"
• WHIRL Intermediate Language Specification (http://www.open64.net/documentation/manuals.html)
• How to Debug Open64 (Open64/doc/HOW-TO-DEBUG-OPEN64)