Copy Propagation
• What does it mean?
  – Given an assignment x = y, replace later uses of x with uses of y, provided there are no intervening assignments to x or y.
    • Similar to register coalescing, which eliminates copies from one register to another.
• When is it performed?
  – At any level, but usually early in the optimization process.
• What is the result?
  – Smaller code

Copy Propagation
• Local copy propagation
  – Performed within basic blocks
  – Algorithm sketch:
    • traverse the basic block from top to bottom
    • maintain a table of the copies encountered so far
    • modify applicable instructions as you go

Copy Propagation
• Algorithm sketch for a basic block containing instructions i1, i2, ..., in

    for instr = i1 to in
        if instr is of the form 'res = opd1 op opd2'
            replace(opd1, instr); replace(opd2, instr)    /* try both operands, no short-circuit */
        if instr is of the form 'res = var'
            replace(var, instr)                           /* replaces var with var2 */
        /* update the table, whether or not a replacement was made */
        if the table contains any pairs involving res, remove them
        if instr is of the form 'res = var'
            insert (res, var2) in the table               /* var2 is var itself if nothing was replaced */
    endfor

    replace(opd, instr)
        if you find (opd, x) in table                     /* use hashing for faster access */
            replace the use of opd in instr with x
            return true
        return false

Copy Propagation
Example: Local copy propagation on basic block:

    step   instruction   updated instruction   table contents
    1      b=a           b=a                   {(b, a)}
    2      c=b+1         c=a+1                 {(b, a)}
    3      d=b           d=a                   {(b, a), (d, a)}
    4      b=d+c         b=a+c                 {(d, a)}
    5      b=d           b=a                   {(d, a), (b, a)}

Note: if there were a definition of 'a' between 3 and 4, then we would have to remove (b, a) and (d, a) from the table. As a result, we wouldn't be able to perform local copy propagation at instructions 4 and 5. However, this will be taken care of when we perform global copy propagation.
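
The local algorithm can be tried out on the example block with a short Python sketch. This is only an illustration under stated assumptions: instructions are plain 'res=expr' strings, operands are found with a regular expression, and the names local_copy_propagation, VAR, and copies are hypothetical, not part of the slides.

```python
import re

# Assumption: a variable is a letter or underscore followed by word characters.
VAR = re.compile(r"[A-Za-z_]\w*")

def local_copy_propagation(block):
    """Propagate copies through one basic block of 'res=expr' strings."""
    copies = {}            # copy target -> source, e.g. {'b': 'a'}
    result, tables = [], []
    for instr in block:
        res, expr = (s.strip() for s in instr.split("=", 1))
        # Replace every operand that is currently recorded as a copy.
        expr = VAR.sub(lambda m: copies.get(m.group(0), m.group(0)), expr)
        # res is redefined: drop every pair that involves it.
        copies = {x: y for x, y in copies.items() if res not in (x, y)}
        # A plain 'res=var' instruction records a new copy.
        if VAR.fullmatch(expr) and expr != res:
            copies[res] = expr
        result.append(f"{res}={expr}")
        tables.append(dict(copies))
    return result, tables

block = ["b=a", "c=b+1", "d=b", "b=d+c", "b=d"]
updated, tables = local_copy_propagation(block)
for instr, table in zip(updated, tables):
    print(instr, sorted(table.items()))
```

A dictionary plays the role of the hashed table mentioned in the sketch; removing the pairs that involve res is a linear scan here, which is acceptable for a single basic block.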

Copy Propagation
• Global copy propagation
  – Performed on flow graph.
  – Given copy statement x=y and use w=x, we can replace w=x with w=y only if the following conditions are met:
    • x=y must be the only definition of x reaching w=x
      – This can be determined through ud-chains
    • there may be no definitions of y on any path from x=y to w=x.
      – Use iterative data flow analysis to solve this.
        » Even better, use iterative data flow analysis to solve both problems at the same time.

Copy Propagation
• Data flow analysis to determine which instructions are candidates for global copy propagation
  – forward direction
  – gen[Bi] = {(x, y, i, p) | p is the position of x=y in block Bi and neither x nor y is assigned a value after p}
  – kill[Bi] = {(x, y, j, p) | x=y, located at position p in block Bj ≠ Bi, is killed due to a definition of x or y in Bi}
  – in[B] = ∩ out[P], where P ranges over the predecessors of B
  – Initialize in[B1] = ∅, in[B] = U for B ≠ B1 (U is the set of all copy tuples)
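
The equations above can be iterated to a fixed point roughly as follows. This is a sketch under assumptions that are not in the slides: instructions are tuples (dst, op, args) with op == "copy" marking a copy, every block other than the entry has at least one predecessor, and the transfer function is the usual forward form out[B] = gen[B] ∪ (in[B] − kill[B]). The function name and the small four-block graph are hypothetical.

```python
def available_copies(blocks, preds, entry):
    """blocks: {name: [(dst, op, args), ...]}; preds: {name: [predecessor names]}."""
    # U: every copy in the program, as a tuple (x, y, block, position).
    universe = {(dst, args[0], b, p)
                for b, code in blocks.items()
                for p, (dst, op, args) in enumerate(code) if op == "copy"}
    gen, kill = {}, {}
    for b, code in blocks.items():
        defs = {dst for dst, _, _ in code}
        # gen: copies in b whose x and y are not assigned later in b.
        gen[b] = {(x, y, blk, p) for (x, y, blk, p) in universe
                  if blk == b and not any(d in (x, y) for d, _, _ in code[p + 1:])}
        # kill: copies from other blocks whose x or y is defined in b.
        kill[b] = {(x, y, blk, p) for (x, y, blk, p) in universe
                   if blk != b and (x in defs or y in defs)}
    in_ = {b: set() if b == entry else set(universe) for b in blocks}
    out = {b: gen[b] | (in_[b] - kill[b]) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b != entry:
                # A copy is available only if it arrives along every predecessor.
                in_[b] = set.intersection(*(out[p] for p in preds[b]))
            new_out = gen[b] | (in_[b] - kill[b])
            if new_out != out[b]:
                out[b], changed = new_out, True
    return in_, out

# Hypothetical diamond-shaped flow graph: B1 -> B2, B1 -> B3, B2 -> B4, B3 -> B4.
blocks = {
    "B1": [("c", "+", ("a", "b")), ("d", "copy", ("c",))],
    "B2": [("e", "*", ("d", "d"))],
    "B3": [("c", "+", ("c", "1"))],      # redefines c, which kills d=c
    "B4": [("f", "+", ("d", "e"))],
}
preds = {"B1": [], "B2": ["B1"], "B3": ["B1"], "B4": ["B2", "B3"]}
in_sets, out_sets = available_copies(blocks, preds, "B1")
print(in_sets["B2"], in_sets["B4"])
```

In this graph the copy d=c reaches B2, so the use of d there may be replaced by c, but it does not reach B4 because the path through B3 redefines c.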

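Copy Propagation

[Figure: flow graph from entry to exit, before and after global copy propagation, annotated "dead code?". The edges between blocks are not recoverable from the extracted text; the statements are listed in extraction order.]

    before      after
    c=a+b       c=a+b
    d=c         d=c
    e=d*d       e=c*c
    f=a+c       f=a+c
    g=e         g=e
    a=g+d       a=e+c
    a<c         a<c
    f=d-g       f=c-e
    f>a         f>a
    h=g+1       h=e+1
    b=g*a       b=e*a
    h<f         h<f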

Copy Propagation
• Copy propagation will not detect the opportunity to replace x with y in the last block below:

  [Figure: flow graph with blocks z>0, x=y, and w=x+z]

  Mini quiz: which optimization can handle this?
  Answer: If we perform an optimization similar to code hoisting (i.e. one that would move the copy either up or down the graph), then copy propagation will be able to update "w=x+z" to "w=y+z".
• Copy propagation may generate code that does not need to be evaluated any longer.
  – This will be handled by optimizations that perform redundancy elimination.

Constant Propagation
• What does it mean?
  – Given an assignment x = c, where c is a constant, replace later uses of x with uses of c, provided there are no intervening assignments to x.
    • Similar to copy propagation
    • Extra feature: It can analyze constant-value conditionals to determine whether a branch should be executed or not.
• When is it performed?
  – Early in the optimization process.
• What is the result?
  – Smaller code
  – Fewer registers
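
A minimal sketch of constant propagation with folding inside one basic block is given below. It rests on assumptions that are not in the slides: instructions are tuples (dst, op, a, b) with op set to None for a plain assignment, operands are either Python integers or variable names, and only the four operators in OPS are supported. The names propagate_constants and OPS are hypothetical.

```python
import operator

# Assumed operator set for this sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "<": operator.lt}

def propagate_constants(block):
    """block: list of (dst, op, a, b); op is None for a plain 'dst = a'."""
    consts = {}                       # variables currently known to be constant
    out = []
    for dst, op, a, b in block:
        # Replace operands that are known constants.
        a = consts.get(a, a)
        b = consts.get(b, b)
        if op is None:
            value = a
        elif isinstance(a, int) and isinstance(b, int):
            value = OPS[op](a, b)     # constant folding
        else:
            value = None
        if isinstance(value, int):    # bool is a subclass of int in Python
            consts[dst] = value
            out.append((dst, None, value, None))
        else:
            consts.pop(dst, None)     # dst is no longer a known constant
            out.append((dst, op, a, b))
    return out

# b=2; i=1; a=b+1; c=2; t1=a<2
block = [("b", None, 2, None), ("i", None, 1, None),
         ("a", "+", "b", 1), ("c", None, 2, None), ("t1", "<", "a", 2)]
folded = propagate_constants(block)
print(folded)
```

Here the comparison folds to a constant as well, which is the "constant-value conditional" case: once t1 is known to be false, a branch on t1 can be resolved at compile time.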

Redundancy Elimination
• Several optimizations deal with locating and appropriately eliminating redundant calculations.
• These optimizations require data flow analysis
• They include
  – common subexpression elimination
  – loop-invariant code motion
  – partial-redundancy elimination
  – code hoisting

Common Subexpression Elimination
• Local common subexpression elimination
  – Performed within basic blocks
  – Algorithm sketch:
    • traverse the basic block from top to bottom
    • maintain a table of the expressions evaluated so far
      – if any operand of an expression is redefined, remove that expression from the table
    • modify applicable instructions as you go
      – generate a temporary variable, store the expression in it, and use the variable the next time the expression is encountered.

        x=a+b            t=a+b
        . . .     =>     x=t
        y=a+b            . . .
                         y=t

Common Subexpression Elimination

the table contains quintuples: (pos, opd1, opr, opd2, tmp)

    before              after
    c=a+b               t1=a+b
                        c=t1
    d=m*n               t2=m*n
                        d=t2
    e=b+d               t3=b+d
                        e=t3
    f=a+b               f=t1
    g=-b                g=-b
    h=b+a               h=t1      /* commutative */
    a=j+a               a=j+a
    k=m*n               k=t2
    j=b+d               j=t3
    a=-b                a=-b
    if m*n go to L      if t2 go to L
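
The example can be reproduced with a small Python sketch of local common subexpression elimination. Several things here are assumptions rather than the slides' method: instructions are tuples (dst, a, op, b) with a set to None for a unary operator, only binary expressions are entered in the table (which matches the result shown above, where -b is left alone), the final conditional branch is left out, and the names local_cse, show, and COMMUTATIVE are hypothetical. The stored position lets the first evaluation be rewritten once a reuse is found.

```python
COMMUTATIVE = {"+", "*"}

def local_cse(block):
    """block: list of (dst, a, op, b); a is None for a unary operator."""
    table = {}        # expression key -> [position of first evaluation, temp or None]
    emitted = []      # emitted[i] holds the instructions generated for block[i]
    temps = 0
    for i, (dst, a, op, b) in enumerate(block):
        if a is None:
            emitted.append([(dst, a, op, b)])        # unary: not entered in the table
        else:
            # Order the operands of commutative operators so a+b matches b+a.
            key = (op, *sorted((a, b))) if op in COMMUTATIVE else (op, a, b)
            if key in table:
                first, tmp = table[key]
                if tmp is None:
                    # First reuse: go back and store the expression in a temporary.
                    temps += 1
                    tmp = f"t{temps}"
                    table[key][1] = tmp
                    fdst, fa, fop, fb = block[first]
                    emitted[first] = [(tmp, fa, fop, fb), (fdst, tmp, None, None)]
                emitted.append([(dst, tmp, None, None)])
            else:
                table[key] = [i, None]
                emitted.append([(dst, a, op, b)])
        # dst has a new value: forget expressions that use it as an operand.
        table = {k: v for k, v in table.items() if dst not in k[1:]}
    return [instr for group in emitted for instr in group]

def show(instr):
    dst, a, op, b = instr
    if op is None:
        return f"{dst}={a}"
    return f"{dst}={op}{b}" if a is None else f"{dst}={a}{op}{b}"

block = [("c", "a", "+", "b"), ("d", "m", "*", "n"), ("e", "b", "+", "d"),
         ("f", "a", "+", "b"), ("g", None, "-", "b"), ("h", "b", "+", "a"),
         ("a", "j", "+", "a"), ("k", "m", "*", "n"), ("j", "b", "+", "d"),
         ("a", None, "-", "b")]
optimized = [show(instr) for instr in local_cse(block)]
print(optimized)
```

Temporaries are created lazily, on the first reuse, so expressions that are evaluated only once (such as j+a) do not get one.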

Common Subexpression Elimination
• Global common subexpression elimination
  – Performed on flow graph
  – Requires available expression information
    • In addition to finding what expressions are available at the endpoints of basic blocks, we need to know where each of those expressions was most recently evaluated (which block and which position within that block).

Common Subexpression Elimination
• Global common subexpression elimination
  – Algorithm sketch:
    For each block B and each statement x=y+z, s.t. y+z ∈ in[B]
      i.   Find the evaluation of y+z that reaches B, say w=y+z
      ii.  Create temporary variable t
      iii. Replace [w=y+z] with [t=y+z; w=t]
      iv.  Replace [x=y+z] with [x=t]
  – Notes:
    • This method will miss the fact that b and d have the same value:

        a = x+y
        c = x+y
        b = a*z
        d = c*z

      Mini quiz: which optimization can handle this?
      Answer: Value Numbering
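
To see why value numbering catches the case in the note, here is a minimal local value numbering sketch. It assumes straight-line code given as tuples (dst, a, op, b) and ignores commutativity and redefinition subtleties; the function name value_numbers is hypothetical.

```python
def value_numbers(block):
    """block: list of (dst, a, op, b). Returns the value number of each variable."""
    vn = {}          # variable name -> value number
    exprs = {}       # (op, value number of a, value number of b) -> value number
    counter = 0

    def number_of(var):
        nonlocal counter
        if var not in vn:             # first use of an input variable
            counter += 1
            vn[var] = counter
        return vn[var]

    for dst, a, op, b in block:
        # Expressions are compared by the values of their operands, not their names.
        key = (op, number_of(a), number_of(b))
        if key not in exprs:
            counter += 1
            exprs[key] = counter
        vn[dst] = exprs[key]          # dst now holds that value
    return vn

block = [("a", "x", "+", "y"), ("c", "x", "+", "y"),
         ("b", "a", "*", "z"), ("d", "c", "*", "z")]
vn = value_numbers(block)
print(vn["b"] == vn["d"])
```

Because a and c receive the same value number, a*z and c*z map to the same table key, so b and d are recognized as equal even though the two expressions are textually different.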

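Common Subexpression Elimination

[Figure: flow graph from entry to exit, before and after global common subexpression elimination. The edges are not recoverable from the extracted text, and the block boundaries below are inferred.]

    before      after
    c=a+b       t1=a+b
    d=a*c       c=t1
    e=d*d       d=a*c
                t2=d*d
                e=t2

    f=a+b       f=t1
    c=c*2       c=c*2
    c>d         c>d

    g=a*c       g=a*c

    g=d*d       g=t2

    g>10        g>10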

Loop-Invariant Code Motion
• What does it mean?
  – Computations that are performed in a loop and have the same value at every iteration are moved outside the loop.
• Before we go on: What is a loop?
  – A set of basic blocks with
    • a single entry point called the header, which dominates all the other blocks in the set, and
    • at least one way to iterate (i.e. go back to the header)
  – Block Bi dominates block Bj if every path from the flow graph entry to Bj goes through Bi
  – A loop can be identified by finding a flow graph edge Bj → Bi (called a back edge) s.t. Bi dominates Bj and then finding all blocks that can reach Bj without going through Bi

(Loops)

[Figure: flow graph with nodes entry, B1 to B10, and exit, shown next to its dominator tree]

The dominator tree shows the dominator relation: each node in the tree is the immediate dominator of its children.
Example: B7 is dominated by B1, B3, and B4, but its immediate (closest) dominator is B4.
Note: B5 does not dominate B7 because we can go from the entry to B7 through the B6 path.
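
Dominator sets can be computed with a simple iterative algorithm, sketched below. The edges of the figure did not survive extraction, so the edge set in succ is an assumption: it is one flow graph over B1 to B10 that is consistent with the statements about B7 above. The function names dominators and immediate_dominator are hypothetical, and the entry and exit nodes are left out.

```python
def dominators(succ, entry):
    """succ: {block: [successor blocks]}. Returns {block: set of its dominators}."""
    nodes = set(succ)
    preds = {n: {p for p in succ if n in succ[p]} for n in nodes}
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            # n is dominated by itself and by whatever dominates all its predecessors.
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def immediate_dominator(dom, n):
    """The strict dominator of n that is closest to n."""
    return max(dom[n] - {n}, key=lambda d: len(dom[d]))

# Assumed edge set (not recoverable from the slide text).
succ = {
    "B1": ["B2", "B3"], "B2": ["B3"], "B3": ["B4"], "B4": ["B3", "B5", "B6"],
    "B5": ["B7"], "B6": ["B7"], "B7": ["B4", "B8"], "B8": ["B3", "B9", "B10"],
    "B9": ["B1"], "B10": ["B7"],
}
dom = dominators(succ, "B1")
print(sorted(dom["B7"]), immediate_dominator(dom, "B7"))
```

The immediate dominator is picked as the strict dominator with the largest dominator set of its own, which works because the dominators of a block form a chain.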

(Loops)

[Figure: flow graph with nodes entry, B1 to B10, and exit]

    back edge: B9 → B1     loop: {B9, B8, B7, B10, B6, B5, B4, B3, B2, B1}
    back edge: B10 → B7    loop: {B10, B8, B7}
    back edge: B7 → B4     loop: {B7, B10, B6, B5, B8, B4}
    back edge: B8 → B3     loop: {B8, B7, B10, B6, B5, B4, B3}
    back edge: B4 → B3     loop: {B4, B7, B10, B8, B6, B5, B3}
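
Given a back edge, the loop is found by walking predecessors backwards from the tail of the edge without passing through the header. The sketch below assumes an edge set for the figure, since the edges did not survive extraction; the assumed graph was chosen so that it yields the five loops listed above. The function name natural_loop is hypothetical.

```python
def natural_loop(succ, tail, head):
    """Blocks of the loop defined by the back edge tail -> head."""
    preds = {n: [p for p in succ if n in succ[p]] for n in succ}
    loop = {head, tail}               # head is already in, so the walk stops there
    stack = [tail] if tail != head else []
    while stack:
        n = stack.pop()
        for p in preds[n]:
            if p not in loop:
                loop.add(p)
                stack.append(p)
    return loop

# Assumed edge set (not recoverable from the slide text).
succ = {
    "B1": ["B2", "B3"], "B2": ["B3"], "B3": ["B4"], "B4": ["B3", "B5", "B6"],
    "B5": ["B7"], "B6": ["B7"], "B7": ["B4", "B8"], "B8": ["B3", "B9", "B10"],
    "B9": ["B1"], "B10": ["B7"],
}
for tail, head in [("B9", "B1"), ("B10", "B7"), ("B7", "B4"), ("B8", "B3"), ("B4", "B3")]:
    print(tail, "->", head, sorted(natural_loop(succ, tail, head)))
```

Note that the back edges B8 → B3 and B4 → B3 share a header and produce the same set of blocks.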

Loop-Invariant Code Motion
• How do we identify loop-invariant computations?
  – Easy: use ud-chains
  – But also:
    • If a computation i depends on a loop-invariant computation j, then i is also loop-invariant.
    • This gives rise to an inductive definition of loop-invariant computations
    • An instruction is loop-invariant if, for each operand:
      1. The operand is constant, OR
      2. All definitions of that operand that reach the instruction are outside the loop, OR
      3. There is exactly one in-loop definition of the operand that reaches the instruction, and that definition is loop-invariant
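
The inductive definition translates into a fixed-point loop. The sketch below makes a simplifying assumption in place of real ud-chains: the loop body is a single basic block, so an in-loop definition reaches a use exactly when it is the only in-loop definition of that variable and it appears earlier in the block. Instructions are tuples (dst, operands), non-string operands are constants, and the name loop_invariant and the example body are hypothetical.

```python
def loop_invariant(body):
    """body: list of (dst, operands) for a single-block loop.
    Returns the positions of the loop-invariant instructions."""
    defs = {}                                  # variable -> positions defining it in the loop
    for pos, (dst, _) in enumerate(body):
        defs.setdefault(dst, []).append(pos)
    invariant = set()

    def operand_ok(opd, pos):
        if not isinstance(opd, str):
            return True                        # rule 1: constant
        where = defs.get(opd, [])
        if not where:
            return True                        # rule 2: defined only outside the loop
        # rule 3: one in-loop definition, earlier in the block, itself invariant
        return len(where) == 1 and where[0] < pos and where[0] in invariant

    changed = True
    while changed:                             # repeat until no new instruction qualifies
        changed = False
        for pos, (dst, operands) in enumerate(body):
            if pos not in invariant and all(operand_ok(o, pos) for o in operands):
                invariant.add(pos)
                changed = True
    return invariant

# a=b+1; c=2; d=a+d; f=1+a; i=i+1, with b defined outside the loop
body = [("a", ("b", 1)), ("c", (2,)), ("d", ("a", "d")), ("f", (1, "a")), ("i", ("i", 1))]
invariant = loop_invariant(body)
print(sorted(invariant))
```

Being loop-invariant is only the first requirement; whether an instruction may actually be moved still depends on the additional checks of the code motion algorithm.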

Loop-Invariant Code Motion
• Algorithm sketch:
  1. Find all loop-invariant instructions
  2. For each instruction i: x=y+z found in step 1, check
     i.   that its block dominates all exits of the loop
     ii.  that x is not defined anywhere else in the loop
     iii. that all uses of x in the loop can be reached only by i (i.e. its block dominates all uses of x)
  3. Move each instruction i that satisfies the requirements in step 2 to a newly created pre-header of the loop, making certain that any operands (such as y, z) have already had their definitions moved to the pre-header.
• Note: When applying loop-invariant code motion to nested loops, work from the innermost loop outwards.

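Loop-Invariant Code Motion

[Figure: flow graph of a loop, from entry to exit, with branch edges labeled T and F. The edges are not recoverable from the extracted text; block contents are listed in extraction order.]

    b=2; i=1
    a=b+1; c=2; i mod 2 = 0
    d=a+d; e=1+d
    d=-c; f=1+a
    i=i+1; a<2

Mini quiz: What happens if you perform constant propagation followed by constant folding after the loop-invariant code motion in this loop?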

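Loop-Invariant Code Motion

[Figure: the loop before and after loop-invariant code motion, from entry to exit, with branch edges labeled T and F. The edges are not recoverable from the extracted text.]

    before                        after
    b=2; i=1                      b=2; i=1
                                  a=b+1; c=2; t1=a<2
    a=b+1; c=2; i mod 2 = 0       i mod 2 = 0
    d=a+d; e=1+d                  d=a+d; e=1+d
    d=-c; f=1+a                   d=-c; f=1+a
    i=i+1; a<2                    i=i+1; t1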

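[Figure: the loop after loop-invariant code motion, shown before and after constant propagation and constant folding, from entry to exit, with branch edges labeled T and F. The edges are not recoverable from the extracted text.]

    before                        after constant propagation and constant folding
    b=2; i=1                      b=2; i=1
    a=b+1; c=2; t1=a<2            a=3; c=2; t1=false
    i mod 2 = 0                   i mod 2 = 0
    d=a+d; e=1+d                  d=a+d; e=1+d
    d=-c; f=1+a                   d=-c; f=1+a
    i=i+1; t1                     i=i+1; t1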