Dataflow Frameworks: Conclusion
Dataflow Analysis: Non-distributive Analysis

Announcements
- Go over Quiz 1
- Homework 1?
Spring 21 CSCI 4450/6450, A Milanova

Outline of Today's Class
- Dataflow frameworks, conclusion
  - Lattices (last class)
  - Transfer functions
  - Worklist algorithm
- MOP solution vs. MFP solution
- Non-distributive analyses
  - Constant propagation
  - Points-to analysis (next time)

Dataflow Framework
- Equations:
    in(j) = V { out(i) | i in pred(j) }
    out(j) = fj(in(j))
  where:
  - in(j), out(j) are elements of a property space
  - fj is the transfer function associated with node j
  - V is the merge operator

Dataflow Frameworks (cont.)
- The property space must be:
  1. A lattice (L, ≤)
  2. L satisfies the Ascending Chain Condition (ACC)
     - Requires that all ascending chains are finite

Dataflow Frameworks (cont.)
- The merge operator V must be the join of L
- In dataflow, L is often the lattice of the subsets of a finite set of dataflow facts D
  - Choose the universal set D (e.g., all definitions)
  - Figure out whether we have a may or a must problem
  - Choose the set ordering ≤
  - Since the merge operator must be the join of L, a may problem sets ≤ to subset and a must problem sets ≤ to superset

Example: Reach Lattice
- Property space is the lattice of the subsets (2^D, ≤) where
  - D is the set of all definitions in the program
  - ≤ is the subset relation
- Join is set union, as needed for Reach, which is a may problem
- The lattice has a 0, which is {}, and a 1, which is D
- Does the lattice satisfy the Ascending Chain Condition?

Reach Lattice
D = all definitions: {(x, 1), (x, 4), (a, 3)}
Poset is 2^D, ≤ is the subset relation

Program:
  1. x=a*b
  2. if y<=a*b
  3. a=a+1
  4. x=a*b
  5. goto 2

Hasse diagram, top to bottom:
  1 = D = {(x, 1), (x, 4), (a, 3)}
  {(x, 1), (x, 4)}   {(x, 4), (a, 3)}   {(x, 1), (a, 3)}
  {(x, 1)}   {(x, 4)}   {(a, 3)}
  0 = {}

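A minimal Python sketch of this lattice, assuming definitions are encoded as (variable, statement) tuples and lattice elements as frozensets. The names `leq`, `join`, and `chain` are illustrative choices, not part of the slides.

```python
# The three definitions of the example program, as (variable, statement) pairs.
D = frozenset({("x", 1), ("x", 4), ("a", 3)})

BOTTOM = frozenset()  # the 0 of the lattice
TOP = D               # the 1 of the lattice


def leq(a, b):
    """Partial order of the Reach lattice: subset."""
    return a <= b


def join(a, b):
    """Join (least upper bound) of the Reach lattice: set union."""
    return a | b


# One maximal ascending chain from 0 to 1. Because D is finite, no strictly
# ascending chain can have more than len(D) steps.
chain = [
    BOTTOM,
    frozenset({("x", 1)}),
    frozenset({("x", 1), ("x", 4)}),
    TOP,
]
```

Each step of `chain` adds exactly one definition, which is why its length is bounded by the size of D.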
(Monotone) Dataflow Framework
- A problem fits into the dataflow framework if
  - its property space is a lattice (L, ≤) that satisfies the Ascending Chain Condition,
  - its merge operator V is the join of L, and
  - its transfer function space F: L → L is monotone
- Thus, we can make use of a generic solution procedure, known as the worklist algorithm, the maximal fixpoint algorithm, or the fixpoint iteration algorithm

Transfer Functions
- The transfer functions f: L → L. Formally, the function space F is such that
  1. F contains all fj
  2. F contains the identity function id(x) = x
  3. F is closed under composition
  4. Each f in F is monotone

Monotonicity Property
- F: L → L is monotone if and only if:
  (1) for all a, b in L and f in F: a ≤ b implies f(a) ≤ f(b)
  or (equivalently):
  (2) for all x, y in L and f in F: f(x) V f(y) ≤ f(x V y)
- Theorem: Definitions (1) and (2) are equivalent.
  - Show that (1) implies (2)
  - Show that (2) implies (1)

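One way to carry out the two directions, written as a short derivation. This is a sketch that uses only two standard facts: V is the least upper bound, and a ≤ b holds exactly when a V b = b.

```latex
\textbf{(1) $\Rightarrow$ (2).} Since $x \le x \vee y$ and $y \le x \vee y$, monotonicity (1) gives
\[
  f(x) \le f(x \vee y) \quad\text{and}\quad f(y) \le f(x \vee y).
\]
So $f(x \vee y)$ is an upper bound of $f(x)$ and $f(y)$, and because $\vee$ is the least upper bound,
\[
  f(x) \vee f(y) \le f(x \vee y).
\]

\textbf{(2) $\Rightarrow$ (1).} Let $a \le b$, so $a \vee b = b$. Then
\[
  f(a) \le f(a) \vee f(b) \le f(a \vee b) = f(b).
\]
```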
Distributivity Property
- F: L → L is distributive if and only if for all x, y in L and f in F: f(x V y) = f(x) V f(y)
- Every distributive function is also monotone, but not the other way around
- Distributivity is a very nice property!

Monotonicity and Distributivity
- Is classical Reach distributive? Yes
- To show distributivity, show that for each j:
    ( (X U Y) ∩ pres(j) ) U gen(j) = ( (X ∩ pres(j)) U gen(j) ) U ( (Y ∩ pres(j)) U gen(j) )
- Derivation:
    ( (X U Y) ∩ pres(j) ) U gen(j)
    = ( (X ∩ pres(j)) U (Y ∩ pres(j)) ) U gen(j)
    = ( (X ∩ pres(j)) U gen(j) ) U ( (Y ∩ pres(j)) U gen(j) )

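The identity can also be checked by brute force on a small lattice. The sketch below assumes the three-definition universe from the Reach Lattice slide and uses the transfer function of statement 4 (x=a*b), which preserves only (a, 3) and generates (x, 4). The helper names are illustrative.

```python
from itertools import combinations

# Universe of definitions, encoded as (variable, statement) pairs.
D = frozenset({("x", 1), ("x", 4), ("a", 3)})


def powerset(s):
    """All subsets of s, i.e. every element of the lattice 2^D."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def make_transfer(pres, gen):
    """Reach transfer function: out = (in ∩ pres) U gen."""
    return lambda x: (x & pres) | gen


def is_distributive(f, lattice):
    """f(x V y) == f(x) V f(y) for all pairs, with V being set union."""
    return all(f(x | y) == f(x) | f(y) for x in lattice for y in lattice)


def is_monotone(f, lattice):
    """x <= y implies f(x) <= f(y), with <= being subset."""
    return all(f(x) <= f(y) for x in lattice for y in lattice if x <= y)


L = powerset(D)
# Statement 4 (x = a*b): kills both definitions of x, generates (x, 4).
f4 = make_transfer(pres=frozenset({("a", 3)}), gen=frozenset({("x", 4)}))
```

An exhaustive check like this is only feasible for tiny universes; the algebraic derivation is what establishes the property in general.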
Worklist Algorithm for Forward Dataflow Problems
/* Initialize to initial values; 1 is entry node of CFG */
in(1) = InitialValue                  /* Reach: inReach(1) = UNDEF */
for m = 2 to n do in(m) = 0           /* Reach: inReach(m) = {} */
W = {1, 2, …, n}                      /* put every node on the worklist */
while W ≠ Ø do {
  remove j from W
  out(j) = fj(in(j))                  /* Reach: outReach(j) = (inReach(j) ∩ pres(j)) U gen(j) */
  for i in successors(j)
    if not (out(j) ≤ in(i)) then {    /* Reach: if outReach(j) is not a subset of inReach(i) */
      in(i) = out(j) V in(i)          /* Reach: inReach(i) = outReach(j) U inReach(i) */
      W = W U { i }
    }
}

Worklist Algorithm for Forward Dataflow Problems (slightly different)
/* Initialize to initial values; 1 is entry node of CFG */
in(1) = InitialValue; out(1) = f1(in(1))
for m = 2 to n do in(m) = 0; out(m) = fm(0)
W = {2, …, n}   /* put every node but 1 on the worklist */
while W ≠ Ø do {
  remove j from W
  in(j) = V { out(i) | i is predecessor of j }
  out(j) = fj(in(j))
  if out(j) changed then
    W = W U { k | k is successor of j }
}

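A minimal Python sketch of the second formulation, run on Reach for the five-statement program from the Reach Lattice slide. Several things here are assumptions for illustration: the function name `worklist` and its parameters, the encoding of definitions as tuples, the use of the empty set as the initial value at entry, a FIFO worklist, and the requirement that the entry node has no predecessors.

```python
from functools import reduce


def worklist(nodes, preds, succs, transfer, bottom, join, entry, init):
    """Forward dataflow analysis; returns the (in, out) maps.

    Assumes the entry node has no predecessors, so in(entry) stays at init.
    """
    inn = {n: bottom for n in nodes}
    inn[entry] = init
    out = {n: transfer[n](inn[n]) for n in nodes}
    work = [n for n in nodes if n != entry]  # every node but the entry
    while work:
        j = work.pop(0)
        # in(j) is the join of out(i) over all predecessors i of j.
        inn[j] = reduce(join, (out[i] for i in preds[j]), bottom)
        new_out = transfer[j](inn[j])
        if new_out != out[j]:
            out[j] = new_out
            work.extend(k for k in succs[j] if k not in work)
    return inn, out


# Reach for: 1. x=a*b  2. if y<=a*b  3. a=a+1  4. x=a*b  5. goto 2
X1, A3, X4 = ("x", 1), ("a", 3), ("x", 4)
ALL = frozenset({X1, A3, X4})
nodes = [1, 2, 3, 4, 5]
succs = {1: [2], 2: [3], 3: [4], 4: [5], 5: [2]}  # exit edge of 2 omitted
preds = {1: [], 2: [1, 5], 3: [2], 4: [3], 5: [4]}
gen = {1: {X1}, 2: set(), 3: {A3}, 4: {X4}, 5: set()}
pres = {1: {A3}, 2: ALL, 3: {X1, X4}, 4: {A3}, 5: ALL}
transfer = {
    n: (lambda x, n=n: (x & frozenset(pres[n])) | frozenset(gen[n]))
    for n in nodes
}

in_reach, out_reach = worklist(
    nodes, preds, succs, transfer,
    bottom=frozenset(), join=lambda a, b: a | b, entry=1, init=frozenset(),
)
```

With these inputs, all three definitions reach statement 2, since (x, 1) arrives from statement 1 and (a, 3), (x, 4) arrive around the loop from statement 5.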
Example: Reach with Bitvectors
Bitvector positions, left to right: (i, 1) (k, 1) (k, 4) (k, 5) (i, 6)

Blocks:
  B1: i=0; k=0          defines (i, 1), (k, 1)
  B2: i<0
  B3: mod(i, 3) == 0
  B4: k=k-1             defines (k, 4)
  B5: k=k+1             defines (k, 5)
  B6: i=i+1             defines (i, 6)
  exit

        B1     B2     B3     B4     B5     B6
pres:   00000  11111  11111  10001  10001  01110
gen:    11000  00000  00000  00100  00010  00001

Initialization
in(B1) = 00000, out(B1) = 11000
in(B2) = 00000
out(B2) = 00000, out(B3) = 00000, out(B4) = 00100, out(B5) = 00010, out(B6) = 00001

Iteration
in(B1) = 00000, out(B1) = 11000
in(B2) = 11001, out(B2) = 11001
out(B3) = 11001
out(B4) = 10101, out(B5) = 10011
in(B6) = 10111, out(B6) = 00111
Worklist: W = { B2, B3, B4, B5, B6 }, …, W = { B6 }, W = { }, W = { B2 }

Iteration
in(B1) = 00000, out(B1) = 11000
in(B2) = 11111, out(B2) = 11111
out(B3) = 11111
out(B4) = 10101, out(B5) = 10011
in(B6) = 10111, out(B6) = 00111
Worklist: W = { B2 }, W = { B3 }, W = { B4, B5 }, W = { B5, B6 }, W = { B6 }, W = { }

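The bitvector form of the Reach transfer function is a bitwise AND followed by a bitwise OR. The sketch below replays this example under stated assumptions: the CFG edges (B1 to B2, B2 to B3, B3 to B4 and B5, B4 and B5 to B6, B6 back to B2) are inferred from the example's values rather than given explicitly, B2/B3 and B4/B5 are assumed to share their pres vectors, and a simple round-robin sweep stands in for the worklist.

```python
# Bit order, left to right: (i,1) (k,1) (k,4) (k,5) (i,6)
PRES = {"B1": 0b00000, "B2": 0b11111, "B3": 0b11111,
        "B4": 0b10001, "B5": 0b10001, "B6": 0b01110}
GEN = {"B1": 0b11000, "B2": 0b00000, "B3": 0b00000,
       "B4": 0b00100, "B5": 0b00010, "B6": 0b00001}
# Assumed predecessor edges (see lead-in).
PREDS = {"B2": ["B1", "B6"], "B3": ["B2"], "B4": ["B3"],
         "B5": ["B3"], "B6": ["B4", "B5"]}


def transfer(block, in_bits):
    """out = (in ∩ pres) U gen, on integers used as bit sets."""
    return (in_bits & PRES[block]) | GEN[block]


def solve():
    """Round-robin iteration to a fixpoint (not the worklist itself)."""
    inn = {b: 0 for b in PRES}
    out = {b: transfer(b, 0) for b in PRES}  # out(m) = f_m(0)
    changed = True
    while changed:
        changed = False
        for b in ["B2", "B3", "B4", "B5", "B6"]:
            in_bits = 0
            for p in PREDS[b]:
                in_bits |= out[p]            # merge is bitwise OR
            inn[b] = in_bits
            new_out = transfer(b, in_bits)
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return inn, out


def show(bits):
    """Render a bit set as a 5-character bitvector string."""
    return format(bits, "05b")


in_bits, out_bits = solve()
```

Calling `show(in_bits["B2"])` renders the value at the loop head in the same five-digit form used on the slides.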
Termination Argument
- Why does the algorithm terminate?
- Sketch of argument:
  - in(j), out(j) do not "shrink": in_n(j) ≤ in_{n+1}(j)
  - A node k is added to W only if some out(j) "changes up": out_n(j) < out_{n+1}(j)
  - Since out(j) is in L, and L satisfies the Ascending Chain Condition, out(j) changes at most h times, where h is the height of the lattice L

Correctness Argument
- Theorem: The worklist algorithm computes a solution that satisfies the dataflow equations
- Why? Sketch of argument:
  - Assume the algorithm terminates and there is a j such that in(j) = V { out(i) } does not hold
  - Thus, there is a path p to j such that out(i) ≤ in(j) does not hold, where i is the predecessor of j in p
  - Now assume the equations hold for the predecessor of j and arrive at a contradiction

Precision Argument
- Theorem: The algorithm computes the least solution of the dataflow equations.
  - Historically though, this solution is called the maximal fixpoint solution (MFP)
- I.e., for every node j, the worklist algorithm computes a solution of the dataflow equations, MFP(j) = {in(j), out(j)}, such that for every other solution {in'(j), out'(j)} we have in(j) ≤ in'(j) and out(j) ≤ out'(j) for every node j

Example
Program:
  1. z:=x+y
  2. if (z > 500)
  3. skip
Equations:
  inAvail(1) = Ø
  outAvail(1) = (inAvail(1) - Ez) U {x+y}
  inAvail(2) = outAvail(1) V outAvail(3)
  outAvail(2) = inAvail(2)
  inAvail(3) = outAvail(2)
  outAvail(3) = inAvail(3)
The equation for inAvail(2) is equivalent to inAvail(2) = {x+y} V inAvail(2); recall that V is ∩ (i.e., set intersection). The equations therefore have two solutions (the Solution 1 and Solution 2 columns): one with inAvail(2) = {x+y} and one with inAvail(2) = Ø.

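A small Python check that both candidate assignments satisfy the six equations. It assumes that Ez (the expressions mentioning z) is empty for this program and that the back edge goes from statement 3 to statement 2; the dictionary keys and helper names are illustrative.

```python
E_Z = frozenset()           # expressions that mention z: none here (assumption)
GEN1 = frozenset({"x+y"})   # expression generated by statement 1


def satisfies_equations(s):
    """True if the assignment s satisfies all six Avail equations."""
    return (
        s["in1"] == frozenset()
        and s["out1"] == (s["in1"] - E_Z) | GEN1
        and s["in2"] == s["out1"] & s["out3"]   # merge V is intersection
        and s["out2"] == s["in2"]
        and s["in3"] == s["out2"]
        and s["out3"] == s["in3"]
    )


def solution(loop_value):
    """Candidate solution in which every point inside the loop has loop_value."""
    return {"in1": frozenset(), "out1": GEN1,
            "in2": loop_value, "out2": loop_value,
            "in3": loop_value, "out3": loop_value}


with_expr = solution(frozenset({"x+y"}))
without_expr = solution(frozenset())
```

Both assignments are fixpoints. Under the superset ordering used for must problems, `with_expr` is the smaller element of the lattice, so it is the one the least-solution theorem refers to.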
Meet Over All Paths (MOP)
- Path: 1 → n2 → n3 → … → nk → n
- Desired dataflow information at n is obtained by traversing ALL PATHS from 1 (entry node) to n
- For every path p = (1, n2, n3, …, nk) we compute fnk(…fn2(f1(init(1))))
- The MOP at entry of n is V fnk(…fn2(f1(init(1)))) over ALL PATHS p from 1 to n

MOP vs. MFP
- MOP is an abstraction of the best solution computable with dataflow analysis
  - It is a common assumption in dataflow analysis that all program paths are executable
  - (Abstract interpretation and axiomatic semantics are more precise and rule out some infeasible paths)
- Recall that the MFP is the solution computed by the worklist algorithm

MOP vs. MFP
- For distributive problems MFP = MOP!
- Unfortunately, for monotone problems this is not true in general. But we still have a safe solution: it is a theorem that for monotone problems, MFP ≥ MOP

Safety of a Dataflow Solution
- A safe (also, correct or sound) solution X overestimates the "best" possible dataflow solution, i.e., X ≥ MOP
- Historically, an acceptable solution X is one that is better than what we can do with the MFP, i.e., X ≤ MFP
- Figure: lattice diagram marking the Acceptable and Safe regions relative to the MOP, with 0 at the bottom

Safe Solutions
- In may problems, 1 is the universal set of facts and the merge operator is set union
  - It is safe to err by saying that a fact reaches a node when in fact it doesn't
  - E.g., in Reach it is safe to err by adding a spurious definition; it is unsafe to err by omitting a definition (x, k) that reaches a node
- Safe is "larger" than the MOP: MOP ≤ X. Since ≤ in Reach is subset, safer solutions end up being larger sets (which is natural)

Safe Solutions: Reach
U = all definitions: {(x, 1), (x, 4), (a, 3)}
Poset is 2^U, ≤ is the subset relation

Program:
  1. x=a*b
  2. if y<=a*b
  3. a=a+1
  4. x=a*b
  5. goto 2

Hasse diagram, top to bottom:
  1 = U = {(x, 1), (x, 4), (a, 3)}
  {(x, 1), (x, 4)}   {(x, 4), (a, 3)}   {(x, 1), (a, 3)}
  {(x, 1)}   {(x, 4)}   {(a, 3)}
  0 = {}

Safe Solutions
- In must problems, 1 is the empty set and the merge operator is set intersection
  - It is safe to err by saying that a fact does not reach a node when in fact it does
  - E.g., it is safe to err by saying that an expression is NOT AVAILABLE when it may be available; it is unsafe to err by adding an expression that is unavailable along some path
- Safe means "larger" than the MOP under our partial order. In must problems ≤ is superset, so "safer" solutions end up being smaller sets

Safe Solutions: Avail
U = all expressions: {a*b, a+1, y*z}
Poset is 2^U, ≤ is the superset relation

Program:
  1. x:=a*b
  2. if y*z<=a*b
  3. a:=a+1
  4. x:=a*b
  5. goto 2

Hasse diagram, top to bottom:
  1 = {}
  {a*b}   {a+1}   {y*z}
  {a*b, a+1}   {a*b, y*z}   {a+1, y*z}
  0 = {a*b, a+1, y*z}

Precision of a Dataflow Solution
- A precise solution is one that is "close" to the MOP
  - A precise solution contains few spurious dataflow facts (spurious facts are what we call noise)
  - Unfortunately, for most problems even the MOP (an approximation itself!) is undecidable
- MOP ≤ X ≤ Y: X is more precise than Y
  - Usually, we can compare two solutions X and Y
  - But, for most problems, we have no way of knowing the "ground truth"

Constant Propagation (Simple)
- Problem statement: Does variable x hold a constant value at a given program point?
- Example:
    1. x=1; if (b>0)
    2. y=z+w; x=2       (one branch)
    3. y=0              (other branch)
    4. z=10*x
  in(1): x is not const     out(1): x is 1
  in(2): x is 1             out(2): x is 2
  in(3): x is 1             out(3): x is 1
  in(4): x is NOT a const!

Fit Analysis in Dataflow Framework
- If
  - the property space is a lattice (L, ≤) that satisfies the Ascending Chain Condition,
  - the merge operator V is the join of L, and
  - the function space F: L → L is monotone,
- then the analysis fits the monotone dataflow framework and can be solved using the worklist algorithm

Constant Propagation: Property Space
- Associate one of the following values with variable x at each program point

  value       meaning
  1 (or T)    x is NOT a constant
  C           x has constant value C
  0 (or ⊥)    x is unknown

Constant Propagation: Lattice
- Lattice (Lx, ≤): T at the top; the constants …, -2, -1, 0, 1, 2, … in the middle, pairwise incomparable; ⊥ at the bottom
- Dataflow lattice L is the product lattice of the Lx
  - For l1, l2 in L, l1 ≤ l2 iff l1x ≤ l2x for every variable x
  - l1 V l2 amounts to l1x V l2x for every variable x
- Merge operator is the join of L
- Does the product lattice satisfy the ACC?

Product Lattice
- E.g., <x=⊥, y=1, z=T>, <x=1, y=2, z=3>, etc. are lattice elements
- E.g., <x=1, y=2, z=T> ≤ <x=T, y=2, z=T>
- E.g., <x=1, y=3, z=T> V <x=T, y=2, z=T> = <T, T, T>

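The per-variable lattice and its product can be sketched in a few lines of Python. The encoding is an assumption made for illustration: the strings "top" and "bot" stand for T and ⊥, integers stand for constants, and a product-lattice element is a dict from variable names to values.

```python
TOP = "top"  # T: not a constant
BOT = "bot"  # ⊥: unknown


def join_value(a, b):
    """Join in the flat lattice Lx."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    if a == b:
        return a
    return TOP  # two different constants, or at least one T


def leq_value(a, b):
    """Partial order in Lx: ⊥ is below everything, T is above everything."""
    return a == BOT or b == TOP or a == b


def join(l1, l2):
    """Join in the product lattice: componentwise join."""
    return {v: join_value(l1[v], l2[v]) for v in l1}


def leq(l1, l2):
    """Order in the product lattice: componentwise order."""
    return all(leq_value(l1[v], l2[v]) for v in l1)


a = {"x": 1, "y": 3, "z": TOP}
b = {"x": TOP, "y": 2, "z": TOP}
```

Each component can move up at most twice (from ⊥ to a constant to T), which is the intuition behind the ACC question for the product lattice over a finite set of variables.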
Constant Propagation: Transfer Functions
- j: x = C
  fj: kill x → val, generate x → C
- j: x = y
  fj: kill x → val, add x → val', s.t. y → val' is in in(j)
- val and val' are one of
  - ⊥: bottom (unknown)
  - C: constant
  - T: top (not a constant)

Constant Propagation: Transfer Functions
- j: x = V1 Op V2
  fj: kill: x → val
      gen: if V1 → c1 and V2 → c2 are in in(j), then x → c1 Op c2
           else if V1 → T or V2 → T is in in(j), then x → T
           else x → ⊥
- Next, we'll argue monotonicity, which would give us that Constant Propagation is solvable by the worklist algorithm

Example
in(1) is T = <x → T, y → T, z → T>
  1. if (b>0)
  2. x=1; y=2      out(2): <x → 1, y → 2, z → T>
  3. x=2; y=1      out(3): <x → 2, y → 1, z → T>
  4. z=x+y         in(4): <x → T, y → T, z → T>    out(4): <x → T, y → T, z → T>
  5. w=10*z        in(5): <x → T, y → T, z → T>

Not Distributive! A Counter Example
in(1) is T
  1. if (b>0)
  2. x=1; y=2      out(2): x → 1, y → 2
  3. x=2; y=1      out(3): x → 2, y → 1
  4. z=x+y         in(4): x → T, y → T    out(4): z → T
  5. w=10*z        in(5): z → T
- f4(f2(f1(T))) computes z → 3
- f4(f3(f1(T))) computes z → 3
- Thus, MOP at 5, f4(f2(f1(T))) V f4(f3(f1(T))), computes z → 3
- MFP at 5 computes z → T (i.e., z is NOT a const)

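The counterexample can be replayed in Python. This is a sketch under an assumed encoding: "top" and "bot" stand for T and ⊥, abstract states are dicts, and the helpers `assign_const`, `assign_add`, and `seq` are illustrative names. Because the CFG is acyclic, the MFP at 5 is obtained by merging at the entry of statement 4 and then applying f4.

```python
TOP, BOT = "top", "bot"


def join_value(a, b):
    """Join in the flat constant lattice."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP


def join(s1, s2):
    """Componentwise join of two abstract states."""
    return {v: join_value(s1[v], s2[v]) for v in s1}


def assign_const(var, c):
    """Transfer function for var = c."""
    return lambda s: {**s, var: c}


def assign_add(var, a, b):
    """Transfer function for var = a + b."""
    def f(s):
        va, vb = s[a], s[b]
        if isinstance(va, int) and isinstance(vb, int):
            res = va + vb          # both operands are constants
        elif va == TOP or vb == TOP:
            res = TOP              # some operand is not a constant
        else:
            res = BOT              # some operand is still unknown
        return {**s, var: res}
    return f


def seq(*fs):
    """Compose transfer functions, applied left to right."""
    def run(s):
        for f in fs:
            s = f(s)
        return s
    return run


f1 = lambda s: s                                     # 1. if (b>0)
f2 = seq(assign_const("x", 1), assign_const("y", 2)) # 2. x=1; y=2
f3 = seq(assign_const("x", 2), assign_const("y", 1)) # 3. x=2; y=1
f4 = assign_add("z", "x", "y")                       # 4. z=x+y

entry = {"x": TOP, "y": TOP, "z": TOP}
mop_at_5 = join(f4(f2(f1(entry))), f4(f3(f1(entry))))  # join after f4
mfp_at_5 = f4(join(f2(f1(entry)), f3(f1(entry))))      # join before f4
```

The only difference between the two computations is where the join happens; f4 applied to the join loses the fact that x+y is 3 on both paths, which is exactly the failure of distributivity.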
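More Product Lattices
- Problem statement: Is integer variable x odd or even at program point n?
- Lattice Lx: T at the top; odd and even in the middle; ⊥ at the bottom
- Example (program from MIT OCW Program Analysis): a CFG with y=0, then the test if (x ≥ 10) with T and F branches, and the update x=x+1; y=y+2
  - Before y=0: x → T, y → T
  - After y=0: x → T, y → even
  - At the test and after the update: x → T, y → even
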
More Product Lattices
- Problem statement: What sign does a variable hold at a given program point, i.e., is it positive, negative, or 0?
- Lattice Lx: T at the top; -, 0, + in the middle; ⊥ at the bottom
- E.g., <x=+, y=T, z=0>