Control Flow Analysis
Mooly Sagiv
http://www.math.tau.ac.il/~sagiv/courses/pa.html
Tel Aviv University, 640-6706
Sunday 18-21, Schreiber 8
Monday 10-12, Schreiber 317
Textbook: Chapter 3 (Simplified + OO)

Goals
• Understand the problem of Control Flow Analysis
  – In functional languages
  – In object-oriented languages
  – Function pointers
• Learn the constraint-based program analysis technique
  – General
  – Usage for Control Flow Analysis
  – Algorithms
  – Systems
• Similarities between problems and techniques

Outline
• A Motivating Example (OO)
• The Control Flow Analysis Problem
• A Formal Specification
• Set Constraints
• Solving Constraints
• Adding Dataflow Information
• Adding Context Information
• Back to the Motivating Example
• Conclusions

A Motivating Example

    class Vehicle Object {
        int position = 10;
        void move(x1 : int) {
            position = position + x1;
        }
    }
    class Car extends Vehicle {
        int passengers;
        void await(v : Vehicle) {
            if (v.position < position)
                then v.move(position - v.position);
                else self.move(10);
        }
    }
    class Truck extends Vehicle {
        void move(x2 : int) {
            if (x2 < 55) position = position + x2;
        }
    }
    void main {
        Car c; Truck t; Vehicle v1;
        new c; new t;
        v1 := c;
        c.passengers := 2;
        c.move(60);
        v1.move(70);
        c.await(t);
    }
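To make the dispatch question concrete, the slide's program can be rendered as a small runnable Python sketch. This is an illustration only: `await` is a reserved word in Python, so the method is renamed `await_`, `passengers` is initialised to 0, and the `log` list is a hypothetical addition that records which `move` body actually runs.

```python
log = []  # hypothetical helper: records which move() implementation ran

class Vehicle:
    def __init__(self):
        self.position = 10

    def move(self, x1):
        log.append("Vehicle.move")
        self.position = self.position + x1

class Car(Vehicle):
    def __init__(self):
        super().__init__()
        self.passengers = 0  # assumption: the slide leaves this uninitialised

    def await_(self, v):  # `await` on the slide; renamed for Python
        if v.position < self.position:
            v.move(self.position - v.position)  # target depends on v's class
        else:
            self.move(10)

class Truck(Vehicle):
    def move(self, x2):
        log.append("Truck.move")
        if x2 < 55:
            self.position = self.position + x2

def main():
    c, t = Car(), Truck()
    v1 = c
    c.passengers = 2
    c.move(60)
    v1.move(70)
    c.await_(t)
    return c, t

c, t = main()
```

The call `v.move(...)` inside `await_` is the interesting site: its static type is `Vehicle`, yet in this run it reaches `Truck.move`. A control flow analysis has to determine such targets without running the program.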

The Control Flow Analysis (CFA) Problem
• Given a program in a functional programming language with higher-order functions (functions can serve as parameters and return values)
• Find out, for each function invocation, which functions may be applied
• Obvious in C without function pointers
• Difficult in C++, Java and ML
• The Dynamic Dispatch Problem

An ML Example

    let f = fn x => x 1;
        g = fn y => y + 2;
        h = fn z => z + 3;
    in (f g) + (f h)

An ML Example

    let f = fn x => /* {g, h} */ x 1;
        g = fn y => y + 2;
        h = fn z => z + 3;
    in (f g) + (f h)
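The annotation `{g, h}` can be observed dynamically with a small Python transcription of the ML example. This is only a sketch: the set `applied_at_x1` is a hypothetical instrumentation that records which functions reach the call site `x 1` in one run. A control flow analysis must compute a safe superset of that set statically.

```python
applied_at_x1 = set()  # hypothetical: names of functions applied at `x 1`

def g(y):
    return y + 2

def h(z):
    return z + 3

def f(x):
    applied_at_x1.add(x.__name__)  # record the callee reaching this site
    return x(1)

result = f(g) + f(h)  # (1 + 2) + (1 + 3)
```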

The Language FUN
• Notations
  – $e \in Exp$ // expressions (or labeled terms)
  – $t \in Term$ // terms (or unlabeled terms)
  – $f, x \in Var$ // variables
  – $c \in Const$ // constants
  – $op \in Op$ // binary operators
  – $l \in Lab$ // labels
• Abstract Syntax
  – $e ::= t^l$
  – $t ::= c \mid x$
      $\mid \mathtt{fn}\ x \Rightarrow e$ // function definition
      $\mid \mathtt{fun}\ f\ x \Rightarrow e$ // recursive function definition
      $\mid e_1\ e_2$ // function application
      $\mid \mathtt{if}\ e_0\ \mathtt{then}\ e_1\ \mathtt{else}\ e_2$
      $\mid \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2$
      $\mid e_1\ op\ e_2$
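One possible encoding of this abstract syntax, as a sketch: frozen Python dataclasses, with `Exp` pairing a term with its label. All class and field names here are illustrative choices, not part of the slides. The helper `labels` collects the labels of an expression.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Const:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Fn:  # fn x => e
    param: str
    body: "Exp"

@dataclass(frozen=True)
class Fun:  # fun f x => e
    name: str
    param: str
    body: "Exp"

@dataclass(frozen=True)
class App:  # e1 e2
    func: "Exp"
    arg: "Exp"

@dataclass(frozen=True)
class If:
    cond: "Exp"
    then: "Exp"
    other: "Exp"

@dataclass(frozen=True)
class Let:
    name: str
    bound: "Exp"
    body: "Exp"

@dataclass(frozen=True)
class Op:
    op: str
    left: "Exp"
    right: "Exp"

Term = Union[Const, Var, Fn, Fun, App, If, Let, Op]

@dataclass(frozen=True)
class Exp:  # e ::= t^l
    term: Term
    label: int

def labels(e):
    """Return the labels of e and of all its sub-expressions."""
    found = [e.label]
    for child in vars(e.term).values():
        if isinstance(child, Exp):
            found += labels(child)
    return found

# ((fn x => x^1)^2 (fn y => y^3)^4)^5
example = Exp(App(Exp(Fn("x", Exp(Var("x"), 1)), 2),
                  Exp(Fn("y", Exp(Var("y"), 3)), 4)), 5)
```

Frozen dataclasses are hashable, so terms built this way can be stored directly in the sets that the analysis manipulates.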

A Simple Example
$((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$

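An Example which Loops
$(\mathtt{let}\ g = (\mathtt{fun}\ f\ x \Rightarrow (f^1\ (\mathtt{fn}\ y \Rightarrow y^2)^3)^4)^5\ \mathtt{in}\ (g^6\ (\mathtt{fn}\ z \Rightarrow z^7)^8)^9)^{10}$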

The 0-CFA Problem
• Compute for every program a pair $(C, \rho)$ where:
  – $C$ is the abstract cache associating abstract values with labeled program points
  – $\rho$ is the abstract environment associating abstract values with variables
• Formally
  – $v \in Val = P(Term)$ // abstract values
  – $\rho \in Env = Var \to Val$ // abstract environment
  – $C \in Cache = Lab \to Val$ // abstract cache
  – For a function application $(t_1^{l_1}\ t_2^{l_2})^l$, $C(l_1)$ determines the functions that can be applied
• These maps are finite for a given program
• No context is considered for parameters

Possible Solutions for $((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$

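$(\mathtt{let}\ g = (\mathtt{fun}\ f\ x \Rightarrow (f^1\ (\mathtt{fn}\ y \Rightarrow y^2)^3)^4)^5\ \mathtt{in}\ (g^6\ (\mathtt{fn}\ z \Rightarrow z^7)^8)^9)^{10}$
Shorthand
  $\mathit{sf} = \mathtt{fun}\ f\ x \Rightarrow (f^1\ (\mathtt{fn}\ y \Rightarrow y^2)^3)^4$
  $\mathit{id}_y = \mathtt{fn}\ y \Rightarrow y^2$
  $\mathit{id}_z = \mathtt{fn}\ z \Rightarrow z^7$
$C(1) = \{\mathit{sf}\}$
$C(2) = \{\}$
$C(3) = \{\mathit{id}_y\}$
$C(4) = \{\}$
$C(5) = \{\mathit{sf}\}$
$C(6) = \{\mathit{sf}\}$
$C(7) = \{\}$
$C(8) = \{\mathit{id}_z\}$
$C(9) = \{\}$
$C(10) = \{\}$
$\rho(x) = \{\mathit{id}_y, \mathit{id}_z\}$
$\rho(y) = \{\}$
$\rho(z) = \{\}$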

Relationship to Dataflow Analysis
• Expressions are side-effect free
  – No entry/exit
• A single environment
• Represents information at different points via maps
• A single value for all occurrences of a variable
• Function applications act similarly to assignments
  – "Definition": a function abstraction is created
  – "Use": a function is applied

A Formal Specification of 0-CFA
• A Boolean function defines when a solution is acceptable
• $(C, \rho) \models e$ means that $(C, \rho)$ is acceptable for the expression $e$
• Defined by structural induction on $e$
• Every function is analyzed once
• Every acceptable solution is sound (conservative)
• Many acceptable solutions
• Generate a set of constraints
• Obtain the least acceptable solution by solving the constraints

Syntax Directed 0-CFA (Simple Expressions)
[const] $(C, \rho) \models c^l$ always
[var] $(C, \rho) \models x^l$ if $\rho(x) \subseteq C(l)$

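Syntax Directed 0-CFA: Function Abstraction
[fn] $(C, \rho) \models (\mathtt{fn}\ x \Rightarrow e)^l$ if:
  $(C, \rho) \models e$
  $\{\mathtt{fn}\ x \Rightarrow e\} \subseteq C(l)$
[fun] $(C, \rho) \models (\mathtt{fun}\ f\ x \Rightarrow e)^l$ if:
  $(C, \rho) \models e$
  $\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq C(l)$
  $\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \rho(f)$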

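Syntax Directed 0-CFA: Function Application
[app] $(C, \rho) \models (t_1^{l_1}\ t_2^{l_2})^l$ if:
  $(C, \rho) \models t_1^{l_1}$
  $(C, \rho) \models t_2^{l_2}$
  for all $(\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in C(l_1)$: $C(l_2) \subseteq \rho(x)$ and $C(l_0) \subseteq C(l)$
  for all $(\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in C(l_1)$: $C(l_2) \subseteq \rho(x)$ and $C(l_0) \subseteq C(l)$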

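Syntax Directed 0-CFA: Other Constructs
[if] $(C, \rho) \models (\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^l$ if:
  $(C, \rho) \models t_0^{l_0}$
  $(C, \rho) \models t_1^{l_1}$
  $(C, \rho) \models t_2^{l_2}$
  $C(l_1) \subseteq C(l)$
  $C(l_2) \subseteq C(l)$
[let] $(C, \rho) \models (\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^l$ if:
  $(C, \rho) \models t_1^{l_1}$
  $(C, \rho) \models t_2^{l_2}$
  $C(l_1) \subseteq \rho(x)$
  $C(l_2) \subseteq C(l)$
[op] $(C, \rho) \models (t_1^{l_1}\ op\ t_2^{l_2})^l$ if:
  $(C, \rho) \models t_1^{l_1}$
  $(C, \rho) \models t_2^{l_2}$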
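The acceptability relation can be read directly as a recursive Boolean function. The sketch below covers only the [const], [var], [fn] and [app] clauses and assumes a hypothetical tuple encoding: an expression is `(term, label)`, and a term is `("const", c)`, `("var", x)`, `("fn", x, body)` or `("app", e1, e2)`.

```python
def acceptable(C, rho, e):
    """Check (C, rho) |= e for the const/var/fn/app fragment of FUN."""
    term, l = e
    kind = term[0]
    if kind == "const":                       # [const]: always acceptable
        return True
    if kind == "var":                         # [var]: rho(x) is a subset of C(l)
        return rho[term[1]] <= C[l]
    if kind == "fn":                          # [fn]: body ok, term recorded at l
        return term in C[l] and acceptable(C, rho, term[2])
    if kind == "app":                         # [app]
        _, e1, e2 = term
        l1, l2 = e1[1], e2[1]
        if not (acceptable(C, rho, e1) and acceptable(C, rho, e2)):
            return False
        for t in C[l1]:                       # every function that may be applied
            _, x, (_, l0) = t
            if not (C[l2] <= rho[x] and C[l0] <= C[l]):
                return False
        return True
    raise ValueError(f"unsupported term kind: {kind}")

# ((fn x => x^1)^2 (fn y => y^3)^4)^5
fx = ("fn", "x", (("var", "x"), 1))
fy = ("fn", "y", (("var", "y"), 3))
prog = (("app", (fx, 2), (fy, 4)), 5)

least = ({1: {fy}, 2: {fx}, 3: set(), 4: {fy}, 5: {fy}},
         {"x": {fy}, "y": set()})
empty = ({l: set() for l in range(1, 6)}, {"x": set(), "y": set()})
full = ({l: {fx, fy} for l in range(1, 6)}, {"x": {fx, fy}, "y": {fx, fy}})
```

Under this encoding, `least` and `full` are both accepted while `empty` is rejected, which illustrates the slide's point that acceptable solutions are not unique and that the analysis looks for the least one.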

Possible Solutions for $((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$

Set Constraints
• A set of rules of the form:
  – $lhs \subseteq rhs$
  – $\{t\} \subseteq rhs' \Rightarrow lhs \subseteq rhs$ (conditional constraint)
  – lhs, rhs' are
    » terms
    » $C(l)$
    » $\rho(x)$
• The least solution $(C, \rho)$ can be found iteratively
  – Start with empty sets
  – Add terms when needed
• Efficient cubic graph-based solution
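The iterative scheme ("start with empty sets, add terms when needed") can be sketched as a naive fixed-point loop. This is not the cubic graph-based algorithm mentioned on the slide, and the tuple encoding of constraints is an assumption made for the example: `("sub", lhs, rhs)` for $lhs \subseteq rhs$ and `("cond", t, guard, lhs, rhs)` for $\{t\} \subseteq guard \Rightarrow lhs \subseteq rhs$, where a left-hand side is either `("term", t)` or a set variable such as `("C", 1)` or `("rho", "x")`.

```python
from collections import defaultdict

def solve(constraints):
    """Return the least solution as a mapping from set variables to sets."""
    sol = defaultdict(set)  # every C(l) and rho(x) starts empty

    def value(node):
        # A literal {t} denotes itself; anything else is a set variable.
        return {node[1]} if node[0] == "term" else sol[node]

    changed = True
    while changed:  # repeat until no constraint adds anything new
        changed = False
        for c in constraints:
            if c[0] == "cond":
                _, t, guard, lhs, rhs = c
                if t not in sol[guard]:
                    continue  # guard not (yet) satisfied
            else:
                _, lhs, rhs = c
            new = value(lhs) - sol[rhs]
            if new:
                sol[rhs] |= new
                changed = True
    return sol

demo = solve([
    ("sub", ("term", "a"), ("C", 1)),                   # {a} in C(1)
    ("sub", ("C", 1), ("C", 2)),                        # C(1) in C(2)
    ("cond", "a", ("C", 2), ("term", "b"), ("rho", "x")),
    ("cond", "c", ("C", 2), ("term", "d"), ("rho", "x")),  # guard never holds
])
```

Sets only grow and are bounded by the finitely many terms in the constraints, so the loop terminates.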

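Syntax Directed Constraint Generation (Part I)
$C_*[\![c^l]\!] = \{\}$
$C_*[\![x^l]\!] = \{\rho(x) \subseteq C(l)\}$
$C_*[\![(\mathtt{fn}\ x \Rightarrow e)^l]\!] = C_*[\![e]\!] \cup \{\{\mathtt{fn}\ x \Rightarrow e\} \subseteq C(l)\}$
$C_*[\![(\mathtt{fun}\ f\ x \Rightarrow e)^l]\!] = C_*[\![e]\!] \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq C(l)\} \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \rho(f)\}$
$C_*[\![(t_1^{l_1}\ t_2^{l_2})^l]\!] = C_*[\![t_1^{l_1}]\!] \cup C_*[\![t_2^{l_2}]\!]$
  $\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_2) \subseteq \rho(x) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in Term_*\}$
  $\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_0) \subseteq C(l) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in Term_*\}$
  $\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_2) \subseteq \rho(x) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in Term_*\}$
  $\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_0) \subseteq C(l) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in Term_*\}$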

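Syntax Directed Constraint Generation (Part II)
$C_*[\![(\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^l]\!] = C_*[\![t_0^{l_0}]\!] \cup C_*[\![t_1^{l_1}]\!] \cup C_*[\![t_2^{l_2}]\!] \cup \{C(l_1) \subseteq C(l)\} \cup \{C(l_2) \subseteq C(l)\}$
$C_*[\![(\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^l]\!] = C_*[\![t_1^{l_1}]\!] \cup C_*[\![t_2^{l_2}]\!] \cup \{C(l_1) \subseteq \rho(x)\} \cup \{C(l_2) \subseteq C(l)\}$
$C_*[\![(t_1^{l_1}\ op\ t_2^{l_2})^l]\!] = C_*[\![t_1^{l_1}]\!] \cup C_*[\![t_2^{l_2}]\!]$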

Set Constraints for $((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$

Iterative Solution to the Set Constraints for $((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$
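As a worked illustration for this expression, the sketch below generates constraints for the const/var/fn/app fragment and solves them by naive iteration. The tuple encoding of terms and constraints, and the names `gen`, `fn_terms` and `solve`, are assumptions for this example rather than anything taken from the slides.

```python
from collections import defaultdict

def fn_terms(e):
    """Collect every fn term occurring in e (the role of Term* on the slides)."""
    term, _ = e
    if term[0] == "fn":
        return {term} | fn_terms(term[2])
    if term[0] == "app":
        return fn_terms(term[1]) | fn_terms(term[2])
    return set()

def gen(e, all_fns):
    """Constraint generation for constants, variables, fn and application."""
    term, l = e
    kind = term[0]
    if kind == "const":
        return []
    if kind == "var":
        return [("sub", ("rho", term[1]), ("C", l))]
    if kind == "fn":
        return gen(term[2], all_fns) + [("sub", ("term", term), ("C", l))]
    if kind == "app":
        _, e1, e2 = term
        l1, l2 = e1[1], e2[1]
        cs = gen(e1, all_fns) + gen(e2, all_fns)
        for t in all_fns:
            _, x, (_, l0) = t
            cs.append(("cond", t, ("C", l1), ("C", l2), ("rho", x)))
            cs.append(("cond", t, ("C", l1), ("C", l0), ("C", l)))
        return cs
    raise ValueError(f"unsupported term kind: {kind}")

def solve(constraints):
    """Naive fixed-point iteration starting from empty sets."""
    sol = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for c in constraints:
            if c[0] == "cond":
                _, t, guard, lhs, rhs = c
                if t not in sol[guard]:
                    continue
            else:
                _, lhs, rhs = c
            src = {lhs[1]} if lhs[0] == "term" else sol[lhs]
            new = src - sol[rhs]
            if new:
                sol[rhs] |= new
                changed = True
    return sol

# ((fn x => x^1)^2 (fn y => y^3)^4)^5
fx = ("fn", "x", (("var", "x"), 1))
fy = ("fn", "y", (("var", "y"), 3))
prog = (("app", (fx, 2), (fy, 4)), 5)

constraints = gen(prog, fn_terms(prog))
sol = solve(constraints)
```

Under these assumptions the solver reports that only `fn x => x^1` can be applied at label 2, that `x` may be bound to `fn y => y^3`, and that `y` is never bound, because `fn y => y^3` is never applied.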

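Adding Data Flow Information
• Dataflow values can affect control flow analysis
• Example
  $(\mathtt{let}\ f = (\mathtt{fn}\ x \Rightarrow (\mathtt{if}\ (x^1 > 0^2)^3\ \mathtt{then}\ (\mathtt{fn}\ y \Rightarrow y^4)^5\ \mathtt{else}\ (\mathtt{fn}\ z \Rightarrow 5^6)^7)^8)^9\ \mathtt{in}\ ((f^{10}\ 3^{11})^{12}\ 0^{13})^{14})^{15}$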

Adding Data Flow Information
• Add a finite set of "abstract" values per program: $Data$
• Update $Val = P(Term \cup Data)$
  – $\rho \in Env = Var \to Val$ // abstract environment
  – $C \in Cache = Lab \to Val$ // abstract cache
• Generate extra constraints for data
• Obtain a more precise solution
• A special case of a product domain (4.4)
• The combination of two analyses may be more precise than either analysis alone

Adding Dataflow Information (Sign Analysis)
• Sign analysis
• Add a finite set of "abstract" values per program: $Data = \{P, N, TT, FF\}$
• Update $Val = P(Term \cup Data)$
• $d_c$ is the abstract value that represents a constant $c$
  – $d_3 = \{P\}$
  – $d_{-7} = \{N\}$
  – $d_{true} = \{TT\}$
  – $d_{false} = \{FF\}$
• Every operator is conservatively interpreted
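What "conservatively interpreted" can mean is sketched below for two operators over the slide's $Data$ set. The sketch assumes that P stands for strictly positive and N for strictly negative integers. Since the slide's domain has no abstract value for zero, `abs_const` rejects 0 and addition is left out. The function names are illustrative.

```python
P, N, TT, FF = "P", "N", "TT", "FF"

def abs_const(c):
    """d_c: the abstract value of a constant (0 is outside this domain)."""
    if c is True:
        return {TT}
    if c is False:
        return {FF}
    if c > 0:
        return {P}
    if c < 0:
        return {N}
    raise ValueError("0 has no abstract value in {P, N, TT, FF}")

def abs_mul(a, b):
    """Abstract multiplication: equal signs give P, different signs give N."""
    out = set()
    for s1 in a & {P, N}:
        for s2 in b & {P, N}:
            out.add(P if s1 == s2 else N)
    return out

def abs_less(a, b):
    """Abstract `<`: definite only when the two signs differ."""
    out = set()
    for s1 in a & {P, N}:
        for s2 in b & {P, N}:
            if s1 == N and s2 == P:
                out.add(TT)        # negative < positive always holds
            elif s1 == P and s2 == N:
                out.add(FF)        # positive < negative never holds
            else:
                out |= {TT, FF}    # same sign: either outcome is possible
    return out
```

Each abstract operator returns every result that some pair of concrete values with the given signs could produce, which is what makes the interpretation conservative.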

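Syntax Directed Constraint Generation (Part I)
$C_*[\![c^l]\!] = \{d_c \subseteq C(l)\}$
$C_*[\![x^l]\!] = \{\rho(x) \subseteq C(l)\}$
$C_*[\![(\mathtt{fn}\ x \Rightarrow e)^l]\!] = C_*[\![e]\!] \cup \{\{\mathtt{fn}\ x \Rightarrow e\} \subseteq C(l)\}$
$C_*[\![(\mathtt{fun}\ f\ x \Rightarrow e)^l]\!] = C_*[\![e]\!] \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq C(l)\} \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \rho(f)\}$
$C_*[\![(t_1^{l_1}\ t_2^{l_2})^l]\!] = C_*[\![t_1^{l_1}]\!] \cup C_*[\![t_2^{l_2}]\!]$
  $\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_2) \subseteq \rho(x) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in Term_*\}$
  $\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_0) \subseteq C(l) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in Term_*\}$
  $\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_2) \subseteq \rho(x) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in Term_*\}$
  $\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_0) \subseteq C(l) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in Term_*\}$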

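Syntax Directed Constraint Generation (Part II)
$C_*[\![(\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^l]\!] = C_*[\![t_0^{l_0}]\!] \cup C_*[\![t_1^{l_1}]\!] \cup C_*[\![t_2^{l_2}]\!]$
  $\cup\ \{d_{true} \subseteq C(l_0) \Rightarrow C(l_1) \subseteq C(l)\}$
  $\cup\ \{d_{false} \subseteq C(l_0) \Rightarrow C(l_2) \subseteq C(l)\}$
$C_*[\![(\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^l]\!] = C_*[\![t_1^{l_1}]\!] \cup C_*[\![t_2^{l_2}]\!] \cup \{C(l_1) \subseteq \rho(x)\} \cup \{C(l_2) \subseteq C(l)\}$
$C_*[\![(t_1^{l_1}\ op\ t_2^{l_2})^l]\!] = C_*[\![t_1^{l_1}]\!] \cup C_*[\![t_2^{l_2}]\!] \cup \{C(l_1)\ \widehat{op}\ C(l_2) \subseteq C(l)\}$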

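Adding Context Information
• The analysis does not distinguish between different occurrences of a variable (monovariant analysis)
• Example
  $(\mathtt{let}\ f = (\mathtt{fn}\ x \Rightarrow x^1)^2\ \mathtt{in}\ ((f^3\ f^4)^5\ (\mathtt{fn}\ y \Rightarrow y^6)^7)^8)^9$
• Source-to-source transformation can help (but may lead to code explosion)
• Example rewritten
  $\mathtt{let}\ f_1 = \mathtt{fn}\ x_1 \Rightarrow x_1\ \mathtt{in}\ \mathtt{let}\ f_2 = \mathtt{fn}\ x_2 \Rightarrow x_2\ \mathtt{in}\ (f_1\ f_2)\ (\mathtt{fn}\ y \Rightarrow y)$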

Simplified K-CFA
• Records the last k dynamic calls (for some fixed k)
• Similar to the call string approach
• Remember the context in which an expression is evaluated
• $Val$ is now $P(Term) \times Contexts$
  – $\rho \in Env = Var \times Contexts \to Val$
  – $C \in Cache = Lab \times Contexts \to Val$
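"Records the last k dynamic calls" can be sketched as a bounded call string: a context is a tuple of application labels, and entering a call appends the call's label and keeps only the last k entries. The function name `extend` and the tuple representation are assumptions made for illustration.

```python
def extend(context, call_label, k):
    """Context used inside a function called at `call_label` from `context`."""
    if k == 0:
        return ()  # 0-CFA: a single, empty context
    return (context + (call_label,))[-k:]  # keep the k most recent call labels
```

With k = 1 the context inside a call is just the label of the calling application, which gives the contexts [5] and [8] in the 1-CFA example.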

1-CFA
• $(\mathtt{let}\ f = (\mathtt{fn}\ x \Rightarrow x^1)^2\ \mathtt{in}\ ((f^3\ f^4)^5\ (\mathtt{fn}\ y \Rightarrow y^6)^7)^8)^9$
• Contexts
  – [] The empty context
  – [5] The application at label 5
  – [8] The application at label 8
• Polyvariant Control Flow
  $C(1, [5]) = \rho(x, [5]) = C(2, []) = C(3, []) = \rho(f, []) = (\{\mathtt{fn}\ x \Rightarrow x^1\}, [])$
  $C(1, [8]) = \rho(x, [8]) = C(7, []) = C(8, []) = C(9, []) = (\{\mathtt{fn}\ y \Rightarrow y^6\}, [])$

The Motivating Example

    class Vehicle Object {
        int position = 10;
        void move(x1 : int) {
            position = position + x1;
        }
    }
    class Car extends Vehicle {
        int passengers;
        void await(v : Vehicle) {
            if (v.position < position)
                then v.move(position - v.position);
                else self.move(10);
        }
    }
    class Truck extends Vehicle {
        void move(x2 : int) {
            if (x2 < 55) position = position + x2;
        }
    }
    void main {
        Car c; Truck t; Vehicle v1;
        new c; new t;
        v1 := c;
        c.passengers := 2;
        c.move(60);
        v1.move(70);
        c.await(t);
    }

Missing Material
• Efficient Cubic Solution to Set Constraints
  www.cs.berkeley.edu/Research/Aiken/bane.html
• Experimental results for OO
  www.cs.washington.edu/research/projects/cecil
• Operational Semantics for FUN (3.2.1)
• Defining acceptability without structural induction
  – More precise treatment of termination (3.2.2)
  – Needs co-induction (greatest fixed point)
• Using general lattices as dataflow values instead of powersets (3.5.2)
• Lower bounds
  – Decidability of JOP
  – Polynomiality

Conclusions
• Set constraints are quite useful
  – A uniform syntax
  – Can even deal with pointers
• But the semantic foundation is still based on abstract interpretation
• Techniques used in functional and imperative (OO) programming are similar
• Control and data flow analysis are related