
Program Analysis
Mooly Sagiv
http://www.cs.tau.ac.il/~msagiv/courses/pa01.html
Tel Aviv University, 640-6706
Textbook: Principles of Program Analysis, Chapter 1.1-4

Goals
  • Sketch existing program analysis techniques
  • Fixed point theory in a nutshell
  • Relationship between program analysis and operational semantics

Outline
  • The Nature of Program Analysis
  • Setting the Scene
    – The While language
    – Reaching Definitions
  • Program Analysis Techniques
    – Data Flow Analysis - the equational approach
    – The Constraint Based Approach
    – Mathematical background (next week)
    – Abstract Interpretation
    – Type and Effect Systems
    – Algorithms
    – Transformations

The Nature of Program Analysis
  • Compile-time techniques for predicting safe and computable approximations to the behaviors arising at run time when executing a program
  • Differences from operational semantics
    – The input state is usually not known at compile time
    – The compiler must always terminate (fast)
    – The compiler can generate suboptimal code

The Nature of Program Analysis: Erring on the Safe Side
    true-answer:  {d1, d2, …, dn}
    safe-answer:  {d1, d2, …, dn, dn+1, …, dn+m}
    Both are drawn from the universe {d1, …, dN}, so true-answer ⊆ safe-answer.

Example
    void main() {
        int y, z;
        read(x);
        if (x > 0) then y = 1;
        else {
            y = 2;
            f();    /* f does not change y */
        }
        /* y ∈ {1, 2} */
        z = y;
    }

Example
    void main() {
        int y, z;
        read(x);
        if (x > 0) then y = 1;
        else {
            y = 2;
            f();    /* f does not change y */
        }
        /* y ∈ {1} */
        z = y;
    }

Example
    void main() {
        int y, z;
        read(x);
        if (x > 0) then y = 1;
        else {
            y = 2;
            f();    /* f does not change y */
        }
        /* y ∈ {1, 2, 27} */
        z = y;
    }
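The three annotations on y in the example slides ({1, 2}, {1} and {1, 2, 27}) can be compared against the safety criterion. The following is a minimal sketch in Python, assuming that the values y can really take before z = y are exactly {1, 2}; the names is_safe, TRUE_ANSWER and candidates are illustrative and do not come from the slides.

```python
def is_safe(true_answer, reported):
    """A reported set is safe when it contains every value that can really occur."""
    return true_answer <= reported  # subset test on Python sets

# Assumption: the values y can actually hold just before `z = y`.
TRUE_ANSWER = {1, 2}

# The three annotations shown on the example slides.
candidates = [{1, 2}, {1}, {1, 2, 27}]
verdicts = [is_safe(TRUE_ANSWER, c) for c in candidates]

for c, ok in zip(candidates, verdicts):
    print(sorted(c), "safe" if ok else "unsafe")
```

Under this reading, {1} is the only unsafe annotation because it misses a value that can occur, while {1, 2, 27} is safe but less precise than {1, 2}.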

Semantics Based Program Analysis
  • Information obtained can be proved safe (or correct) w.r.t. the operational semantics
  • Earlier detection of conceptual compiler bugs
  • But without committing to semantics-directed program analysis
    – The structure of the program analysis algorithm need not reflect the structure of the semantics

The While Programming Language Revisited: Syntactical Categories
  • x, y ∈ Var     program variables
  • n ∈ Num        program numerals
  • a ∈ Aexp       arithmetic expressions
  • b ∈ Bexp       Boolean expressions
  • S ∈ Stm        program statements
  • l ∈ Lab        program labels
  • opa ∈ Opa      arithmetic operators
  • opb ∈ Opb      Boolean operators
  • opr ∈ Opr      relational operators

The While Programming Language Revisited: Abstract Syntax
    a ::= x | n | a1 opa a2
    b ::= true | false | not b | b1 opb b2 | a1 opr a2
    S ::= [x := a]^l | [skip]^l | S1 ; S2 | if [b]^l then S1 else S2 | while [b]^l do S

The Factorial Program
    [y := x]^1;
    [z := 1]^2;
    while [y > 1]^3 do (
        [z := z * y]^4;
        [y := y - 1]^5
    );
    [y := 0]^6

Example Program Analysis Problem: Reaching Definitions
  • An assignment (definition) of the form [x := a]^l may reach an elementary block l' if
    – there is an execution of the program that leads to l' where x was last assigned at l
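The definition above is phrased in terms of executions, so one way to get a feel for it is to run the factorial program and record, at the entry of each labelled block, where each variable was last assigned. The following is a rough sketch in Python, not the analysis itself: it observes one execution at a time, the function name run_factorial is hypothetical, and "?" marks a variable that has not been assigned yet.

```python
def run_factorial(x):
    """Run the labelled factorial program on input x.

    Returns the final environment and, for each label, the set of
    (variable, label) pairs observed at the entry of that block.
    """
    env = {"x": x}
    last = {"x": "?", "y": "?", "z": "?"}  # label of the most recent assignment
    seen = {label: set() for label in range(1, 7)}

    def visit(label):
        seen[label] |= set(last.items())

    visit(1); env["y"] = env["x"]; last["y"] = 1           # [y := x]^1
    visit(2); env["z"] = 1; last["z"] = 2                  # [z := 1]^2
    while True:
        visit(3)                                           # [y > 1]^3
        if not env["y"] > 1:
            break
        visit(4); env["z"] = env["z"] * env["y"]; last["z"] = 4   # [z := z * y]^4
        visit(5); env["y"] = env["y"] - 1; last["y"] = 5          # [y := y - 1]^5
    visit(6); env["y"] = 0; last["y"] = 6                  # [y := 0]^6
    return env, seen

env, seen = run_factorial(3)
print(env["z"], sorted(seen[3], key=str))
```

A single run only shows the definitions that reach a block along that run. A sound static analysis has to report a superset of what any run can observe.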

Reaching Definitions in Factorial


Usage of Reaching Definitions
  • Compiler optimizations
    – An occurrence of a variable x in an elementary block l is the constant n if, in all the reaching definitions (x, l'), l' assigns n to x
    – Loop invariant code motion
    – Program dependence graphs
  • Software quality tools
    – A usage of a variable x in an elementary block may be uninitialized if ...
    – Program slicing

Soundness in Reaching Definitions
  • Every reaching definition is detected
  • May include more definitions
    – Fewer constants may be identified
    – Not all the loop invariant code will be identified
    – May warn against uninitialized variables that are in fact initialized
  • But never miss a reaching definition
    – All constants are indeed such
    – Never move non-invariant code
    – Never miss an error

Program Analysis Techniques
  • Find sound solutions
  • Data Flow Analysis - the equational approach
  • Abstract Interpretation
  • The Constraint Based Approach
  • Type and Effect Systems

The Dataflow Analysis Approach
  • Generate a system of equations
  • Find the least solution in one of the following ways
    – Start with the minimum element and iterate until no more changes occur
    – Eliminate equations until every value is expressed in terms of the initial dataflow value when the program begins (not studied in this course)

Equations Generated for Reaching Definitions
  • Equations for elementary statements
    – [skip]^l:     RDexit(l) = RDentry(l)
    – [b]^l:        RDexit(l) = RDentry(l)
    – [x := a]^l:   RDexit(l) = (RDentry(l) - {(x, l') | l' ∈ Lab}) ∪ {(x, l)}
  • Equations for control flow constructs
    – RDentry(l) = ∪ RDexit(l'), taken over every l' that immediately precedes l in the control flow graph
  • An equation for the entry
    – RDentry(1) = {(x, ?) | x is a variable in the program}
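The right-hand sides of these equations are small set operations. The following is a minimal sketch in Python, assuming that definitions are represented as (variable, label) tuples and that "?" stands for the unknown initial definition; the function names rd_exit_assignment and rd_entry_join are illustrative only.

```python
def rd_exit_assignment(rd_entry, x, label):
    """RDexit(l) for [x := a]^l: drop every definition of x, then add (x, l)."""
    killed = {(v, d) for (v, d) in rd_entry if v == x}
    return (rd_entry - killed) | {(x, label)}

def rd_entry_join(predecessor_exits):
    """RDentry(l): union of RDexit(l') over the immediate predecessors l'."""
    result = set()
    for rd_exit in predecessor_exits:
        result |= rd_exit
    return result

# Worked instance: [y := x]^1 at the program entry.
entry_1 = {("x", "?"), ("y", "?"), ("z", "?")}
exit_1 = rd_exit_assignment(entry_1, "y", 1)
print(sorted(exit_1, key=str))
```

For skip and for tests [b]^l no helper is needed, since RDexit(l) is simply RDentry(l).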

    [y := x]^1;        RDentry(1) = {(x, ?), (y, ?), (z, ?)}
                       RDexit(1)  = (RDentry(1) - {(y, l') | l' ∈ Lab}) ∪ {(y, 1)}
    [z := 1]^2;        RDentry(2) = RDexit(1)
                       RDexit(2)  = (RDentry(2) - {(z, l') | l' ∈ Lab}) ∪ {(z, 2)}
    [y > 1]^3          RDentry(3) = RDexit(2) ∪ RDexit(5)
                       RDexit(3)  = RDentry(3)
    [z := z * y]^4;    RDentry(4) = RDexit(3)
                       RDexit(4)  = (RDentry(4) - {(z, l') | l' ∈ Lab}) ∪ {(z, 4)}
    [y := y - 1]^5;    RDentry(5) = RDexit(4)
                       RDexit(5)  = (RDentry(5) - {(y, l') | l' ∈ Lab}) ∪ {(y, 5)}
    [y := 0]^6;        RDentry(6) = RDexit(3)
                       RDexit(6)  = (RDentry(6) - {(y, l') | l' ∈ Lab}) ∪ {(y, 6)}

The Least Solution
  • 12 equations over the sets RDentry(1), …, RDexit(6)
  • Can be written in vector form: RD = F(RD)
  • Find the minimum solution
  • Every component is minimal
  • Since F is monotonic, such a solution always exists
  • Since the number of definitions is finite, the minimum solution can be computed iteratively

Chaotic Computation of the Least Solution
    Initialize
        RDentry(1) := {(x, ?), (y, ?), (z, ?)};
        RDentry(2) := ∅; RDentry(3) := ∅; RDentry(4) := ∅; RDentry(5) := ∅; RDentry(6) := ∅;
        RDexit(1) := ∅; RDexit(2) := ∅; RDexit(3) := ∅; RDexit(4) := ∅; RDexit(5) := ∅; RDexit(6) := ∅
    WL := {1, 2, 3, 4, 5, 6}
    while WL ≠ ∅ do
        select and remove an l from WL
        new := F_RDexit(l)(…)
        if (new ≠ RDexit(l)) then
            RDexit(l) := new
            for all l' such that RDexit(l) is used in F_RDentry(l')(…) do
                RDentry(l') := RDentry(l') ∪ new
                WL := WL ∪ {l'}
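The worklist loop above can be tried out on the factorial program. The following is a sketch in Python under a few stated assumptions: the program is encoded by hand as a table from labels to assigned variables (ASSIGNS) and a list of control flow edges (FLOW), both of which are hypothetical names, and the loop exit edge from the test at label 3 to label 6 is included.

```python
# Label -> variable assigned by that block (None for the test [y > 1]^3).
ASSIGNS = {1: "y", 2: "z", 3: None, 4: "z", 5: "y", 6: "y"}
# Control flow edges (l, l'): block l immediately precedes block l'.
FLOW = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 3), (3, 6)]
VARIABLES = ["x", "y", "z"]

def transfer(label, rd_entry):
    """Right-hand side of the RDexit equation for the block at `label`."""
    x = ASSIGNS[label]
    if x is None:
        return set(rd_entry)
    return {(v, d) for (v, d) in rd_entry if v != x} | {(x, label)}

def reaching_definitions():
    """Chaotic (worklist) iteration starting from the empty sets."""
    entry = {label: set() for label in ASSIGNS}
    exit_ = {label: set() for label in ASSIGNS}
    entry[1] = {(v, "?") for v in VARIABLES}
    worklist = set(ASSIGNS)
    while worklist:
        label = worklist.pop()
        new = transfer(label, entry[label])
        if new != exit_[label]:
            exit_[label] = new
            for (src, dst) in FLOW:
                if src == label:
                    entry[dst] |= new      # RDentry(l') := RDentry(l') ∪ new
                    worklist.add(dst)
    return entry, exit_

entry, exit_ = reaching_definitions()
for label in sorted(entry):
    print(label, sorted(entry[label], key=str), sorted(exit_[label], key=str))
```

The order in which labels are removed from the worklist is arbitrary here, which is the point of a chaotic iteration: the result does not depend on it.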

The Constraint Based Approach
  • Generate a system of set inclusions X ⊆ Y
  • Fits very well with functional and object-oriented programming languages, in which the control flow graph is not immediately derived from the syntax
  • Find the least solution

Constraints Generated for Reaching Definitions
  • Constraints for elementary statements
    – [skip]^l:     RDexit(l) ⊇ RDentry(l)
    – [b]^l:        RDexit(l) ⊇ RDentry(l)
    – [x := a]^l:   RDexit(l) ⊇ (RDentry(l) - {(x, l') | l' ∈ Lab}) ∪ {(x, l)}
  • Constraints for control flow constructs
    – RDentry(l) ⊇ RDexit(l') whenever l' immediately precedes l in the control flow graph
  • A constraint for the entry
    – RDentry(1) ⊇ {(x, ?) | x is a variable in the program}

Constraints vs. Equations: Reaching Definitions
  • Every solution to the system of equations is a solution to the set of constraints
        Constraints:  RDexit(l) ⊇ RDentry(l) - {(x, l') | l' ∈ Lab}
                      RDexit(l) ⊇ {(x, l)}
        Equation:     RDexit(l) = (RDentry(l) - {(x, l') | l' ∈ Lab}) ∪ {(x, l)}
  • But some solutions to the set of constraints are not solutions to the system of equations
  • The least solution is the same
  • The connection between constraints and equations is not always obvious
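A one-statement program is enough to see the difference between the two formulations. The following is a small sketch in Python for the single block [x := a]^1, assuming the same (variable, label) representation of definitions as in the slides; the helper names satisfies_equation and satisfies_constraints are hypothetical.

```python
def survivors(rd_entry, x):
    """Definitions in rd_entry that are not definitions of x."""
    return {(v, d) for (v, d) in rd_entry if v != x}

def satisfies_equation(rd_entry, rd_exit, x, label):
    return rd_exit == survivors(rd_entry, x) | {(x, label)}

def satisfies_constraints(rd_entry, rd_exit, x, label):
    return rd_exit >= survivors(rd_entry, x) and rd_exit >= {(x, label)}

rd_entry = {("x", "?")}
least = {("x", 1)}                  # the least solution for RDexit(1)
larger = {("x", 1), ("x", "?")}     # keeps a definition that was overwritten

print(satisfies_equation(rd_entry, least, "x", 1),
      satisfies_constraints(rd_entry, least, "x", 1))
print(satisfies_equation(rd_entry, larger, "x", 1),
      satisfies_constraints(rd_entry, larger, "x", 1))
```

The larger set is still a safe answer, which is why it passes the constraints, but only the least solution also satisfies the equation.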

The Control Flow Analysis Problem
  • Given a program in a functional programming language with higher-order functions (functions can serve as parameters and return values)
  • Find out, for each function invocation, which functions may be applied
  • Obvious in C without function pointers
  • Difficult in C++, Java and ML

An ML Example
    let f = fn x => x 1;
        g = fn y => y + 2;
        h = fn z => z + 3;
    in (f g) + (f h)

An ML Example
    let f = fn x => /* {g, h} */ x 1;
        g = fn y => y + 2;
        h = fn z => z + 3;
    in (f g) + (f h)

Control Flow Analysis (pure) ML
  • Find out, for every formal argument x, the set of expressions that may be bound to x in some execution
  • Analyze all function invocations
  • Generate a set of constraints
    – Label every program sub-expression
    – The Control Flow Analysis algorithm needs to find a pair (C, p) where
      » C(l) is a superset of the potential sub-expressions that can occur at l
      » p(x) is a superset of the potential sub-expressions that x can be bound to
  • Generate constraints for (C, p)
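To make the pair (C, p) concrete, here is a rough sketch in Python of a naive constraint solver for the earlier example (f g) + (f h). It rests on several assumptions that are not in the slides: terms are encoded as nested tuples, each function carries a hypothetical name tag, sub-expressions are identified by their structure instead of by explicit labels, and C and rho (standing in for p) track only the function abstractions a term may evaluate to.

```python
from collections import defaultdict

def subterms(term):
    """Yield a term and all of its sub-terms."""
    yield term
    for child in term[1:]:
        if isinstance(child, tuple):
            yield from subterms(child)

def cfa(bindings, body):
    """Iterate the constraints until nothing changes; returns (C, rho)."""
    terms = set(subterms(body))
    for _, term in bindings:
        terms |= set(subterms(term))
    C = {t: set() for t in terms}
    rho = defaultdict(set)
    changed = True

    def include(target, values):
        nonlocal changed
        if not values <= target:
            target |= values
            changed = True

    while changed:
        changed = False
        for x, term in bindings:                 # let x = term
            include(rho[x], C[term])
        for t in terms:
            if t[0] == "var":                    # variable occurrence
                include(C[t], rho[t[1]])
            elif t[0] == "fn":                   # abstraction evaluates to itself
                include(C[t], {t})
            elif t[0] == "app":                  # application e1 e2
                _, e1, e2 = t
                for fn in list(C[e1]):
                    _, _name, param, fn_body = fn
                    include(rho[param], C[e2])   # argument flows to the parameter
                    include(C[t], C[fn_body])    # result of the body flows back
    return C, rho

F = ("fn", "f", "x", ("app", ("var", "x"), ("num", 1)))
G = ("fn", "g", "y", ("add", ("var", "y"), ("num", 2)))
H = ("fn", "h", "z", ("add", ("var", "z"), ("num", 3)))
BINDINGS = [("f", F), ("g", G), ("h", H)]
BODY = ("add", ("app", ("var", "f"), ("var", "g")),
               ("app", ("var", "f"), ("var", "h")))

C, rho = cfa(BINDINGS, BODY)
print(sorted(fn[1] for fn in rho["x"]))
```

For this program the solver binds x to the functions tagged g and h, which matches the /* {g, h} */ annotation on the ML example.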

Simplified Example
    let f = [fn x => [[x]^1 1]^2]^3;
        g = [fn y => [[y]^4 + 2]^5]^6;
        h = [fn z => [[z]^7 + 3]^8]^9;
    in [f h]^10

Simplified Constraints
    let f = [fn x => [[x]^1 1]^2]^3;
        g = [fn y => [[y]^4 + 2]^5]^6;
        h = [fn z => [[z]^7 + 3]^8]^9;
    in [f h]^10

    C(1) ⊇ {[x]^1}
    C(2) ⊇ {[[x]^1 1]^2}
    …
    C(10) ⊇ {[f h]^10}

    C(1) ⊇ p(x)
    C(4) ⊇ p(y)
    C(7) ⊇ p(z)
    p(x) ⊇ C(9)
    C(10) ⊇ C(3)

OO Example
    class Vehicle extends Object {
        int position;
        public Vehicle(int start) { position = start; }
        void move(int x) { position += x; }
    }
    class Truck extends Vehicle {
        void move(int x) {
            if (x <= 55) then position += x;
        }
    }
    class Car extends Vehicle {
        int passengers;
        public Car(int start, int pass) { position = start; passengers = pass; }
        void await(Vehicle v) {
            if (v.position < position)
            then v.move(position - v.position)
            else this.move(10);
        }
    }
    class Main {
        Truck t = new Truck(10);
        Car c = new Car(10, 2);
        Vehicle v = c;
        c.move(60);
        v.move(70);
        c.await(t);
        …
    }
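For the calls in this example, one simple way to over-approximate which method bodies may run is to look only at the declared type of the receiver and at the class hierarchy. The sketch below uses class hierarchy analysis, a technique the slides do not name, written in Python. It assumes that Vehicle declares move, that Truck overrides it, and that Car declares only await; the tables PARENT and DECLARES and all function names are hypothetical.

```python
# Hypothetical encoding of the hierarchy in the example.
PARENT = {"Vehicle": None, "Truck": "Vehicle", "Car": "Vehicle"}
DECLARES = {"Vehicle": {"move"}, "Truck": {"move"}, "Car": {"await"}}

def ancestors(cls):
    """cls followed by its superclasses, nearest first."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = PARENT[cls]
    return chain

def subtypes(cls):
    """cls together with every class that transitively extends it."""
    return {c for c in PARENT if cls in ancestors(c)}

def lookup(cls, method):
    """Class whose declaration of `method` an object of class cls would run."""
    for c in ancestors(cls):
        if method in DECLARES[c]:
            return c
    return None

def possible_targets(static_type, method):
    """Declaring classes that a call on a receiver of static_type may reach."""
    return {lookup(c, method) for c in subtypes(static_type)} - {None}

print(sorted(possible_targets("Car", "move")))      # c.move(60)
print(sorted(possible_targets("Vehicle", "move")))  # v.move(70)
```

With receiver type Car only Vehicle's move can run, while a receiver declared as Vehicle may reach either Vehicle's or Truck's move. A more precise analysis could notice that v always holds a Car in Main.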


Conclusions
  • Two similar techniques
    – Find a minimal solution to a system of equations
    – Find a minimal solution to a set of constraints
  • Next week
    – Mathematical foundation
    – Semantic foundation