Context-Sensitivity Analysis Literature Review
by José Nelson Amaral (amaral@cs.ualberta.ca)
University of Alberta
Dimensions of Pointer Analysis
• Unification-based × Insertion-based
• Flow-sensitive × flow-insensitive
• Field-sensitive × field-insensitive × field-based
• Context-sensitive × context-insensitive
Andersen’s × Steensgaard’s (Example): Insertion × Unification
Program: a = &b;
Steensgaard: S = {(a, b)}   Graph: a → b
Andersen: S = {(a, b)}   Graph: a → b
CMPUT 680 - Compiler Design and Optimization
After (Shapiro/Horwitz, POPL 97)
Andersen’s × Steensgaard’s (Example)
Program: a = &b; b = &c;
Steensgaard: S = {(a, b); (b, c)}   Graph: a → b → c
Andersen: S = {(a, b); (b, c)}   Graph: a → b → c
After (Shapiro/Horwitz, POPL 97)
Andersen’s × Steensgaard’s (Example)
Program: a = &b; b = &c; if(cond) a = &d;
Steensgaard: S = {(a, b); (b, c)}   Graph: a → b → c
Andersen: S = {(a, b); (b, c)}   Graph: a → b → c
What should happen in each analysis?
After (Shapiro/Horwitz, POPL 97)
Andersen’s × Steensgaard’s (Example)
Program: a = &b; b = &c; if(cond) a = &d;
Steensgaard: S = {(a, b); (b, c); (a, d); (d, c)}   Graph: a → (b, d) → c
Andersen: S = {(a, b); (b, c); (a, d)}   Graph: a → b → c; a → d
After (Shapiro/Horwitz, POPL 97)
Andersen’s × Steensgaard’s (Example)
Program: a = &b; b = &c; if(cond) a = &d; d = &e;
Steensgaard: S = {(a, b); (b, c); (a, d); (d, c)}   Graph: a → (b, d) → c
Andersen: S = {(a, b); (b, c); (a, d)}   Graph: a → b → c; a → d
And now?
After (Shapiro/Horwitz, POPL 97)
Andersen’s × Steensgaard’s (Example)
Program: a = &b; b = &c; if(cond) a = &d; d = &e;
Steensgaard: S = {(a, b); (b, c); (a, d); (d, c); (d, e); (b, e)}   Graph: a → (b, d) → (c, e)
Andersen: S = {(a, b); (b, c); (a, d); (d, e)}   Graph: a → b → c; a → d → e
After (Shapiro/Horwitz, POPL 97)
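The two result sets for the four-statement program can be reproduced with a small sketch. It is an illustration under stated assumptions, not the algorithms as published: it handles only statements of the form x = &y, and the names andersen, steensgaard, and PROGRAM are hypothetical. The unification-based version keeps at most one pointed-to class per equivalence class and merges classes when a second target appears.

```python
def andersen(stmts):
    """Insertion-based: each `x = &y` only adds y to the points-to set of x."""
    return {(x, y) for x, y in stmts}


def steensgaard(stmts):
    """Unification-based: every class of variables has at most one pointee class."""
    parent, pointee = {}, {}   # union-find forest; class root -> pointed-to class

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path compression
            v = parent[v]
        return v

    def unify(u, v):
        u, v = find(u), find(v)
        if u == v:
            return
        parent[v] = u
        pu, pv = pointee.pop(u, None), pointee.pop(v, None)
        if pu is not None and pv is not None:
            unify(pu, pv)                   # merged classes must share one pointee
        target = pu if pu is not None else pv
        if target is not None:
            pointee[find(u)] = find(target)

    for x, y in stmts:                      # x = &y
        rx, ry = find(x), find(y)
        if rx in pointee:
            unify(pointee[rx], ry)          # second target: merge it with the first
        else:
            pointee[rx] = ry

    names = list(parent)
    pairs = set()
    for v in names:
        r = find(v)
        if r in pointee:
            t = find(pointee[r])
            pairs |= {(v, w) for w in names if find(w) == t}
    return pairs


# The example program; the branch on cond is ignored (both analyses are flow-insensitive).
PROGRAM = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "e")]
print(sorted(andersen(PROGRAM)))
print(sorted(steensgaard(PROGRAM)))
```

The unification-based result contains the spurious pairs (d, c) and (b, e) because b and d, and then c and e, are forced into the same class.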
Flow-sensitive × Flow-insensitive (Example)
Program: a = &b; b = &c; if(cond) a = &d; d = &e;
Strong update: after a = &d, a points to d and no longer points to b.
[Figure: flow-sensitive points-to graphs after each statement, next to the flow-insensitive results. Insertion based: a → b → c; a → d → e. Unification based: a → (b, d) → (c, e).]
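A minimal sketch of the flow-sensitive treatment of this program, assuming a per-program-point state that maps each variable to its set of targets. The helper names assign_addr and join are hypothetical. An assignment x = &y replaces the old targets of x (the strong update), and the states of the two branches are merged by union at the join.

```python
def assign_addr(state, x, y):
    """Transfer function for `x = &y` with a strong update: old targets of x are killed."""
    new = dict(state)
    new[x] = frozenset({y})
    return new


def join(s1, s2):
    """Control-flow merge: union the targets of every variable."""
    return {v: s1.get(v, frozenset()) | s2.get(v, frozenset())
            for v in s1.keys() | s2.keys()}


s = assign_addr({}, "a", "b")            # a = &b
s = assign_addr(s, "b", "c")             # b = &c
then_branch = assign_addr(s, "a", "d")   # if (cond) a = &d   (a -> b is killed here)
after_if = join(then_branch, s)          # the fall-through path still has a -> b
final = assign_addr(after_if, "d", "e")  # d = &e

print(sorted(then_branch["a"]))
print(sorted(final["a"]))
```

Inside the branch a points only to d, while after the join a may point to b or d. A flow-insensitive analysis reports the second set at every program point.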
Flow-sensitivity in SSA (incomplete slide)
All variables that had their address taken must have an “access path”, which is their address. They can only be referenced through their access paths.
Program:
    pb = &b; pc = &c; pd = &d; pe = &e;
    a0 = pb;
    *pb = pc;
    if(cond) a1 = pd;
    a2 = phi(a0, a1, FALSE, TRUE);
    *pd = pe;
In SSA, flow-sensitive information can be obtained from a single graph.
[Figure: points-to graph over pb, pc, pd, pe, a0, a1, a2, b, c, d, e]
Field-insensitive × Field-based × Field-sensitive analysis
• Field-insensitive: Each aggregate object is modeled by a single abstract variable.
• Field-based: An abstract variable models all instances of a field of an aggregate type.
• Field-sensitive: A unique abstract variable models each field of each aggregate object.
(Pearce. Kelly. Hankin. TOPLAS 07)
Field Sensitivity (Example)
Program:
    typedef struct { int *f1; int *f2; } aggr;
    aggr a, b;
    int *c, d, e, f;
    a.f1 = &d;
Assume a flow-insensitive, insertion-based analysis.
Field Insensitive: a → d
Field Based: f1 → d
Field Sensitive: af1 → d
(Pearce. Kelly. Hankin. TOPLAS 07)
Field Sensitivity (Example)
Program:
    typedef struct { int *f1; int *f2; } aggr;
    aggr a, b;
    int *c, d, e, f;
    a.f1 = &d;
    a.f2 = &f;
Assume a flow-insensitive, insertion-based analysis.
[Animation step: node f is added to each of the three graphs before its edge is drawn]
Field Insensitive: a → d
Field Based: f1 → d
Field Sensitive: af1 → d
(Pearce. Kelly. Hankin. TOPLAS 07)
Field Sensitivity (Example)
Program:
    typedef struct { int *f1; int *f2; } aggr;
    aggr a, b;
    int *c, d, e, f;
    a.f1 = &d;
    a.f2 = &f;
Assume a flow-insensitive, insertion-based analysis.
Field Insensitive: a → d; a → f
Field Based: f1 → d; f2 → f
Field Sensitive: af1 → d; af2 → f
(Pearce. Kelly. Hankin. TOPLAS 07)
Field Sensitivity (Example)
Program:
    typedef struct { int *f1; int *f2; } aggr;
    aggr a, b;
    int *c, d, e, f;
    a.f1 = &d;
    a.f2 = &f;
    b.f1 = &e;
Assume a flow-insensitive, insertion-based analysis.
[Animation step: node e is added to each of the three graphs before its edge is drawn]
Field Insensitive: a → d; a → f
Field Based: f1 → d; f2 → f
Field Sensitive: af1 → d; af2 → f
(Pearce. Kelly. Hankin. TOPLAS 07)
Field Sensitivity (Example)
Program:
    typedef struct { int *f1; int *f2; } aggr;
    aggr a, b;
    int *c, d, e, f;
    a.f1 = &d;
    a.f2 = &f;
    b.f1 = &e;
Assume a flow-insensitive, insertion-based analysis.
Field Insensitive: a → d; a → f; b → e
Field Based: f1 → d; f1 → e; f2 → f
Field Sensitive: af1 → d; af2 → f; bf1 → e
(Pearce. Kelly. Hankin. TOPLAS 07)
Field Sensitivity (Example)
Program:
    typedef struct { int *f1; int *f2; } aggr;
    aggr a, b;
    int *c, d, e, f;
    a.f1 = &d;
    a.f2 = &f;
    b.f1 = &e;
    c = a.f1;
Assume a flow-insensitive, insertion-based analysis.
Field Insensitive: a → d; a → f; b → e; c → d; c → f
Field Based: f1 → d; f1 → e; f2 → f; c → d; c → e
Field Sensitive: af1 → d; af2 → f; bf1 → e; c → d
(Pearce. Kelly. Hankin. TOPLAS 07)
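The three policies differ only in which abstract variable stands for an access base.field. The sketch below makes that choice explicit and runs a flow-insensitive, insertion-based fixed point over the four statements of the example. The statement encoding and the names slot, analyze, and PROGRAM are assumptions made for illustration.

```python
from collections import defaultdict


def slot(mode, base, field):
    """Abstract variable that models the access `base.field` under each policy."""
    if mode == "field-insensitive":
        return base            # one variable per aggregate object
    if mode == "field-based":
        return field           # one variable per field name, shared by all objects
    if mode == "field-sensitive":
        return (base, field)   # one variable per field of each object
    raise ValueError(mode)


def analyze(mode, program):
    pts = defaultdict(set)
    changed = True
    while changed:             # flow-insensitive: iterate to a fixed point
        changed = False
        for stmt in program:
            if stmt[0] == "store_addr":      # base.field = &target
                _, base, field, target = stmt
                dst, new = slot(mode, base, field), {target}
            else:                            # "load": dst = base.field
                _, dst, base, field = stmt
                new = set(pts[slot(mode, base, field)])
            if not new <= pts[dst]:
                pts[dst] |= new
                changed = True
    return pts


PROGRAM = [
    ("store_addr", "a", "f1", "d"),   # a.f1 = &d;
    ("store_addr", "a", "f2", "f"),   # a.f2 = &f;
    ("store_addr", "b", "f1", "e"),   # b.f1 = &e;
    ("load", "c", "a", "f1"),         # c = a.f1;
]

for mode in ("field-insensitive", "field-based", "field-sensitive"):
    print(mode, sorted(analyze(mode, PROGRAM)["c"]))
```

Only the field-sensitive policy concludes that c points to d alone. The field-insensitive policy adds f (another field of a), and the field-based policy adds e (the same field of another object).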
Field Sensitivity in C
• A field-sensitive analysis for C is fundamentally harder than a field-sensitive analysis for Java:
– C allows the address of a field to be taken
• Existing field-sensitive analyses for C:
– Yong. Horwitz. Reps. PLDI 99
– Chandra. Reps. PASTE 99
– Johnson. Wagner. USENIX 04
– Pearce. Kelly. Hankin. TOPLAS 07
What does context-sensitivity mean?
• Context-sensitive analysis: “the effects of a procedure call are estimated within a specific calling context”
• Context-insensitive analysis: “the effects of a procedure call summarizes the information for all calling contexts.”
(Emami. Ghiya. Hendren. PLDI 94)
Another definition
• “A context-insensitive (CI) algorithm does not distinguish the different calling contexts of a procedure, whereas a context-sensitive (CS) does.” (Zhu. Calman. PLDI 04)
• “CS treats multiple calls to a single procedure independently.” (Ruf. PLDI 95)
• “CI constructs a single approximation to a procedure’s effect on all of its callers.” (Ruf. PLDI 95)
Alternative definition: The calling context problem
• The calling context problem is “the problem of correctly accounting for the calling context of a called procedure.”
Horwitz. Reps. Binkley. TOPLAS 90
A stricter definition
• “A precise CS analysis yields results as precise as if they were computed on a modified program with all method calls inlined.”
– Requires a context-sensitive heap abstraction:
  • a separate abstraction is needed for each copy of an allocation statement
– Virtual call targets must be computed context-sensitively:
  • separately for each calling context;
  • using precise points-to information.
Sridharan. Bodik. PLDI 06
Context-Sensitive Example
• Two calls to a function foo produce different return values because the points-to sets at the points immediately before the two calls differ.
– In other words, the return value of foo depends on the context within which foo is invoked.
Context-sensitive example

#include <stdlib.h>

typedef int arr[10000];
arr a1, a2, a3;
int cond1, cond2;

/* Swaps *p2 and *p3 when cond2 is set; returns the pointer now stored in *p2. */
int *foo(int **p2, int **p3) {
    int *t;
    if (cond2) {
        t = *p2;
        *p2 = *p3;
        *p3 = t;
    }
    return *p2;
}

int main(int argc, char *argv[]) {
    int *x1, *x2, *x3, *y1, *y2, *y3;
    int *lp, *lq, r;
    cond1 = argc - 1;
    cond2 = argc - 2;
    a1[0] = argc; a2[0] = argc + 1; a3[0] = argc + 2;
    x1 = a1; x2 = a2; x3 = a3;
    y1 = a1; y2 = a2; y3 = a3;
    if (cond1) {
        x1 = a2;
        x2 = a1;
    }
    lp = foo(&x2, &x3);
    lq = foo(&y2, &y3);
    return (*lp + *lq);
}
Context-sensitive example (program as on the previous slide, with program points P1, P2, and P3 marked in main)
Is there an algorithm that “gets” this example?
• Emami, Ghiya, and Hendren (PLDI 94) should get it.
• We need to study the points-to sets that the algorithm computes at points P1, P2, and P3.
Context-sensitive example (program as on the previous slide)
• In the following animation:
– one kind of arrow from x to y means that x definitely points to y (variable x contains the address of variable y);
– the other kind of arrow from x to y means that x possibly points to y.
(Arrows are colored only for convenience in the animation: colored arrows represent new points-to relations that were not in the previous slide.)
Context-sensitive example (program as on the previous slides): points-to graph at P1
[Animation: the graph at P1, shown first as a question (“P1?”) and then as the answer. Nodes: x1, x2, x3, a1, a2, a3, y1, y2, y3. The edges are not recoverable from the extracted text.]
Context-sensitive example (program as on the previous slides): inside foo during the first call, lp = foo(&x2, &x3)
[Animation: points-to graphs at the points PA, PA’, PA”, PA’’’, and PB inside foo, each shown first as a question and then as the answer. Nodes: t, p2, p3, x1, x2, x3, a1, a2, a3, y1, y2, y3. The edges are not recoverable from the extracted text.]
Context-sensitive example (program as on the previous slides): points-to graph at P2
[Animation: the graph at P2, shown first as a question (“P2?”) and then as the answer. Nodes: lp, x1, x2, x3, a1, a2, a3, y1, y2, y3. The edges are not recoverable from the extracted text.]
Context-sensitive example (program as on the previous slides): inside foo during the second call, lq = foo(&y2, &y3)
[Animation: points-to graphs at the points PA and PB inside foo, each shown first as a question and then as the answer. Nodes: lp, x1, x2, x3, a1, a2, a3, y1, y2, y3, t, p2, p3. The edges are not recoverable from the extracted text.]
Context-sensitive example (program as on the previous slides): points-to graph at P3
[Animation: the graph at P3, shown first as a question (“P3?”) and then as the answer. Nodes: lp, lq, x1, x2, x3, a1, a2, a3, y1, y2, y3. The edges are not recoverable from the extracted text.]
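The effect of context on this program can be approximated with a small constraint solver. This is a sketch under loud assumptions and not the Emami, Ghiya, and Hendren algorithm: it is flow-insensitive and insertion-based, it models each array a1, a2, a3 as one abstract location, and it obtains context sensitivity by cloning the constraints of foo once per call site. All names (solve, foo_body, call, analyze, the "@1" and "@2" suffixes) are hypothetical.

```python
from collections import defaultdict


def solve(constraints):
    """Fixed point over addr (x = &y), copy (x = y), load (x = *y), store (*x = y)."""
    pts = defaultdict(set)

    def add(var, targets):
        before = len(pts[var])
        pts[var] |= targets
        return len(pts[var]) != before

    changed = True
    while changed:
        changed = False
        for kind, x, y in constraints:
            if kind == "addr":
                changed |= add(x, {y})
            elif kind == "copy":
                changed |= add(x, pts[y])
            elif kind == "load":
                for loc in list(pts[y]):
                    changed |= add(x, pts[loc])
            elif kind == "store":
                for loc in list(pts[x]):
                    changed |= add(loc, pts[y])
    return pts


def foo_body(ctx):
    """Constraints of foo, with its locals renamed per context."""
    p2, p3, t, tmp, ret = (n + ctx for n in ("p2", "p3", "t", "tmp", "ret"))
    return [("load", t, p2),                          # t = *p2;
            ("load", tmp, p3), ("store", p2, tmp),    # *p2 = *p3;
            ("store", p3, t),                         # *p3 = t;
            ("load", ret, p2)]                        # return *p2;


def call(ctx, arg2, arg3, result):
    """result = foo(&arg2, &arg3) in context ctx."""
    return [("addr", "p2" + ctx, arg2), ("addr", "p3" + ctx, arg3),
            ("copy", result, "ret" + ctx)]


MAIN = [("addr", "x1", "a1"), ("addr", "x2", "a2"), ("addr", "x3", "a3"),
        ("addr", "y1", "a1"), ("addr", "y2", "a2"), ("addr", "y3", "a3"),
        ("addr", "x1", "a2"), ("addr", "x2", "a1")]   # the if (cond1) branch


def analyze(context_sensitive):
    c1, c2 = ("@1", "@2") if context_sensitive else ("", "")
    cons = MAIN + call(c1, "x2", "x3", "lp") + call(c2, "y2", "y3", "lq")
    cons += foo_body(c1)
    if c2 != c1:
        cons += foo_body(c2)     # one clone of foo per call site
    return solve(cons)


for cs in (False, True):
    result = analyze(cs)
    print("CS" if cs else "CI", sorted(result["lp"]), sorted(result["lq"]))
```

With a single copy of foo, the targets of x2 leak into the second call, so lq may point to a1. With one clone per call site, lq is limited to a2 and a3.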
Solutions to the context-sensitive problem
• Create a context for each acyclic path from the root of the call graph to the current invocation (Emami. Ghiya. Hendren. PLDI 94).
• Create a context for each set of “relevant” aliases on entry to the procedure, also known as partial transfer functions (PTF) (Wilson. Lam. PLDI 95)
– “to answer simple queries (PTF) requires all the results to be computed.” (Whaley. Lam. PLDI 04)
(Descriptions taken from Ruf. PLDI 95)
Solutions to the context-sensitive problem (cont.)
• Tag each alias to allow a procedure to propagate only appropriate aliases to its callers:
– Use the aliases on entry to the enclosing procedure (Landi. Ryder. PLDI 93)
– Augment the summary with an abstraction of the call stack (Cooper 89 MSc. Thesis; Choi. Burke. Carini. POPL 93)
• A fully context-sensitive analysis is exponential in the size of the input program unless the number of contexts considered is limited somehow.
Solutions to the context-sensitive problem (cont.)
• Create a clone of the method for each context (Whaley. Lam. PLDI 04)
– Reports up to 5 × 10^23 clones (for a Java source code analyzer called pmd).
– No discussion of how the results of the analysis could be used in a real compiler.
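To see why the number of contexts explodes, one can count the call paths that reach each procedure in an acyclic call graph, which is the number of clones a one-context-per-path scheme needs. The sketch below is illustrative only: count_contexts and layered are hypothetical helpers, and the layered call graph is a made-up worst case, not data from any of the cited papers.

```python
from functools import lru_cache


def count_contexts(call_graph, root):
    """Number of distinct call paths from root to each procedure (acyclic graph).

    call_graph maps a caller to the list of its callees, one entry per call site.
    """
    callers = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            callers.setdefault(callee, []).append(caller)

    @lru_cache(maxsize=None)
    def paths(proc):
        if proc == root:
            return 1
        return sum(paths(c) for c in callers.get(proc, []))

    procs = set(call_graph) | set(callers)
    return {p: paths(p) for p in procs}


def layered(n):
    """f0 calls f1 from two call sites, f1 calls f2 from two call sites, and so on."""
    return {f"f{i}": [f"f{i + 1}", f"f{i + 1}"] for i in range(n)}


counts = count_contexts(layered(10), "f0")
print(counts["f10"])
```

Each extra layer doubles the count, so the number of contexts grows exponentially with the depth of the call graph even though the program grows only linearly.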
Ruf’s Evaluation of Context Sensitivity
• Compares flow-sensitive CS and CI analyses
– Benchmarks:
  • Largest benchmark has 6771 lines of code and 5435 pointer or function outputs in the analysis.
  • Sparse call graphs (4.2 callers/procedure on average, 54% of procedures have a single caller)
  • Shallow nesting of pointer datatypes: most pointers reference scalar datatypes.
– CI finds that on average each memory operation references very few locations.
– CS analysis generates 2% fewer points-to pairs.
– CS does not affect the indirect memory references at all.
Ruf. PLDI 95
Definition of a context
• “A context is a static abstraction of a method invocation”
– A context-sensitive analysis “distinguishes invocations if their context is different”
Lhotak. Hendren. CC 06
Invocation (Context) Abstractions
• call sites: the context of an invocation is the program statement from which the method was invoked.
– Derived from the call-string abstraction: a different approximation is computed for each distinct path in the call graph (defined by Sharir. Pnueli 1981).
• receiver object: the context is the static abstraction of the object on which the method was invoked.
– (defined by Milanova. Rountev. Ryder. ISSTA 02)
Lhotak. Hendren. CC 06
1-level call string sensitivity (Liang. Pennings. Harrold. PASTE 05)

 1 class A {
 2   Object f;
 3   Object get() {
 4     return new Object();
 5   }
 6   A() {
 7     this.f = this.get();
 8   }
 9   static void main() {
10     A a1 = new A();
11     A a2 = new A();
12     Object p = a2.get();
13     p.toString();
14   }
15 }
Points-to Graph Using 1-level Call String Sensitivity (Liang. Pennings. Harrold. PASTE 05; class A program as on the previous slide)
• A node is a variable or an instance. An edge is a variable reference or an instance field reference.
• get:ret7 is a special local variable that represents the return value of method get().
• The call string is limited to size one. Node o7#4 represents the object allocated at line 4 because of a call to get from line 7.
• The analysis cannot distinguish between the object allocations initiated by lines 10 and 11 because in both cases the new object is created at line 4 through a call from line 7.
[Animation: the graph is built up statement by statement. Final nodes: A:this10, A:this11, a1, a2, get:this7, get:this12, get:ret7, get:ret12, p, o10, o11, o7#4, o12#4. In the final graph, field f of both o10 and o11 refers to o7#4, and p refers to o12#4.]
1-level context-bound receiver-object sensitivity (Liang. Pennings. Harrold. PASTE 05; class A program as on the previous slides)
• Node o10#4 represents the object created at line 4 when get is invoked on the object allocated at line 10. The context is given by the receiver object and is independent of the call chain that leads to the object creation.
• Now the objects created for lines 10 and 11 have distinct abstract representations.
[Animation: the receiver-object graph is built up next to the 1-level call string graph of the previous slides. Final nodes: A:this10, A:this11, a1, a2, get:this10, get:this11, get:ret10, get:ret11, p, o10, o11, o10#4, o11#4. In the final graph, field f of o10 refers to o10#4, field f of o11 refers to o11#4, and p refers to o11#4.]
Context-insensitive analysis compared with 1-level receiver object and 1-level call string (Liang. Pennings. Harrold. PASTE 05; class A program as on the previous slides)
• Without context, there is a single abstraction, o4, to represent all objects allocated at line 4.
[Animation: the context-insensitive graph is built up next to the two context-sensitive graphs of the previous slides. Final nodes: A:this, get:this, get:ret, a1, a2, p, o10, o11, o4.]
Strings of Contexts
• A context of a method invocation i can be defined by a context string that represents the top invocations on the stack when i is invoked.
• Managing unbounded growth in the number of contexts:
  – k-limiting: Limit the length of each context string to k.
  – cycle collapsing: Collapse all cycles in the context-insensitive call graph into a single context.
    • Used by Zhu/Calman, PLDI 04 and Whaley/Lam, PLDI 04
(Lhoták/Hendren, CC 06)

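A minimal sketch of k-limiting, assuming a context string is stored as a list of call-site labels with the most recent call last. The class name `KLimitedCallString`, the method `push`, and labels such as "c10" are hypothetical and do not come from the cited papers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class KLimitedCallString {
    // Context of a callee invoked at callSite while the caller runs under
    // callerContext, keeping only the k most recent call sites.
    static List<String> push(List<String> callerContext, String callSite, int k) {
        List<String> extended = new ArrayList<>(callerContext);
        extended.add(callSite);                       // most recent call site goes last
        int from = Math.max(0, extended.size() - k);  // drop the oldest entries beyond k
        return new ArrayList<>(extended.subList(from, extended.size()));
    }

    public static void main(String[] args) {
        // main calls new A() at line 10, whose constructor calls get() at line 7.
        List<String> ctx = new ArrayList<>();
        ctx = push(ctx, "c10", 1);
        ctx = push(ctx, "c7", 1);
        System.out.println(ctx); // prints [c7]
    }
}
```

With k = 1 the constructor calls from lines 10 and 11 both reach get() under the same context, which is why a 1-level call string cannot tell them apart.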
Equivalent Contexts
• Two contexts are equivalent if their points-to relations are the same.
  – The number of distinct method-context pairs indicates how worthwhile context sensitivity may be in improving the precision of points-to sets.
(Lhoták/Hendren, CC 06)

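One way to count equivalence classes for a single method, as a rough sketch: group its contexts by the points-to relation computed under each. The encoding of a relation as a set of "variable->object" strings, and the names `EquivalentContexts`, `rel`, and `countDistinct`, are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only.
final class EquivalentContexts {
    // Builds a points-to relation from "variable->object" edges.
    static Set<String> rel(String... edges) {
        return new HashSet<>(Arrays.asList(edges));
    }

    // Number of distinct contexts of one method: contexts are equivalent
    // exactly when their points-to relations are equal sets.
    static int countDistinct(Map<String, Set<String>> relationByContext) {
        Set<Set<String>> classes = new HashSet<>(relationByContext.values());
        return classes.size();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> byContext = new HashMap<>();
        byContext.put("ctxA", rel("p->o1"));
        byContext.put("ctxB", rel("p->o1")); // same relation as ctxA
        byContext.put("ctxC", rel("p->o2"));
        System.out.println(countDistinct(byContext)); // prints 2
    }
}
```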
Call Site × Receiver Object Context Sensitivity

 1 class A {
 2   Object f;
 3   Object get() {
 4     return new Object();
 5   }
 6   A() {
 7     this.f = this.get();
 8   }
 9   static void main(String[] args) {
10     A a1 = new A();
11     A a2 = new A();
12     Object p = a2.get();
13     p.toString();
14   }
15 }

• Call-site Sensitivity: The context of an invocation is the program statement from which the method is invoked.
• Receiver-Object Sensitivity: The context of an invocation is the abstraction of the object on which the method is invoked.
[Figure labels: get:ret 11, o11#4; get:ret 12, p, o12#4]

Call Site × Receiver Object Context Sensitivity (same class A example)
• Hybrid Sensitivity: The context of an invocation is the abstraction of both the call site and the object on which the method is invoked.
• Using a limit of 1 for string and object.
[Figure labels: get:ret 12, p, (c7, o11)#4]

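The three choices of context can be compared side by side in a short sketch, assuming a limit of 1 and string labels for call sites ("c7") and receiver abstractions ("o11"). The class `ContextChoice` and its methods are hypothetical; the `context#line` naming of heap objects follows the notation on the slides.

```java
// Hypothetical helper for illustration only.
final class ContextChoice {
    enum Kind { CALL_SITE, RECEIVER_OBJECT, HYBRID }

    // callSite: label of the invoking statement; receiver: abstraction of
    // the object on which the method is invoked.
    static String contextOf(Kind kind, String callSite, String receiver) {
        switch (kind) {
            case CALL_SITE:       return callSite;
            case RECEIVER_OBJECT: return receiver;
            case HYBRID:          return "(" + callSite + ", " + receiver + ")";
            default: throw new IllegalArgumentException("unknown kind");
        }
    }

    // Name of the object allocated at allocLine under the given context.
    static String heapName(String context, int allocLine) {
        return context + "#" + allocLine;
    }

    public static void main(String[] args) {
        // get() invoked at line 7 on the object created at line 11.
        for (Kind kind : Kind.values()) {
            System.out.println(kind + ": " + heapName(contextOf(kind, "c7", "o11"), 4));
        }
        // prints CALL_SITE: c7#4, RECEIVER_OBJECT: o11#4, HYBRID: (c7, o11)#4
    }
}
```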
Is CS (in Java) Worth It? (number of contexts)

                         Object-sensitivity           Call-site string    Collap. cycles
Bench     Insensitive    1      2      3      1H      1     2     1H      contexts     max k
Total
compress  2597           13.7   113    1517   13.4    6.5   237   6.5     2.9×10^4     21
jess      3216           19.0   305    5394   18.6    6.7   207   [?]     6.1×10^6     24
jython    4401           18.8   384    -      18.3    6.7   162   6.7     2.1×10^15    72
Distinct
compress  2597           8.4    9.9    11.3   12.1    2.4   3.9   4.9     3.3
jess      3216           8.9    10.6   12.0   13.9    2.6   4.2   5.0     3.9
jython    4402           9.9    55.9   -      15.6    2.5   4.3   4.6     4.0

• The total number of contexts is the product of the number in the column and the number of methods.
• Insensitive: 1 context per method.
• 1, 2, 3: Pointers are context-sensitive but pointer targets are not.
• 1H: Both pointers and pointer targets are modeled with context strings of maximum length 1.
(Lhoták/Hendren, CC 06)

Is CS (in Java) Worth It? (number of contexts, observations)
• The total number of contexts is large, but far fewer of them are distinct.
• Collapsing cycles models large parts of the call graph context-insensitively.
(Lhoták/Hendren, CC 06)

Is CS (in Java) Worth It? (Virtual Call Resolution)
• Number of potentially polymorphic call sites (non-library code).
Columns: Bench, CHA, Insensitive, Object-sensitivity (1, 2, 3, 1H), Call-site string (1, 2, 1H)
[Table partially recovered: each row lists its surviving values in their original order; some cells are missing.]
javac     908   737   720   720
soot-c    1748  983   913   913   938
polyglot  1332  744   592   592   585   592   592
bloat     2503  1079  962   -     961   962   962
pmd       2868  1224  1193  1163  1205
(Lhoták/Hendren, CC 06)

Application of CS in Java (Cast Safety)
• A cast can potentially fail if the analysis cannot prove statically that the new type is a supertype of the original type.
• Cast Safety Analysis determines which casts cannot fail.
• A 1H analysis reduced the number of potentially failing casts from 3539 to 1017 in the polyglot benchmark.
(Lhoták/Hendren, CC 06)

Is CS (in Java) Worth It? (Lhoták-Hendren Conclusions)
• CS slightly improves call graph precision.
• CS yields a more significant improvement in virtual call resolution.
• A 1-object-sensitive or a 1H-object-sensitive analysis seems to be the best tradeoff.
• Extending the length of context strings in an object-sensitive analysis yields little benefit.
• Collapsing cycles in the call graph is not a good idea for Java.
(Lhoták/Hendren, CC 06)

Liang/Harrold Evaluate CS on Andersen’s Analysis for Java
• CS results in more precise reference information in some benchmarks.
  – Both call-string contexts and receiver contexts are useful (in different benchmarks).
  – In some benchmarks CS makes no difference.
• They use precise models to simulate collection and map classes.
(Liang/Pennings/Harrold, PASTE 05)

k-limiting object names

typedef struct CELL {
    int number;
    struct CELL *next;
} cell;

cell *head;

int FindMax(cell *cursor)
{
    int local_max;
    if (cursor == NULL)
        return 0;
    local_max = cursor->number;
    /* each iteration reaches one more dereference of next */
    for ( ; cursor->next != NULL; cursor = cursor->next) {
        if (cursor->number > local_max)
            local_max = cursor->number;
    }
    return local_max;
}

• In the code above, the number of object names (shadows) is unbounded.
• Landi and Ryder limit the number of shadows to k.
• All object names with more than k dereferences are represented by the same name (shadow).
(Landi/Ryder, PLDI 92)

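A rough sketch of the truncation step, assuming an object name is a base variable followed by a list of field dereferences (for example head, next, next, number). The class `KLimitedObjectName`, the method `limit`, and the "->*" marker for the shared shadow are illustrative choices, not Landi and Ryder's notation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only.
final class KLimitedObjectName {
    // Keeps at most k dereferences; anything longer collapses into one
    // shared name that ends in the "->*" marker.
    static String limit(String base, List<String> derefs, int k) {
        StringBuilder name = new StringBuilder(base);
        int kept = Math.min(k, derefs.size());
        for (int i = 0; i < kept; i++) {
            name.append("->").append(derefs.get(i));
        }
        if (derefs.size() > k) {
            name.append("->*"); // stands for every longer name with this prefix
        }
        return name.toString();
    }

    public static void main(String[] args) {
        System.out.println(limit("head", Arrays.asList("next"), 2));
        // prints head->next
        System.out.println(limit("head", Arrays.asList("next", "next", "next", "number"), 2));
        // prints head->next->next->*
    }
}
```

For the list traversal in FindMax, every name that follows next more than k times maps to the same shadow, so the set of names stays finite.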
Heap Cloning
• For each procedure, create a graphical representation of the heap objects that are manipulated by the procedure (allocated, assigned to, referenced, etc.).
• Traverse the call graph, cloning the graph of the callee into each call site.
(Lattner/Lenharth/Adve, PLDI 07)

Dealing with Cloning Complexity
• Use unification-based analysis so that many clones are merged together.
• Do not clone unreachable objects from a callee into a caller.
  – For example, objects whose scope is entirely within the callee are not cloned.
• Merge (instead of cloning) global variables.
(Lattner, Ph.D. 05)

Recursive functions
• Abandon context-sensitivity in strongly connected components (SCCs) of the call graph.
  – Merge the graphs for all functions in the SCC.
(Lattner, Ph.D. 05)

Heap Specialization
• Heap specialization: clone heap objects along call chains (paths in the call graph).
• Nystrom et al. propose that only heap objects that escape the callee need to be cloned.
• They observe, empirically, that if the only exposure of an escaped object is through a global variable, there is no benefit to cloning.
• Their analysis is flow-insensitive, Andersen-style.
(Nystrom/Kim/Hwu, PASTE 04)

Demand-driven Pointer Analysis
• Aimed at JIT compilers: only the portions of the program relevant to a query are analyzed.
• Reaches 90% of the precision of a field-sensitive Andersen’s analysis within 2 ms per query.
(Sridharan/Gopan/Shan/Bodik, OOPSLA 05)

Incremental/Compositional/Partial Pointer/Escape Analysis for Java
• Generate parameterized analysis results for each method.
  – Recursive methods use a fixed-point iterative algorithm.
  – Analyze each method independently of its callers.
  – Trades precision for time: can analyze a method without analyzing all the methods that it invokes.
  – Function summaries are flow-insensitive.
  – Based on “points-to escape graphs”:
    • (inside nodes/edges, outside nodes/edges, return value)
• Slow. Complexity of O(N^10), where N is the number of instructions in the scope of the analysis:
  – compress is 3 times slower to compile with the analysis.
(Vivien/Rinard, PLDI 01; Whaley/Rinard, OOPSLA 99; Salcianu, Ph.D. MIT 01)

On-demand Incremental Region-based Shape Analysis for C
• Main idea: break down the abstraction into smaller components and analyze each component separately.
  – Use a “cheap” flow-insensitive and context-sensitive pointer analysis to partition the memory into disjoint regions. Each node in the points-to graph represents a “memory region”.
    • Regions must be disjoint.
• Interprocedural propagation: uses a pair of input/output transfer functions for each function.
• On-demand: Can limit interprocedural propagation to a set of regions.
• Incremental: Can reuse results from previously analyzed regions.
• Analyzes OpenSSH (18.6 KLOC) in 45 seconds.
(Hackett/Rugina, POPL 05)

Refinement-Based On-Demand CS points-to analysis for Java
• Based on Context-Free-Language (CFL)-reachability.
  – The context-free language L represents paths in the program that might cause a variable to point to an abstract location.
  – The balanced-parentheses property filters out unrealizable paths:
    • call/return pairs must match
    • In Java, stores/loads to fields must also match (the same is not true for C).
  – Significant increase in precision relative to context-insensitive analysis.
  – 13 minutes to analyze polyglot.
(Sridharan/Bodik, POPL 05)

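The call/return matching can be illustrated with a stack check over the edge labels of one path. This is only a sketch of the balanced-parentheses filter, not the refinement algorithm itself. The label encoding ("call:7", "ret:7"), the class `RealizablePath`, and the decision to accept a return when no call is open (a path that starts inside a callee) are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical helper for illustration only.
final class RealizablePath {
    // Labels: "call:<site>" opens a parenthesis, "ret:<site>" closes one,
    // any other label is an intraprocedural edge and is ignored.
    static boolean isRealizable(List<String> labels) {
        Deque<String> openCalls = new ArrayDeque<>();
        for (String label : labels) {
            if (label.startsWith("call:")) {
                openCalls.push(label.substring("call:".length()));
            } else if (label.startsWith("ret:")) {
                String site = label.substring("ret:".length());
                // Assumption: a return with no open call is accepted.
                if (!openCalls.isEmpty() && !openCalls.pop().equals(site)) {
                    return false; // returns to a site other than the caller
                }
            }
        }
        return true; // calls left open are allowed
    }

    public static void main(String[] args) {
        // Enter get() from line 7 and return to line 7: realizable.
        System.out.println(isRealizable(Arrays.asList("call:7", "new", "ret:7")));  // prints true
        // Enter get() from line 7 but return to line 12: filtered out.
        System.out.println(isRealizable(Arrays.asList("call:7", "new", "ret:12"))); // prints false
    }
}
```

Paths of the second kind are the ones that let a context-insensitive analysis mix the objects returned to lines 7 and 12.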
References
• M. Sharir and A. Pnueli, “Two approaches to interprocedural data flow analysis,” in Program Flow Analysis: Theory and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1981, pp. 189-234.
• S. Horwitz, T. Reps, D. Binkley, “Interprocedural slicing using dependence graphs,” TOPLAS 12(1), 1990, pp. 26-60.
• W. Landi, B. G. Ryder, “A safe approximate algorithm for interprocedural aliasing,” PLDI 1992, pp. 235-248.
• M. Emami, R. Ghiya, L. J. Hendren, “Context-sensitive interprocedural points-to analysis in the presence of function pointers,” PLDI 1994, pp. 242-246.
• J.-D. Choi, R. Cytron, J. Ferrante, “On the Efficient Engineering of Ambitious Program Analysis,” IEEE Trans. on Software Engineering, Vol. 20, No. 2, Feb. 1994, pp. 105-114.
  – Describes Factored SSA (FSSA).
• E. Ruf, “Context-insensitive alias analysis reconsidered,” PLDI 1995, pp. 13-22.
• R. P. Wilson, M. S. Lam, “Efficient context-sensitive pointer analysis for C programs,” PLDI 1995, pp. 1-12.
  – Partial Transfer Functions are not practical.

References (continued)
• J. Whaley, M. Rinard, “Compositional Pointer and Escape Analysis for Java Programs,” OOPSLA 1999, pp. 187-206.
• M. Fähndrich, J. Rehof, M. Das, “Scalable context-sensitive flow analysis using instantiation constraints,” PLDI 2000, pp. 253-263.
  – Based exclusively on types. Unification-based at the intra-procedural level.
• F. Vivien, M. Rinard, “Incrementalized Pointer and Escape Analysis,” PLDI 2001, pp. 35-46.
• M. Berndl, O. Lhoták, F. Qian, L. Hendren, N. Umanee, “Points-to analysis using BDDs,” PLDI 2003, pp. 103-114.
• J. Zhu, S. Calman, “Symbolic pointer analysis revisited,” PLDI 2004, pp. 145-157.
  – Treats call-graph cycles context-insensitively; loses precision in Java.
• J. Whaley, M. S. Lam, “Cloning-based context-sensitive pointer alias analysis using binary decision diagrams,” PLDI 2004, pp. 131-144.
  – Treats call-graph cycles context-insensitively; loses precision in Java (lots of contexts, with no practical way to use them).

References (continued)
• E. M. Nystrom, H.-S. Kim, W. W. Hwu, “Importance of Heap Specialization in Pointer Analysis,” PASTE 2004, pp. 43-48.
• D. Liang, M. Pennings, M. J. Harrold, “Evaluating the Impact of Context-Sensitivity on Andersen’s Algorithm for Java Programs,” PASTE 2005, pp. 6-12.
• B. Hackett, R. Rugina, “Region-Based Shape Analysis with Tracked Locations,” POPL 2005, pp. 310-323.
• M. Sridharan, D. Gopan, L. Shan, R. Bodik, “Demand-Driven Points-to Analysis for Java,” OOPSLA 2005, pp. 59-76.
• O. Lhoták, L. Hendren, “Context-Sensitive Points-to Analysis: Is It Worth It?,” Compiler Construction 2006, pp. 47-64.
• C. Lattner, A. Lenharth, V. Adve, “Making context-sensitive points-to analysis with heap cloning practical for the real world,” PLDI 2007, pp. 278-289.