A Context-Sensitive Pointer Analysis Phase in Open64

A Context-Sensitive Pointer Analysis Phase in Open64 Compiler
Tianwei Sheng, Wenguang Chen, Weimin Zheng
Tsinghua University

Outline
• Motivation
• Definitions & State of the Art Summary
• High Level Design
• Context-Sensitive Pointer Analysis
• Field-Sensitive Pointer Analysis
• Global Variable Graph
• Preliminary Experiment Results
• Future Work
• Conclusion

Motivation
• Pointer analysis is very important for compilers and program analysis tools
• Many inter-procedural analysis phases need alias information:
  - Lock-set based static data race detection
  - Mod-Ref analysis
  - Data layout
• The more precise the information clients get, the more accurate the results they can produce
• Open64 is still based on the Steensgaard '95 algorithm

Examples

(a)
    void foo(int index, int *p, int *q) {
        int i;
        for (i = 0; i < index; i++) {
            *(p + i) = *q;
        }
    }

(b)
    #include <pthread.h>

    pthread_mutex_t lock1;
    pthread_mutex_t lock2;
    int x, y;

    void foo(int *p, int *q) {
        pthread_mutex_lock(&lock1);
        *p = 1;                        /* written while holding lock1 */
        pthread_mutex_unlock(&lock1);
        *q = 2;                        /* written without holding any lock */
    }

    void bar() {
        int *p = &x;
        int *q = &y;
        foo(p, q);
    }

Definitions & State of the Art Summary

Definitions:
  - Context-Sensitive: considers the calling context of each call site
  - Field-Sensitive: distinguishes the individual members of structure variables
  - Equality/Subset: methods to model pointer assignments, also known as the Steensgaard and Andersen methods
  - Heap cloning: able to distinguish memory locations on the heap

State of the Art:

Algorithm         Eq/Sub/Reach  CS CG  CS heap  FS/FI  Language  Shown to Scale  Notes
Lattner           Eq            X      X        FS     C/C++     X               PLDI '07
Steensgaard (CS)  Eq            X      X        FS     C/C++     X               Microsoft's latest progress
Sridharan                                                                        PLDI '06
Ben                                                                              PLDI '07
Zheng                                                                            POPL '08
Zhu/Whaley                                                                       PLDI '04
Fahndrich                                                                        PLDI '00
Wilson                                                                           PLDI '95
Liang                                                                            SAS '01
Guyer                                                                            SAS '03

Remaining cell values, in slide order (row and column assignment unclear): X X FS FS FI FS Java C/C++ C/C++ X X X Reach Sub Eq Sub X X X FS FS

High Level Design (1)
Figures: (a) IPA framework; (b) existing alias analysis framework
Problems:
  - IPA phases cannot make use of any pointer analysis results
  - The existing algorithm is context-insensitive and field-insensitive

High Level Design (2)
• The new phase runs after Call Graph Construction and DFE (Dead Function Elimination)
  - Use IPA_Node to read all WHIRL trees
  - DFE can use simple address-taken techniques to eliminate dead functions
• Major components
  - Local Phase: a new local phase reads all WHIRL trees and creates the local alias graph
  - Bottom-Up Phase: the major context-sensitive phase; uses cloning to inline the pointer information of the callee into the caller
  - Top-Down Phase: incorporates the caller's information into the callee and eliminates the incomplete information in the callee due to formal parameters

High Level Design (3)
Figure (phase ordering): Call Graph Construction and DFE; New Context-Sensitive Pointer Analysis Phase; Other Analysis Phases (IPA); IPO; Intra-Procedural Phase
The inter-procedural pointer information will be passed into the intra-procedural phase.

Context-Sensitive Pointer Analysis

    #include <stdio.h>

    int foo(int **p, int *q) {
        *p = q;
        *q = 1;
        return 0;
    }

    int main() {
        int b, y;
        int *a, *t;
        foo(&a, &b);
        foo(&t, &y);
        printf("b = %d\n", b);
        printf("y = %d\n", y);
        return 0;
    }

Context-sensitive:
    p1 = &a; p2 = &t;
    q1 = &b; q2 = &y;
    *p1 = q1; *p2 = q2;
    *q1 = 1; *q2 = 1;
    Solution: a->{b}, t->{y}

Context-insensitive:
    p = &a; p = &t;
    q = &b; q = &y;
    *p = q;
    *q = 1;
    Solution: a->{b, y}, t->{b, y}
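The difference between the two solutions can be reproduced with a small sketch. It models only the store `*p = q` in foo and represents points-to sets as bit masks. The names `CallSite`, `apply_store`, `analyze_context_sensitive` and `analyze_context_insensitive` are illustrative assumptions and are not taken from Open64.

```c
#include <assert.h>
#include <stddef.h>

/* Variables of the example: a, t are pointers; b, y are ints. */
enum { VAR_A, VAR_T, VAR_B, VAR_Y, NVARS };
#define BIT(v) (1u << (v))

/* One call to foo(p, q): what the actuals may point to (hypothetical type). */
typedef struct {
    unsigned p_targets;  /* locations p may point to */
    unsigned q_targets;  /* locations q may point to */
} CallSite;

/* Effect of "*p = q": every location p may point to
   may now point to everything q may point to. */
static void apply_store(unsigned pts[NVARS], unsigned p_targets, unsigned q_targets) {
    for (int v = 0; v < NVARS; v++)
        if (p_targets & BIT(v))
            pts[v] |= q_targets;
}

/* Context-sensitive: foo's body is analyzed once per call site (cloning). */
void analyze_context_sensitive(const CallSite *sites, size_t n, unsigned pts[NVARS]) {
    assert(n == 0 || sites != NULL);
    for (size_t i = 0; i < n; i++)
        apply_store(pts, sites[i].p_targets, sites[i].q_targets);
}

/* Context-insensitive: the actuals of all call sites are merged first. */
void analyze_context_insensitive(const CallSite *sites, size_t n, unsigned pts[NVARS]) {
    assert(n == 0 || sites != NULL);
    unsigned p = 0, q = 0;
    for (size_t i = 0; i < n; i++) {
        p |= sites[i].p_targets;
        q |= sites[i].q_targets;
    }
    apply_store(pts, p, q);
}
```

With the two call sites foo(&a, &b) and foo(&t, &y), the first function yields a->{b}, t->{y}, and the second yields a->{b, y}, t->{b, y}. The `pts` array must be zero-initialized by the caller.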

Context-Sensitive (continued)
• We use the cloning method (Lattner, PLDI '07) to achieve context-sensitivity
• Basic constructs:
  - Alias Node: denotes the memory location of a variable
  - Alias Rep: unifies several Alias Nodes into one representative (unification method)
  - Alias Graph: the set of points-to relations, where the edges denote the points-to relations
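The three constructs map naturally onto a union-find structure. The following is a minimal sketch of unification in the Steensgaard style, assuming a fixed-size node table; `AliasGraph` and the `ag_*` functions are hypothetical names, not the Open64 data structures.

```c
#include <assert.h>

#define MAX_NODES 16
#define NO_TARGET (-1)

/* Alias Graph: nodes are Alias Nodes; the root of each union-find
   class plays the role of the Alias Rep. */
typedef struct {
    int parent[MAX_NODES];   /* union-find parent link */
    int pointee[MAX_NODES];  /* for a rep: node it points to, or NO_TARGET */
    int count;
} AliasGraph;

void ag_init(AliasGraph *g) { g->count = 0; }

/* Create an Alias Node for one memory location. */
int ag_new_node(AliasGraph *g) {
    assert(g->count < MAX_NODES);
    int n = g->count++;
    g->parent[n] = n;
    g->pointee[n] = NO_TARGET;
    return n;
}

/* Find the Alias Rep of a node (with path halving). */
int ag_rep(AliasGraph *g, int n) {
    while (g->parent[n] != n) {
        g->parent[n] = g->parent[g->parent[n]];
        n = g->parent[n];
    }
    return n;
}

/* Unify two nodes into one Alias Rep, and recursively their pointees. */
int ag_unify(AliasGraph *g, int a, int b) {
    a = ag_rep(g, a);
    b = ag_rep(g, b);
    if (a == b) return a;
    int pa = g->pointee[a], pb = g->pointee[b];
    g->parent[b] = a;
    if (pa == NO_TARGET) g->pointee[a] = pb;
    else if (pb != NO_TARGET) ag_unify(g, pa, pb);
    return ag_rep(g, a);
}

/* Record "p = &x": p's class gets a single outgoing points-to edge. */
void ag_address_of(AliasGraph *g, int p, int x) {
    int rp = ag_rep(g, p);
    if (g->pointee[rp] == NO_TARGET) g->pointee[rp] = ag_rep(g, x);
    else ag_unify(g, g->pointee[rp], x);
}

/* Alias Rep that n points to, or NO_TARGET. */
int ag_pointee(AliasGraph *g, int n) {
    int t = g->pointee[ag_rep(g, n)];
    return t == NO_TARGET ? NO_TARGET : ag_rep(g, t);
}
```

After `p = &x; p = &y;` the nodes for x and y share one Alias Rep, which is the precision cost of the equality-based approach.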

Context-Sensitive (continued)

    typedef struct {
        int a;
        int *s;
    } STR;

    int x;
    STR str;
    extern STR *bar(STR *);

    STR *foo(int **q) {
        STR *str_p;
        int *p;
        p = &x;
        str_p = bar(&str);
        str_p->s = p;
        *q = p;
        return str_p;
    }

Context-Sensitive: detailed algorithm
• Local Phase
  - Traverse every PU, visit each statement in the PU, and create the local alias graph (call site, return value, and formal parameter information). This is the only phase that inspects the IR.
• Bottom-Up Phase
  - Inline the callee's alias graph into the caller. Only reachable nodes are copied into the caller, since the scope of the callee's local variables does not include the caller.
  - Use an SCC (Strongly Connected Component) detection algorithm to visit all functions; for recursive functions that form an SCC, merge all functions inside the SCC into a single alias graph.
  - Treat function pointers as normal pointers, resolve them during the SCC visit, and update the caller graph on the fly.
  - For global variables, create the global variable graph to hold all global variables in the program, and update the local graph and the global graph interactively.
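The SCC-based visiting order of the bottom-up phase can be sketched with Tarjan's algorithm, which is one common choice; the slides do not say which SCC algorithm is used. `CallGraph`, `Tarjan`, and `bottom_up_sccs` are hypothetical names, and the call graph is a small adjacency matrix for brevity.

```c
#include <assert.h>

#define MAX_FUNCS 16

typedef struct {
    int n;                             /* number of functions */
    int calls[MAX_FUNCS][MAX_FUNCS];   /* calls[f][g] != 0 if f calls g */
} CallGraph;

typedef struct {
    const CallGraph *cg;
    int index[MAX_FUNCS], low[MAX_FUNCS], on_stack[MAX_FUNCS];
    int stack[MAX_FUNCS], sp, next_index;
    int *scc_id;                       /* output: SCC number per function */
    int scc_count;
} Tarjan;

static void visit(Tarjan *t, int f) {
    t->index[f] = t->low[f] = t->next_index++;
    t->stack[t->sp++] = f;
    t->on_stack[f] = 1;
    for (int g = 0; g < t->cg->n; g++) {
        if (!t->cg->calls[f][g]) continue;
        if (t->index[g] < 0) {                 /* callee not visited yet */
            visit(t, g);
            if (t->low[g] < t->low[f]) t->low[f] = t->low[g];
        } else if (t->on_stack[g] && t->index[g] < t->low[f]) {
            t->low[f] = t->index[g];           /* back edge: recursion */
        }
    }
    if (t->low[f] == t->index[f]) {            /* f is the root of an SCC */
        int g;
        do {
            g = t->stack[--t->sp];
            t->on_stack[g] = 0;
            t->scc_id[g] = t->scc_count;
        } while (g != f);
        t->scc_count++;
    }
}

/* Numbers the SCCs so that callees get smaller ids than their callers.
   Processing SCC 0, 1, 2, ... is therefore a bottom-up traversal, and
   functions sharing an id would be merged into a single alias graph. */
int bottom_up_sccs(const CallGraph *cg, int scc_id[MAX_FUNCS]) {
    assert(cg->n <= MAX_FUNCS);
    Tarjan t = { .cg = cg, .scc_id = scc_id };
    for (int f = 0; f < cg->n; f++) t.index[f] = -1;
    for (int f = 0; f < cg->n; f++)
        if (t.index[f] < 0) visit(&t, f);
    return t.scc_count;
}
```

Tarjan's algorithm completes an SCC only after all SCCs reachable from it, which gives the callee-before-caller order without a separate topological sort.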

Context-Sensitive (continued)

    #include <stdio.h>

    int foo(int **p, int *q) {
        *p = q;
        *q = 1;
        return 0;
    }

    int main() {
        int b, y;
        int *a, *t;
        foo(&a, &b);
        foo(&t, &y);
        printf("b = %d\n", b);
        printf("y = %d\n", y);
        return 0;
    }

Figures: (a) local graph for main; (b) local graph for foo; (c) BU graph for main

Why Top-Down Phase?
1. After the Bottom-Up phase, foo's alias graph is copied into bar.
2. Without inlining, we still do not know the alias information for foo's formal parameters, p and q.
3. The Top-Down phase copies a and x into foo and makes p and q point to them, respectively.

Field-Sensitive Pointer Analysis
• An alias node for each member of structure variables; collapse if consistent access pattern
• Field-sensitivity can be disabled/enabled; field number threshold
• Problem: in C/C++, we can take the address of a field member. We solve this problem by matching the type information of the ADD's parent WHIRL node.

    struct List {
        struct List *forward;
        struct List *backward;
    };
    struct Hosp {
        int c;
        struct List waiting;
        struct List assess;
    };
    struct Hosp *hosp_p = &hosp;
    addList(&hosp_p->assess);
    addList_2(&(hosp_p->assess.forward));

    LOC 1 26 addList(&hosp_p->assess);
       U8U8LDID 0 <2,1,hosp_p> T<53,anon_ptr.,8>
       U8INTCONST 24 (0x18)
      U8ADD
     U8PARM 2 T<46,anon_ptr.,8> # by_value
    VCALL 126 <1,41,addList> # flags 0x7e
    LOC 1 27 addList_2(&(hosp_p->assess.forward));
       U8U8LDID 0 <2,1,hosp_p> T<53,anon_ptr.,8>
       U8INTCONST 24 (0x18)
      U8ADD
     U8PARM 2 T<50,anon_ptr.,8> # by_value
    VCALL 126 <1,43,addList_2> # flags 0x7e
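Both calls in the WHIRL dump add the same constant (24) to hosp_p, so the offset alone cannot tell `assess` from `assess.forward`; only the PARM types (T<46> versus T<50>) differ. The sketch below illustrates the idea of disambiguating by type, using the pointee size as a stand-in for the WHIRL type information. The field table, `resolve_field`, and the use of sizes are assumptions made for illustration, not the Open64 implementation.

```c
#include <assert.h>
#include <stddef.h>

struct List { struct List *forward; struct List *backward; };
struct Hosp { int c; struct List waiting; struct List assess; };

/* Flattened field list of struct Hosp, one entry per candidate alias node. */
enum { F_C, F_WAITING, F_WAITING_FORWARD, F_WAITING_BACKWARD,
       F_ASSESS, F_ASSESS_FORWARD, F_ASSESS_BACKWARD, F_COUNT };

typedef struct { size_t offset; size_t size; } FieldInfo;

/* Offset of a struct List member nested inside struct Hosp. */
#define NESTED(outer, inner) \
    (offsetof(struct Hosp, outer) + offsetof(struct List, inner))

static const FieldInfo hosp_fields[F_COUNT] = {
    [F_C]                = { offsetof(struct Hosp, c),       sizeof(int) },
    [F_WAITING]          = { offsetof(struct Hosp, waiting), sizeof(struct List) },
    [F_WAITING_FORWARD]  = { NESTED(waiting, forward),       sizeof(struct List *) },
    [F_WAITING_BACKWARD] = { NESTED(waiting, backward),      sizeof(struct List *) },
    [F_ASSESS]           = { offsetof(struct Hosp, assess),  sizeof(struct List) },
    [F_ASSESS_FORWARD]   = { NESTED(assess, forward),        sizeof(struct List *) },
    [F_ASSESS_BACKWARD]  = { NESTED(assess, backward),       sizeof(struct List *) },
};

/* Pick the field addressed by "base + offset" when the resulting pointer
   has a pointee of the given size. Returns a field id, or -1 if none match. */
int resolve_field(size_t offset, size_t pointee_size) {
    assert(pointee_size > 0);
    for (int f = 0; f < F_COUNT; f++)
        if (hosp_fields[f].offset == offset && hosp_fields[f].size == pointee_size)
            return f;
    return -1;
}
```

For example, `resolve_field(offsetof(struct Hosp, assess), sizeof(struct List))` selects `F_ASSESS`, while the same offset with `sizeof(struct List *)` selects `F_ASSESS_FORWARD`. A real implementation would compare type identities rather than sizes, since distinct types can share a size.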

Global Variable Graph (1)
• Overcomes the bottleneck of copying all global variables during the bottom-up and top-down phases
• Holds all information about global variables
• The local graph and the global variable graph are updated interactively:
  - During the local phase, when visiting a PU, create a global variable alias node in the global graph if the variable is referenced in this PU
  - During the bottom-up phase, before inlining any callee, first update the local graph according to the global graph
  - During the bottom-up phase, after inlining all callees, update the global graph

Global Variable Graph (2)
Before inlining, what does p point to?
Figure labels: s->p; p->x; s->p->x, q->y
Steps:
1. Local phase: s->p, p->x, q->y; in the global graph: s->p->x, q->y
2. Bottom-up phase: when inlining bar's information into bar_1 without the global graph, we get p->y. With the global graph, we know that p also points to x, so we get p->{x, y}; finally, update the global graph.
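The two synchronization steps of the bottom-up phase can be sketched with bit-mask points-to sets. The scenario follows the steps above: bar_1's local graph only knows p->y, while the global graph already records p->x. `Graph`, `refresh_from_global`, and `publish_to_global` are hypothetical names, and the bit-mask representation is an assumption chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Global variables of the example. */
enum { G_S, G_P, G_Q, G_X, G_Y, NGLOBALS };
#define BIT(v) (1u << (v))

typedef struct {
    unsigned pts[NGLOBALS];  /* pts[v]: set of globals v may point to */
    unsigned referenced;     /* globals referenced by this function (local graphs only) */
} Graph;

/* Before inlining any callee: pull what the global graph knows about
   the globals this function references into its local graph. */
void refresh_from_global(Graph *local, const Graph *global) {
    assert(local != NULL && global != NULL);
    for (int v = 0; v < NGLOBALS; v++)
        if (local->referenced & BIT(v))
            local->pts[v] |= global->pts[v];
}

/* After inlining all callees: publish the local facts about
   referenced globals back into the global graph. */
void publish_to_global(const Graph *local, Graph *global) {
    assert(local != NULL && global != NULL);
    for (int v = 0; v < NGLOBALS; v++)
        if (local->referenced & BIT(v))
            global->pts[v] |= local->pts[v];
}
```

Only globals referenced by the function are touched, so the cost of each step depends on the function's footprint rather than on the total number of globals in the program.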

Experimental Results
Figure: alias node number statistics
The preliminary results show that:
  - The bottom-up phase does not increase the number of alias nodes very much; this is consistent with the fact that we only do reachable cloning for non-local variables
  - The original field-sensitive analysis incurs very large overhead if the benchmarks contain a large number of structure variables, such as 177.mesa and mysql

Future Work
• Design the alias query interface
• Pass the alias information into the intra-procedural phase efficiently (without storing the whole alias graph)
• Design and implement client optimizations, such as mod-ref and race detection
• Compare with other algorithms in both scalability and precision
• Combine with subset methods to further improve the precision

Conclusion
• We designed and implemented a new cloning-based context-sensitive pointer analysis phase in the Open64 compiler
• The algorithm is based on the DSA algorithm in LLVM; we made several tradeoffs due to the IR and framework of Open64
• For all the benchmarks we studied, the cloning phase does not increase the number of alias nodes very much

Thanks very much !
