Constructing Accurate Application Call Graphs For Java To Model Library Callbacks

Weilei Zhang, Barbara Ryder
Department of Computer Science, Rutgers University

Motivation

● Call graphs (CG) are widely used in software engineering and compiler optimisation
● Reference analysis is used to generate the CG for object-oriented programs
● A precise reference analysis requires a whole-program analysis, e.g. n-CFA or object-sensitive analysis
● The constructed CG includes both application and library methods as its nodes (whole-program CG)
● In many software engineering applications, the calling relationships among library methods are NOT as interesting or important

Prolangs Lab, Rutgers University

Application Call Graph

● Application CG represents calling relationships between application methods
● Two kinds of edges in application CG:
    direct: application method x may call y directly
    callback: x may call a library method that may eventually call back y through method calls in the library
      x → l1 → l2 → ... → ln → y   (li: a library method)
● Comparison: whole-program CG vs. application CG
    Whole-program CG is inaccurate in representing callback relationships between application methods
    Whole-program CG may contain too much information
      ● Library becomes larger and larger in modern software
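The two edge kinds can be illustrated with a minimal sketch that projects a whole-program CG onto its application methods by searching through library nodes. This is a naive projection written for illustration, not the algorithm of the talk. The method names (`app.x`, `lib.l1`, ...), the `lib.` prefix convention, and the class `AppCallGraphSketch` are assumptions made up for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Toy projection of a whole-program call graph onto application methods. */
final class AppCallGraphSketch {

    /** Hypothetical whole-program CG: x -> l1 -> l2 -> y, plus a direct call x -> z. */
    static Map<String, Set<String>> exampleGraph() {
        Map<String, Set<String>> g = new LinkedHashMap<>();
        g.put("app.x", Set.of("lib.l1", "app.z"));
        g.put("lib.l1", Set.of("lib.l2"));
        g.put("lib.l2", Set.of("app.y"));
        g.put("app.y", Set.of());
        g.put("app.z", Set.of());
        return g;
    }

    /**
     * Emits "x -> y [direct]" when application method x calls application method y,
     * and "x -> y [callback]" when y is reached from x through library methods only.
     */
    static Set<String> project(Map<String, Set<String>> cg, Predicate<String> isLibrary) {
        Set<String> edges = new TreeSet<>();
        for (String x : cg.keySet()) {
            if (isLibrary.test(x)) {
                continue; // only application methods are sources of edges
            }
            Set<String> seen = new HashSet<>();
            Deque<String> work = new ArrayDeque<>();
            for (String callee : cg.getOrDefault(x, Set.of())) {
                if (!isLibrary.test(callee)) {
                    edges.add(x + " -> " + callee + " [direct]");
                } else if (seen.add(callee)) {
                    work.push(callee);
                }
            }
            // Follow library-to-library calls until an application method is reached.
            while (!work.isEmpty()) {
                String lib = work.pop();
                for (String next : cg.getOrDefault(lib, Set.of())) {
                    if (!isLibrary.test(next)) {
                        edges.add(x + " -> " + next + " [callback]");
                    } else if (seen.add(next)) {
                        work.push(next);
                    }
                }
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        System.out.println(project(exampleGraph(), m -> m.startsWith("lib.")));
    }
}
```

Because this projection follows every library path present in the whole-program CG, it inherits that graph's imprecision: any application method reachable from a shared library node is reported as a callback target.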

“Library” in Multi-tier Software Architecture

[Figure: layered stack with the labels Application, Library, Java EE Platform, Java SE Platform]

● Middleware can be overwhelmingly large in modern software compared to the application program under investigation. Application CG is more manageable than whole-program CG.
● Where to draw the line for the Java library?
    Narrow: Java SE (EE) libraries, as used in the experimental study of this paper
    Broad: in a multi-tier software architecture, the subprograms in lower tiers can be considered as libraries to the upper tiers

Whole-program CG Is Inaccurate in Representing Callback Relationship

    class App {
        StringBuffer local;
        StringBuffer appendA() { A a = new A(); return (local.append(a)); }
        StringBuffer appendB() { B b = new B(); return (local.append(b)); }
    }
    class A { String toString() {...} }
    class B { String toString() {...} }

Library calls: local.append(a) and local.append(b)

Call path for appendA() in the whole-program CG (the two middle nodes are library methods):
    App.appendA() → StringBuffer.append(Object) → String.valueOf(Object) → A.toString()
Corresponding application CG edge:
    App.appendA() → A.toString()  (callback)

Call path for appendB() in the whole-program CG (the two middle nodes are library methods):
    App.appendB() → StringBuffer.append(Object) → String.valueOf(Object) → B.toString()
Corresponding application CG edge:
    App.appendB() → B.toString()  (callback)

Both methods combined in the whole-program CG:
    App.appendA() → StringBuffer.append(Object)
    App.appendB() → StringBuffer.append(Object)
    StringBuffer.append(Object) → String.valueOf(Object)
    String.valueOf(Object) → A.toString()
    String.valueOf(Object) → B.toString()
Because both call paths share the same library nodes, the whole-program CG also contains a path from App.appendA() to B.toString() and one from App.appendB() to A.toString(), neither of which occurs at run time.
Accurate application CG:
    App.appendA() → A.toString()  (callback)
    App.appendB() → B.toString()  (callback)
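The callbacks in this example can be observed at run time with a small, self-contained variant of the slide's program. This is only a sketch: the `LOG` list, the `callbacksOf` helper, the initialised `local` field, and the wrapper class `CallbackDemo` are additions for illustration and are not part of the original slides.

```java
import java.util.ArrayList;
import java.util.List;

/** Runs the slide's example and records which toString() the library calls back. */
final class CallbackDemo {

    /** Illustrative log of application methods invoked from inside the library. */
    static final List<String> LOG = new ArrayList<>();

    static final class A {
        @Override public String toString() { LOG.add("A.toString()"); return "A"; }
    }

    static final class B {
        @Override public String toString() { LOG.add("B.toString()"); return "B"; }
    }

    static final class App {
        private final StringBuffer local = new StringBuffer();

        // StringBuffer.append(Object) converts its argument via String.valueOf(Object),
        // which calls the argument's toString(): a library callback into the application.
        StringBuffer appendA() { A a = new A(); return local.append(a); }
        StringBuffer appendB() { B b = new B(); return local.append(b); }
    }

    /** Runs one application call and returns the callbacks observed during it. */
    static List<String> callbacksOf(Runnable call) {
        LOG.clear();
        call.run();
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        App app = new App();
        System.out.println("appendA calls back: " + callbacksOf(app::appendA));
        System.out.println("appendB calls back: " + callbacksOf(app::appendB));
    }
}
```

Each call triggers only the `toString()` of the object it passes in, which matches the two callback edges of the accurate application CG and none of the cross edges implied by the whole-program CG.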

Va-DataReachft to Resolve Callbacks Accurately and Automatically

● DataReach (Fu et al., TSE 2005): resolves control-flow (call path) reachability via data reachability analysis
    ● Call paths requiring receiver objects of a specific type can be shown to be infeasible if objects of those types are not reachable through dereferences at the relevant call site.
● Va-DataReachft (SCAM 2006): a call-site-specific reference analysis and escape analysis are performed at the same time for each library call to derive callbacks automatically

    App.appendA(): append(a) → StringBuffer.append(Object) → String.valueOf(Object) → Object.toString()
    App.appendB(): append(b) → StringBuffer.append(Object) → String.valueOf(Object) → Object.toString()
    Resolved callbacks:
        App.appendA() → A.toString()  (callback)
        App.appendB() → B.toString()  (callback)

SCAM’06
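The pruning idea behind data reachability can be sketched as a filter over candidate callback targets. This is a deliberately simplified illustration, not the Va-DataReachft algorithm: the set of types reachable at each call site is supplied by hand here, whereas the real analysis derives it with a call-site-specific reference and escape analysis. The class `DataReachSketch` and its method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy filter: keep a callback target only if its receiver type is reachable at the call site. */
final class DataReachSketch {

    /** Candidate targets taken from the whole-program CG: method -> required receiver type. */
    static Map<String, String> candidates() {
        Map<String, String> c = new LinkedHashMap<>();
        c.put("A.toString()", "A");
        c.put("B.toString()", "B");
        return c;
    }

    /**
     * reachableTypes is an assumed input: the types of objects reachable through
     * dereferences from the arguments of one library call made by the caller.
     */
    static Set<String> feasibleCallbacks(String caller,
                                         Set<String> reachableTypes,
                                         Map<String, String> candidates) {
        Set<String> edges = new TreeSet<>();
        for (Map.Entry<String, String> e : candidates.entrySet()) {
            // A call path needing a receiver of an unreachable type is infeasible.
            if (reachableTypes.contains(e.getValue())) {
                edges.add(caller + " -> " + e.getKey() + " [callback]");
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        // append(a) passes only an A object; append(b) passes only a B object.
        System.out.println(feasibleCallbacks("App.appendA()", Set.of("A"), candidates()));
        System.out.println(feasibleCallbacks("App.appendB()", Set.of("B"), candidates()));
    }
}
```

With the hand-supplied type sets, each caller keeps exactly one callback edge, so the two spurious cross edges of the whole-program CG are dropped.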

Contributions

● A new variant of the data reachability algorithm is proposed (Va-DataReach) and fine-tuned specifically to resolve library callbacks accurately (Va-DataReachft)
● The algorithm is implemented and evaluated experimentally. The study shows that the algorithm is practical and eliminates a large number of spurious callback edges
    For all SPECjvm98 benchmarks, on average, the number of callback edges is reduced by 74.97% relative to the whole-program call graph (generated by a points-to analysis)
    The algorithm finishes in reasonable time (the longest is 781 seconds, for javac, which has 2432 library calls)

Thanks!