The LLVM Compiler Framework and Infrastructure Program Analysis

Original slides by David Koes (CMU)
Substantial portions courtesy of Chris Lattner and Vikram Adve

LLVM Compiler System
• The LLVM Compiler Infrastructure
  - Provides reusable components for building compilers
  - Reduces the time/cost to build a new compiler
  - Build static compilers, JITs, trace-based optimizers, ...
• The LLVM Compiler Framework
  - End-to-end compilers using the LLVM infrastructure
  - C and C++ gcc frontend
  - Backends for C, X86, Sparc, PowerPC, Alpha, Arm, Thumb, IA-64, ...

Three primary LLVM components
• The LLVM Virtual Instruction Set
  - The common language- and target-independent IR
  - Internal (IR) and external (persistent) representation
• A collection of well-integrated libraries
  - Analyses, optimizations, code generators, JIT compiler, garbage collection support, profiling, ...
• A collection of tools built from the libraries
  - Assemblers, automatic debugger, linker, code generator, compiler driver, modular optimizer, ...

Tutorial Overview
• Introduction to the running example
• LLVM C/C++ Compiler Overview
  - High-level view of an example LLVM compiler
• The LLVM Virtual Instruction Set
  - IR overview and type-system
• LLVM C++ IR and important APIs
  - Basics, PassManager, dataflow, ArgPromotion
• Alias Analysis in LLVM

Running example: arg promotion
Consider use of by-reference parameters:

    int callee(const int &X) { return X+1; }
    int caller() { return callee(4); }

compiles to

    int callee(const int *X) {
      return *X+1;          // memory load
    }
    int caller() {
      int tmp;              // stack object
      tmp = 4;              // memory store
      return callee(&tmp);
    }

We want:

    int callee(int X) { return X+1; }
    int caller() { return callee(4); }

✓ Eliminated load in callee
✓ Eliminated store in caller
✓ Eliminated stack slot for ‘tmp’
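The three forms of the running example can be placed side by side in plain C++. This is only a sketch: the suffixed names (`callee_ref`, `callee_ptr`, `callee_val` and the matching callers) are hypothetical, chosen here so that all variants can live in one file.

```cpp
// By-reference form, as the programmer writes it.
int callee_ref(const int &X) { return X + 1; }
int caller_ref() { return callee_ref(4); }

// What the by-reference form is lowered to: a pointer parameter,
// a stack temporary, a store in the caller and a load in the callee.
int callee_ptr(const int *X) { return *X + 1; }  // memory load
int caller_ptr() {
  int tmp = 4;                                   // stack object + memory store
  return callee_ptr(&tmp);
}

// What argument promotion aims for: the value is passed directly.
int callee_val(int X) { return X + 1; }
int caller_val() { return callee_val(4); }
```

All three callers compute the same result; only the amount of memory traffic differs.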

Why is this hard?
• Requires interprocedural analysis:
  - Must change the prototype of the callee
  - Must update all call sites → we must know all callers
  - What about callers outside the translation unit?
• Requires alias analysis:
  - Reference could alias other pointers in callee
  - Must know that loaded value doesn’t change from function entry to the load
  - Must know the pointer is not being stored through
• Reference might not be to a stack object!
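A small illustration of the aliasing hazard, written as a sketch in plain C++. The functions and the extra `Out` parameter are hypothetical, invented here to make the problem visible: if the reference aliases another pointer that the callee stores through, loading the value early in the caller changes the result.

```cpp
// Callee that reads X after writing through another pointer.
int callee_ref(const int &X, int *Out) {
  *Out = 10;     // may modify the object X refers to
  return X + 1;  // the load happens after the store
}

// Naively "promoted" version: X was loaded by the caller before the call.
int callee_val(int X, int *Out) {
  *Out = 10;
  return X + 1;
}

int call_by_ref_aliased() {
  int V = 4;
  return callee_ref(V, &V);  // X aliases *Out, so the callee sees 10
}

int call_by_val_aliased() {
  int V = 4;
  return callee_val(V, &V);  // value captured before the store
}
```

The two callers disagree, which is why the promotion is only legal once alias analysis has shown that nothing can modify the pointed-to value before the load.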

The LLVM C/C++ Compiler
• From the high level, it is a standard compiler:
  - Compatible with standard makefiles
  - Uses GCC 4.2 C and C++ parser
  - Generates native executables/object files/assembly
• Distinguishing features:
  - Uses LLVM optimizers, not GCC optimizers
  - Pass -emit-llvm to output LLVM IR
    - -S: human readable “assembly”
    - -c: efficient “bitcode” binary

Looking into events at compile-time
[Figure: compile-time pipeline. Labels: C/C++ file; llvm-gcc/llvm-g++ -O -S; IR stages GENERIC, GIMPLE (tree-ssa), LLVM IR, Machine Code IR; outputs: assembly, or LLVM asm with -emit-llvm]
>50 LLVM Analysis & Optimization Passes: Dead Global Elimination, IP Constant Propagation, Dead Argument Elimination, Inlining, Reassociation, LICM, Loop Opts, Memory Promotion, Dead Store Elimination, ADCE, ...

Looking into events at link-time
[Figure: link-time pipeline. Labels: LLVM bitcode .o files; LLVM Linker; llvm-ld; Link-time Optimizer; outputs: .bc file for LLVM JIT, or native executable with -native]
>30 LLVM Analysis & Optimization Passes
Optionally “internalizes”: marks most functions as internal, to improve IPO
Perfect place for argument promotion optimization!

Goals of the compiler design
• Analyze and optimize as early as possible:
  - Compile-time opts reduce modify-rebuild-execute cycle
  - Compile-time optimizations reduce work at link-time (by shrinking the program)
• All IPA/IPO make an open-world assumption
  - Thus, they all work on libraries and at compile-time
  - “Internalize” pass enables “whole program” optzn
• One IR (without lowering) for analysis & optzn
  - Compile-time optzns can be run at link-time too!
  - The same IR is used as input to the JIT
IR design is the key to these goals!

Goals of LLVM IR
• Easy to produce, understand, and define!
• Language- and Target-Independent
  - AST-level IR (e.g. ANDF, UNCOL) is not very feasible
    - Every analysis/xform must know about ‘all’ languages
• One IR for analysis and optimization
  - IR must be able to support aggressive IPO, loop opts, scalar opts, ... high- and low-level optimization!
• Optimize as much as early as possible
  - Can’t postpone everything until link or runtime
  - No lowering in the IR!

LLVM Instruction Set Overview #1
• Low-level and target-independent semantics
  - RISC-like three address code
  - Infinite virtual register set in SSA form
  - Simple, low-level control flow constructs
  - Load/store instructions with typed-pointers
• IR has text, binary, and in-memory forms

C source:

    for (i = 0; i < N; ++i)
      Sum(&A[i], &P);

LLVM IR:

    bb:                      ; preds = %bb, %entry
      %i.1 = phi i32 [ 0, %entry ], [ %i.2, %bb ]
      %AiAddr = getelementptr float* %A, i32 %i.1
      call void @Sum( float* %AiAddr, %pair* %P )
      %i.2 = add i32 %i.1, 1
      %exitcond = icmp eq i32 %i.2, %N
      br i1 %exitcond, label %return, label %bb

LLVM Instruction Set Overview #2
• High-level information exposed in the code
  - Explicit dataflow through SSA form
  - Explicit control-flow graph (even for exceptions)
  - Explicit language-independent type-information
  - Explicit typed pointer arithmetic
    - Preserve array subscript and structure indexing

C source:

    for (i = 0; i < N; ++i)
      Sum(&A[i], &P);

LLVM IR:

    bb:                      ; preds = %bb, %entry
      %i.1 = phi i32 [ 0, %entry ], [ %i.2, %bb ]
      %AiAddr = getelementptr float* %A, i32 %i.1
      call void @Sum( float* %AiAddr, %pair* %P )
      %i.2 = add i32 %i.1, 1
      %exitcond = icmp eq i32 %i.2, %N
      br i1 %exitcond, label %return, label %bb

LLVM Type System Details
• The entire type system consists of:
  - Primitives: integer, floating point, label, void
    - no “signed” integer types
    - arbitrary bitwidth integers (i32, i64, i1)
  - Derived: pointer, array, structure, function, vector, ...
  - No high-level types: type-system is language neutral!
• Type system allows arbitrary casts:
  - Allows expressing weakly-typed languages, like C
  - Front-ends can implement safe languages
  - Also easy to define a type-safe subset of LLVM
See also: docs/LangRef.html
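One way to read "no signed integer types" is that the bits carry no sign, and the operation decides how to interpret them. The sketch below mimics that idea in C++; the function names `udiv` and `sdiv` are chosen here to echo separate unsigned and signed divide operations and are illustrative only.

```cpp
#include <cstdint>

// One 32-bit pattern, with no signedness attached to its "type".
// Unsigned division treats the bits as a plain binary number.
std::uint32_t udiv(std::uint32_t A, std::uint32_t B) { return A / B; }

// Signed division reinterprets the same bits as two's complement
// for the duration of the operation only.
std::uint32_t sdiv(std::uint32_t A, std::uint32_t B) {
  return static_cast<std::uint32_t>(static_cast<std::int32_t>(A) /
                                    static_cast<std::int32_t>(B));
}
```

With the bit pattern 0xFFFFFFF8, unsigned division by 2 yields 0x7FFFFFFC, while signed division treats the pattern as -8 and yields the pattern for -4.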

Lowering source-level types to LLVM
• Source language types are lowered:
  - Rich type systems expanded to simple type system
  - Implicit & abstract types are made explicit & concrete
• Examples of lowering:
  - References turn into pointers: T& → T*
  - Complex numbers: complex float → { float, float }
  - Bitfields: struct X { int Y:4; int Z:2; } → { i32 }
  - Inheritance: class T : S { int X; } → { S, i32 }
  - Methods: class T { void foo(); } → void foo(T*)
• Same idea as lowering to machine code
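The inheritance and method lowerings can be imitated by hand in C++. This is a sketch under assumptions: `TLowered` and `T_sum` are hypothetical names for the lowered forms, and the `sum` method is invented for the example.

```cpp
// Source-level view: T inherits from S and has a method.
struct S { int A; };
struct T : S {
  int X;
  int sum() const { return A + X; }
};

// Hand-lowered view: inheritance becomes a nested struct { S, i32 },
// and the method becomes a free function with an explicit "this" pointer.
struct TLowered {
  S Base;
  int X;
};
int T_sum(const TLowered *This) { return This->Base.A + This->X; }

int source_level() {
  T Obj;
  Obj.A = 3;
  Obj.X = 4;
  return Obj.sum();
}

int lowered_level() {
  TLowered Obj = {{3}, 4};
  return T_sum(&Obj);
}
```

Both versions compute the same value; the lowered one simply makes the implicit base subobject and the implicit `this` argument explicit.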

LLVM Program Structure
• Module contains Functions/GlobalVariables
  - Module is unit of compilation/analysis/optimization
• Function contains BasicBlocks/Arguments
  - Functions roughly correspond to functions in C
• BasicBlock contains list of instructions
  - Each block ends in a control flow instruction
• Instruction is opcode + vector of operands
  - All operands have types
  - Instruction result is typed

Our example, compiled to LLVM

    int callee(const int *X) {
      return *X+1;        // load
    }
    int caller() {
      int T;              // on stack
      T = 4;              // store
      return callee(&T);
    }

• Stack allocation is explicit in LLVM
• All loads/stores are explicit in the LLVM representation

    define internal i32 @callee(i32* %X) {
    entry:
      %tmp2 = load i32* %X
      %tmp3 = add i32 %tmp2, 1
      ret i32 %tmp3
    }

    define internal i32 @caller() {
    entry:
      %T = alloca i32
      store i32 4, i32* %T
      %tmp1 = call i32 @callee( i32* %T )
      ret i32 %tmp1
    }

Our example, desired transformation

Original callee:

    define i32 @callee(i32* %X) {
      %tmp2 = load i32* %X
      %tmp3 = add i32 %tmp2, 1
      ret i32 %tmp3
    }

Change the prototype for the function:

    define internal i32 @callee1(i32 %X.val) {
      %tmp3 = add i32 %X.val, 1
      ret i32 %tmp3
    }

Original caller:

    define i32 @caller() {
      %T = alloca i32
      store i32 4, i32* %T
      %tmp1 = call i32 @callee( i32* %T )
      ret i32 %tmp1
    }

Insert load instructions into all callers and update all call sites of ‘callee’:

    define internal i32 @caller() {
      %T = alloca i32
      store i32 4, i32* %T
      %Tval = load i32* %T
      %tmp1 = call i32 @callee1( i32 %Tval )
      ret i32 %tmp1
    }

Other transformation (-mem2reg) cleans up the rest:

    define internal i32 @caller() {
      %tmp1 = call i32 @callee1( i32 4 )
      ret i32 %tmp1
    }

LLVM Coding Basics
• Written in modern C++, uses the STL:
  - Particularly the vector, set, and map classes
• LLVM IR is almost all doubly-linked lists:
  - Module contains lists of Functions & GlobalVariables
  - Function contains lists of BasicBlocks & Arguments
  - BasicBlock contains list of Instructions
• Linked lists are traversed with iterators:

    Function *M = …
    for (Function::iterator I = M->begin(); I != M->end(); ++I) {
      BasicBlock &BB = *I;
      ...

See also: docs/ProgrammersManual.html

LLVM Coding Basics cont.
• BasicBlock doesn’t provide a reverse iterator
  - Highly obnoxious when doing the assignment

    for (BasicBlock::iterator I = bb->end(); I != bb->begin(); ) {
      --I;
      Instruction *insn = I;
      …

• Traversing successors of a BasicBlock:

    for (succ_iterator SI = succ_begin(bb), E = succ_end(bb); SI != E; ++SI) {
      BasicBlock *Succ = *SI;

• C++ is not Java
  - primitive class variable not automatically initialized
  - you must manage memory
  - virtual vs. non-virtual functions
  - and much more…
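The backwards-traversal idiom does not depend on LLVM and can be tried on any bidirectional container. A minimal sketch, assuming a `std::list` of instruction names as a stand-in for a basic block (`ToyBlock` and `visitBackwards` are hypothetical names):

```cpp
#include <list>
#include <string>
#include <vector>

// Stand-in for a basic block: an ordered list of instruction names.
typedef std::list<std::string> ToyBlock;

// Visit instructions last-to-first without a reverse iterator:
// start at end() and decrement before each use.
std::vector<std::string> visitBackwards(const ToyBlock &BB) {
  std::vector<std::string> Order;
  for (ToyBlock::const_iterator I = BB.end(); I != BB.begin();) {
    --I;
    Order.push_back(*I);
  }
  return Order;
}
```

Decrementing inside the loop body, rather than in the loop header, is what keeps the iterator from ever stepping before `begin()`, and it also handles an empty block correctly.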

LLVM Pass Manager
• Compiler is organized as a series of ‘passes’:
  - Each pass is one analysis or transformation
• Types of Pass:
  - ModulePass: general interprocedural pass
  - CallGraphSCCPass: bottom-up on the call graph
  - FunctionPass: process a function at a time
  - LoopPass: process a natural loop at a time
  - BasicBlockPass: process a basic block at a time
• Constraints imposed (e.g. FunctionPass):
  - FunctionPass can only look at “current function”
  - Cannot maintain state across functions
See also: docs/WritingAnLLVMPass.html

Services provided by PassManager
• Optimization of pass execution:
  - Process a function at a time instead of a pass at a time
  - Example: If F, G, H are three functions in input pgm: “FFFFGGGGHHHH” not “FGHFGH”
  - Process functions in parallel on an SMP (future work)
• Declarative dependency management:
  - Automatically fulfill and manage analysis pass lifetimes
  - Share analyses between passes when safe:
    - e.g. “DominatorSet live unless pass modifies CFG”
• Avoid boilerplate for traversal of program
See also: docs/WritingAnLLVMPass.html
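The two execution orders can be made concrete with a toy trace. This sketch is not the PassManager API: functions are modelled as single letters, "running a pass" appends that letter to a string, and both helper names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Function-at-a-time: run every pass on F, then every pass on G, ...
std::string runFunctionMajor(const std::vector<char> &Funcs, unsigned NumPasses) {
  std::string Trace;
  for (std::size_t F = 0; F != Funcs.size(); ++F)
    for (unsigned P = 0; P != NumPasses; ++P)
      Trace += Funcs[F];
  return Trace;
}

// Pass-at-a-time: run the first pass on all functions, then the second, ...
std::string runPassMajor(const std::vector<char> &Funcs, unsigned NumPasses) {
  std::string Trace;
  for (unsigned P = 0; P != NumPasses; ++P)
    for (std::size_t F = 0; F != Funcs.size(); ++F)
      Trace += Funcs[F];
  return Trace;
}
```

With functions F, G, H and four passes, the function-major order gives the "FFFFGGGGHHHH" pattern, which keeps one function hot in the cache while all passes run over it; the pass-major order repeats "FGH" once per pass.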

Pass Manager + Arg Promotion #1/2
• Arg Promotion is a CallGraphSCCPass:
  - Naturally operates bottom-up on the CallGraph
    - Bubble pointers from callees out to callers

    // line 24
    #include "llvm/CallGraphSCCPass.h"
    // line 47
    struct SimpleArgPromotion : public CallGraphSCCPass {

• Arg Promotion requires AliasAnalysis info
  - To prove safety of transformation
    - Works with any alias analysis algorithm though

    // line 48
    virtual void getAnalysisUsage(AnalysisUsage &AU) const {
      AU.addRequired<AliasAnalysis>();         // Get aliases
      AU.addRequired<TargetData>();            // Get data layout
      CallGraphSCCPass::getAnalysisUsage(AU);  // Get CallGraph
    }

Pass Manager + Arg Promotion #2/2
• Finally, implement runOnSCC (line 65):

    bool SimpleArgPromotion::runOnSCC(const std::vector<CallGraphNode*> &SCC) {
      bool Changed = false, LocalChange;
      do {  // Iterate until we stop promoting from this SCC.
        LocalChange = false;
        // Attempt to promote arguments from all functions in this SCC.
        for (unsigned i = 0, e = SCC.size(); i != e; ++i)
          LocalChange |= PromoteArguments(SCC[i]);
        Changed |= LocalChange;  // Remember that we changed something.
      } while (LocalChange);
      return Changed;  // Passes return true if something changed.
    }

Iterating lets several levels of indirection be promoted:

    static int foo(int ***P) { return ***P; }
    →
    static int foo(int P_val_val) { return P_val_val; }
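The iterate-until-nothing-changes shape of runOnSCC can be exercised without LLVM. In this sketch a function is reduced to a single number, the pointer depth of its argument; `ToyFunction`, `promoteOneLevel` and `runOnToySCC` are hypothetical stand-ins, and the assumption that each sweep removes exactly one level of indirection is made up for the illustration.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a function in an SCC: only the number of pointer
// levels on its single argument is tracked.
struct ToyFunction {
  int PointerDepth;
};

// Strips one level of indirection if there is one left.
// Returns true if the function was changed.
bool promoteOneLevel(ToyFunction &F) {
  if (F.PointerDepth == 0) return false;
  --F.PointerDepth;
  return true;
}

// Repeat full sweeps over the SCC until a sweep changes nothing.
bool runOnToySCC(std::vector<ToyFunction> &SCC) {
  bool Changed = false, LocalChange;
  do {
    LocalChange = false;
    for (std::size_t i = 0, e = SCC.size(); i != e; ++i)
      LocalChange |= promoteOneLevel(SCC[i]);
    Changed |= LocalChange;  // remember that something changed
  } while (LocalChange);
  return Changed;
}
```

A function that starts at depth 3 needs three productive sweeps plus one final sweep that finds nothing left to do, after which a second call reports no change.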

Constant Propagation with Def-Use Chains

LLVM Dataflow Analysis
• LLVM IR is in SSA form:
  - use-def and def-use chains are always available
  - All objects have user/use info, even functions
• Control Flow Graph is always available:
  - Exposed as BasicBlock predecessor/successor lists
  - Many generic graph algorithms usable with the CFG
• Higher-level info implemented as passes:
  - Dominators, CallGraph, induction vars, aliasing, GVN, ...
See also: docs/ProgrammersManual.html
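A rough sketch of how def-use chains drive constant propagation, using a toy SSA form rather than LLVM's classes. Everything here (`ToyValue`, `ToySSA`, `propagateConstants`, and the restriction to constants and additions) is an assumption made for the example.

```cpp
#include <cstddef>
#include <vector>

// Toy SSA value: either a known constant or an add of two other values.
struct ToyValue {
  bool IsConst;
  int Const;               // meaningful only when IsConst is true
  int Lhs, Rhs;            // operand indices (use-def), -1 for constants
  std::vector<int> Users;  // def-use chain: values that use this one
};

struct ToySSA {
  std::vector<ToyValue> Values;

  int addConst(int C) {
    ToyValue V = {true, C, -1, -1, std::vector<int>()};
    Values.push_back(V);
    return static_cast<int>(Values.size()) - 1;
  }
  int addAdd(int L, int R) {
    ToyValue V = {false, 0, L, R, std::vector<int>()};
    Values.push_back(V);
    int Id = static_cast<int>(Values.size()) - 1;
    Values[L].Users.push_back(Id);  // keep def-use chains up to date
    Values[R].Users.push_back(Id);
    return Id;
  }
};

// Worklist folding: when a value becomes constant, only its users
// (found through the def-use chain) need to be revisited.
void propagateConstants(ToySSA &F) {
  std::vector<int> Worklist;
  for (std::size_t i = 0; i != F.Values.size(); ++i)
    if (F.Values[i].IsConst) Worklist.push_back(static_cast<int>(i));
  while (!Worklist.empty()) {
    int Def = Worklist.back();
    Worklist.pop_back();
    for (std::size_t u = 0; u != F.Values[Def].Users.size(); ++u) {
      int UserId = F.Values[Def].Users[u];
      ToyValue &User = F.Values[UserId];
      if (User.IsConst) continue;  // already folded
      const ToyValue &L = F.Values[User.Lhs];
      const ToyValue &R = F.Values[User.Rhs];
      if (L.IsConst && R.IsConst) {
        User.IsConst = true;
        User.Const = L.Const + R.Const;
        Worklist.push_back(UserId);
      }
    }
  }
}
```

Because each definition knows its users, the algorithm never rescans the whole function; work is proportional to the number of def-use edges it follows.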

Arg Promotion: safety check #1/4
#1: Function must be “internal” (aka “static”)

    // line 88
    if (!F || !F->hasInternalLinkage()) return false;

#2: Make sure address of F is not taken
  - In LLVM, check that there are only direct calls using F

    // line 99
    for (Value::use_iterator UI = F->use_begin(); UI != F->use_end(); ++UI) {
      CallSite CS = CallSite::get(*UI);
      if (!CS.getInstruction())  // "Taking the address" of F.
        return false;
    }

#3: Check to see if any args are promotable:

    // line 114
    for (unsigned i = 0; i != PointerArgs.size(); ) {
      if (!isSafeToPromoteArgument(PointerArgs[i]))
        // The next element slides into slot i, so do not advance.
        PointerArgs.erase(PointerArgs.begin() + i);
      else
        ++i;
    }
    if (PointerArgs.empty()) return false;  // no args promotable
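Filtering a vector while indexing into it is easy to get wrong, because erasing slot i moves the following element into slot i. One alternative is the standard erase-remove idiom, sketched below. The predicate is a hypothetical stand-in for isSafeToPromoteArgument: here an "argument" is just an int, and only even values count as promotable.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the real safety check.
bool isPromotable(int Arg) { return Arg % 2 == 0; }
bool isNotPromotable(int Arg) { return !isPromotable(Arg); }

// Erase-remove idiom: drops every unpromotable entry in one pass with
// no index bookkeeping, then reports whether any candidate is left.
bool keepPromotable(std::vector<int> &PointerArgs) {
  PointerArgs.erase(std::remove_if(PointerArgs.begin(), PointerArgs.end(),
                                   isNotPromotable),
                    PointerArgs.end());
  return !PointerArgs.empty();
}
```

Two adjacent unpromotable entries, such as the leading 1 and 3 in {1, 3, 2, 5, 4}, are exactly the case where an index-based loop that always increments would skip one.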

Arg Promotion: safety check #2/4
#4: Argument pointer can only be loaded from:
  - No stores through argument pointer allowed!

    // Loop over all uses of the argument (def-use chains).
    // line 138
    for (Value::use_iterator UI = Arg->use_begin(); UI != Arg->use_end(); ++UI) {
      // If the user is a load:
      if (LoadInst *LI = dyn_cast<LoadInst>(*UI)) {
        // Don't modify volatile loads.
        if (LI->isVolatile()) return false;
        Loads.push_back(LI);
      } else {
        return false;  // Not a load.
      }
    }

Alias Analysis
• The AliasAnalysis class defines the interface that all alias analyses support
• Computed by the basicaa pass
  - Can be changed
• Simple example

    int i;
    char C[2];
    char A[10];
    /* ... */
    for (i = 0; i != 10; ++i) {
      C[0] = A[i];    /* One byte store */
      C[1] = A[9-i];  /* One byte store */
    }
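The kind of question an alias analysis answers for this example can be sketched with a toy query. This is not LLVM's AliasAnalysis interface: `ToyLocation` and `alias` are hypothetical, and the model only handles named arrays with constant or unknown byte offsets. The three result names mirror the usual no/may/must classification.

```cpp
#include <string>

enum ToyAliasResult { NoAlias, MayAlias, MustAlias };

// Toy memory location: a named base object plus a byte range.
struct ToyLocation {
  std::string Base;  // e.g. "C" or "A"
  int Offset;        // byte offset, or -1 if not known at compile time
  int Size;          // access size in bytes
};

ToyAliasResult alias(const ToyLocation &X, const ToyLocation &Y) {
  if (X.Base != Y.Base) return NoAlias;               // distinct arrays never overlap
  if (X.Offset < 0 || Y.Offset < 0) return MayAlias;  // unknown index
  if (X.Offset == Y.Offset && X.Size == Y.Size) return MustAlias;
  bool Disjoint = X.Offset + X.Size <= Y.Offset ||
                  Y.Offset + Y.Size <= X.Offset;
  return Disjoint ? NoAlias : MayAlias;
}
```

In the loop above, the one-byte stores to C[0] and C[1] never overlap, and neither overlaps any element of A, while A[i] and A[9-i] can only be classified as "may alias" without reasoning about i.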

Arg Promotion: safety check #3/4
#5: Value of “*P” must not change in the BB
  - We move load out to the caller, value cannot change!

[Figure: basic block containing “load P”; question for the instructions before the load: Modifies “*P”?]

    // Get AliasAnalysis implementation from the pass manager.
    // line 156
    AliasAnalysis &AA = getAnalysis<AliasAnalysis>();

    // Ensure *P is not modified from start of block to load
    // line 169
    if (AA.canInstructionRangeModify(BB->front(), *Load, Arg, LoadSize))
      return false;  // Pointer is invalidated!

See also: docs/AliasAnalysis.html

Arg Promotion: safety check #4/4
#6: “*P” cannot change from Fn entry to BB

[Figure: CFG from Entry down to the block containing “load P”; question for each block on the way: Modifies “*P”?]

    // line 175
    for (pred_iterator PI = pred_begin(BB), E = pred_end(BB);
         PI != E; ++PI)  // Loop over predecessors of BB.
      // Check each block from BB to entry (DF search on inverse graph).
      for (idf_iterator<BasicBlock*> I = idf_begin(*PI);
           I != idf_end(*PI); ++I)
        // Might *P be modified in this basic block?
        if (AA.canBasicBlockModify(**I, Arg, LoadSize))
          return false;
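The backward search can be sketched on a toy CFG. The assumptions are loud ones: blocks are plain integer ids, `ToyCFG` and `isUnmodifiedOnAllPaths` are hypothetical names, and a per-block flag replaces the real alias query about whether the block might write to *P.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Toy CFG: Preds[b] lists the predecessors of block b, and
// MayModify[b] says whether block b might write to *P.
struct ToyCFG {
  std::vector<std::vector<int> > Preds;
  std::vector<bool> MayModify;
};

// Walk the inverse CFG depth-first, starting from the predecessors of
// LoadBlock. If any block that can reach the load might modify *P,
// promotion is unsafe.
bool isUnmodifiedOnAllPaths(const ToyCFG &G, int LoadBlock) {
  std::set<int> Visited;
  std::vector<int> Stack(G.Preds[LoadBlock].begin(), G.Preds[LoadBlock].end());
  while (!Stack.empty()) {
    int B = Stack.back();
    Stack.pop_back();
    if (!Visited.insert(B).second) continue;  // already checked
    if (G.MayModify[B]) return false;
    for (std::size_t i = 0; i != G.Preds[B].size(); ++i)
      Stack.push_back(G.Preds[B][i]);
  }
  return true;
}
```

Only blocks that can reach the load are inspected, so a store in a block that comes after the load does not block the promotion.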

Arg Promotion: xform outline #1/4
#1: Make prototype with new arg types (line 197)
  - Basically just replaces ‘int*’ with ‘int’ in prototype
#2: Create function with new prototype:

    // line 214
    Function *NF = new Function(NFTy, F->getLinkage(), F->getName());
    F->getParent()->getFunctionList().insert(F, NF);

#3: Change all callers of F to call NF:

    // If there are uses of F, then calls to it remain.
    // line 221
    while (!F->use_empty()) {
      // Get a caller of F.
      CallSite CS = CallSite::get(F->use_back());

Arg Promotion: xform outline #2/4
#4: For each caller, add loads, determine args
  - Loop over the args, inserting the loads in the caller

    // line 220
    std::vector<Value*> Args;

    // line 226
    CallSite::arg_iterator AI = CS.arg_begin();
    for (Function::aiterator I = F->abegin(); I != F->aend(); ++I, ++AI)
      if (!ArgsToPromote.count(I))  // Unmodified argument.
        Args.push_back(*AI);
      else {                        // Insert the load before the call.
        LoadInst *LI = new LoadInst(*AI, (*AI)->getName()+".val",
                                    Call);  // Insertion point
        Args.push_back(LI);
      }

Arg Promotion: xform outline #3/4
#5: Replace the call site of F with call of NF

    // Create the call to NF with the adjusted arguments.
    // line 242
    Instruction *New = new CallInst(NF, Args, "", Call);

    // If the return value of the old call was used, use the retval of the new call.
    if (!Call->use_empty())
      Call->replaceAllUsesWith(New);

    // Finally, remove the old call from the program, reducing the use-count of F.
    Call->getParent()->getInstList().erase(Call);

#6: Move code from old function to new Fn

    // line 259
    NF->getBasicBlockList().splice(NF->begin(), F->getBasicBlockList());

Arg Promotion: xform outline #4/4
#7: Change users of F’s arguments to use NF’s

    // line 264
    for (Function::aiterator I = F->abegin(), I2 = NF->abegin();
         I != F->aend(); ++I, ++I2)
      if (!ArgsToPromote.count(I)) {  // Not promoting this arg?
        I->replaceAllUsesWith(I2);    // Use new arg, not old arg.
      } else {
        while (!I->use_empty()) {     // Only users can be loads.
          LoadInst *LI = cast<LoadInst>(I->use_back());
          LI->replaceAllUsesWith(I2);
          LI->getParent()->getInstList().erase(LI);
        }
      }

#8: Delete old function:

    // line 286
    F->getParent()->getFunctionList().erase(F);

Project 1
• Improve Pointer Analysis in LLVM
• Study the precision of LLVM on some examples
• Suggest improvements by:
  - Flow sensitivity
  - Destructive updates

Project 2
• Numeric analyzer in LLVM
• Integrate LLVM and Apron in a reasonable way
  - Intraprocedural only
  - Bonus: interprocedural

Project 3
• Shape Analysis in LLVM
• Integrate LLVM and TVLA in a reasonable way
  - Intraprocedural only
  - Generate TVP