Architecture for a Next-Generation GCC

Chris Lattner (sabre@nondot.org)
Vikram Adve (vadve@cs.uiuc.edu)
http://llvm.cs.uiuc.edu/

The First Annual GCC Developers' Summit, May 26, 2003

GCC Optimizer Problems:
  • Scope of optimization is very limited:
      - Most transformations work on functions…
          - …and one is even limited to extended basic blocks
      - No whole-program analyses or optimization!
          - e.g. alias analysis must be extremely conservative
  • Tree & RTL are bad for mid-level optimizations:
      - Tree is language-specific and too high-level
      - RTL is target-specific and too low-level

New Optimization Architecture:
  • Transparent link-time optimization:
      - Completely compatible with user makefiles
  • Enables sophisticated interprocedural analyses (IPA) and optimizations (IPO):
      - Increase the scope of analysis and optimization
  • A new representation for optimization:
      - Typed, SSA-based, three-address code
      - Source-language and target independent

Example Applications for GCC:
  • Fix inlining heuristics:
      - Allows whole-program, bottom-up inlining
      - Cost metric is more accurate than for trees
  • Improved alias analysis:
      - Dramatically improved precision
      - Code motion, redundancy elimination gains
  • Work around low-level ABI problems:
      - Tailor linkage of functions with IP information
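
"Bottom-up" inlining visits callees before their callers, so each caller is considered only after the functions it calls have been processed. The sketch below is one way to compute such an order with a post-order walk of a call graph. The `CallGraph` type, `MAX_FUNCS`, and both function names are hypothetical, introduced only for illustration; this is not the inliner the talk proposes.

```c
#include <stddef.h>

#define MAX_FUNCS 8   /* assumed upper bound, for illustration only */

/* Hypothetical call graph: calls[f][g] != 0 means function f calls g. */
typedef struct {
    size_t count;
    int calls[MAX_FUNCS][MAX_FUNCS];
} CallGraph;

/* Post-order depth-first walk: a function is appended to the order
   only after every function it calls has been appended. */
static void visit(const CallGraph *cg, size_t f, int *seen,
                  size_t *order, size_t *n) {
    if (seen[f]) return;              /* also stops at recursive cycles */
    seen[f] = 1;
    for (size_t g = 0; g < cg->count; ++g)
        if (cg->calls[f][g]) visit(cg, g, seen, order, n);
    order[(*n)++] = f;
}

/* Fills order with function indices, callees first; returns how many. */
size_t bottom_up_order(const CallGraph *cg, size_t *order) {
    int seen[MAX_FUNCS] = {0};
    size_t n = 0;
    for (size_t f = 0; f < cg->count; ++f)
        visit(cg, f, seen, order, &n);
    return n;
}
```

Building this order needs the whole call graph, which is why the slide ties it to whole-program information.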

Talk Outline:
  • High-Level Compiler Architecture
      - How does the proposed GCC work?
  • Code Representation Details
      - What does the representation look like?
  • LLVM: An Implementation
      - Implementation status and experiences
  • Conclusion

Traditional GCC Organization:
  • Compile: source to target assembly
  • Assemble: target assembly to object file
  • Link: combine object files into an executable

[Figure: at compile time, cc1 and cc1plus translate source to assembly, and as turns the assembly into object files; at link time, ld combines the object files and libs into an executable.]

Proposed GCC Architecture:
  • Split the existing compiler in half:
      - Parsing & semantic analysis at compile time
      - Code generation at link time
      - Optimization at compile time and link time

[Figure: at compile time, each source file goes through the GCC frontend (Tree) and the mid-level optimizer, producing the new representation; at link time, the files are linked, optimized as a whole program, and passed to the GCC backend (RTL), whose assembly goes through as and ld, together with libs, to produce the executable.]

Why Link-Time?
  • Fits into normal compile & link model:
      - User makefiles do not have to change
      - Enabled if compiling at -O4
  • Missing code severely limits IPA & IPO:
      - Must make conservative assumptions:
          - An unknown callee can do just about anything
      - At link time, most of the program is available for the first time!
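
The "unknown callee" point can be illustrated with a small C sketch. The names `counter`, `unknown_callee`, and `bump_twice` are hypothetical. In the scenario the slide describes, `unknown_callee` would live in another translation unit; its body is included here only so the sketch compiles and runs on its own.

```c
/* Hypothetical global that any function in the program may touch. */
int counter = 0;

/* Stand-in for a function whose body the compiler cannot see when
   compiling bump_twice separately from its callee. */
void unknown_callee(void) { counter += 10; }

int bump_twice(void) {
    counter = 1;
    unknown_callee();     /* may read or write counter */
    return counter + 1;   /* so counter must be reloaded from memory here */
}
```

When compiling `bump_twice` alone, the compiler has to assume the call may modify `counter`. With the callee's body available at link time, that assumption can be replaced by what the callee actually does.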

Making Link-Time Opt Feasible:
  • Many commercial compilers support link-time optimization (Intel, SGI, HP, etc.):
      - These export an AST-level representation, then perform all optimization at link time
  • Our proposal:
      - Optimize as much at compile time as possible
      - Perform aggressive IPA/IPO at link time
      - Allows mixed object files in native & IR format

No major GCC changes:
  • New GCC components:
      - New expander from Tree to IR
      - New expander from IR to RTL
      - Must extend the compiler driver
  • Existing code path can be retained:
      - When disabled, does not affect performance
      - When -O2 is enabled, use new mid-level optimizations a function (or unit) at a time

Talk Outline:
  • High-Level Compiler Architecture
      - How does the proposed GCC work?
  • Code Representation Details
      - What does the representation look like?
  • LLVM: An Implementation
      - Implementation status and experiences
  • Conclusion

Code Representation Properties:
  • Low-level, SSA-based, and "RISC-like":
      - SSA-based ≡ efficient, sparse, global optimizations
      - Orthogonal, as few operations as possible
      - Simple, well-defined semantics (documented)
      - Simplify development of optimizations:
          - Development & maintenance is very costly!
  • Concrete details come from LLVM:
      - More details about LLVM come later in the talk

Code Example:

    struct pair { int X; float Y; };
    void Sum(float *, struct pair *P);

    int Process(float *A, int N) {
      int i;
      struct pair P = {0, 0};
      for (i = 0; i < N; ++i) {
        Sum(A, &P);
        A++;
      }
      return P.X;
    }

The same code in the new representation:

    %pair = type { int, float }
    declare void %Sum(float*, %pair*)

    int %Process(float* %A.0, int %N) {
    entry:
      %P = alloca %pair
      %tmp.0 = getelementptr %pair* %P, long 0, ubyte 0   ; tmp.0 = &P[0].0
      store int 0, int* %tmp.0
      %tmp.1 = getelementptr %pair* %P, long 0, ubyte 1
      store float 0.0, float* %tmp.1
      %tmp.3 = setlt int 0, %N
      br bool %tmp.3, label %loop, label %return
    loop:
      %i.1 = phi int [ 0, %entry ], [ %i.2, %loop ]
      %A.1 = phi float* [ %A.0, %entry ], [ %A.2, %loop ]
      call void %Sum(float* %A.1, %pair* %P)
      %A.2 = getelementptr float* %A.1, long 1            ; A.2 = &A.1[1]
      %i.2 = add int %i.1, 1
      %tmp.4 = setlt int %i.2, %N
      br bool %tmp.4, label %loop, label %return
    return:
      %tmp.5 = load int* %tmp.0
      ret int %tmp.5
    }

Callouts:
  • Simple type example, and example external function
  • Explicit allocation of stack space, clear distinction between memory and registers
  • Typed pointer arithmetic for access to memory
  • High-level operations are lowered to simple operations
  • Control flow is lowered to use explicit branches
  • SSA representation is explicit in the code
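
The `getelementptr` instructions in the example only compute addresses; the `load` and `store` instructions perform the memory accesses. As a rough C analogue, the sketch below spells out the address arithmetic behind `&P[index].Y`, which is the C-level meaning of indexing a `%pair*` and then selecting field 1. The helper name `field_y_address` is hypothetical.

```c
#include <stddef.h>

struct pair { int X; float Y; };

/* Address of field Y of P[index], written out as explicit arithmetic:
   base + index * sizeof(element) + offset of the field. */
float *field_y_address(struct pair *P, size_t index) {
    char *base = (char *)P;
    return (float *)(base + index * sizeof(struct pair)
                          + offsetof(struct pair, Y));
}
```

Because the representation is typed, the element size and field offset come from the `%pair` type and do not have to be written as raw byte offsets in the code.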

Strongly-Typed Representation:
  • Key challenge:
      - Support high-level analyses & transformations
      - …on a low-level representation!
  • Types provide this high-level info:
      - Enables aggressive analyses and optimizations:
          - e.g. automatic pool allocation, safety checking, data structure analysis, etc.
      - Every computed value has a type
  • Type system is language-neutral!

Type System Details:
  • Simple language-independent type system:
      - Primitives: void, bool, float, ushort, opaque, …
      - Derived: pointer, array, structure, function
      - No high-level types!
  • Source language types are lowered:
      - e.g. T& → T*
      - e.g. class T : S { int X; } → { S, int }
  • Type system can be "broken" with casts
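
The two lowering examples can be mimicked in plain C, which has no classes or references. This is a sketch under assumptions: the member `Tag` of `S`, the field name `Base`, and the helpers `get_x` and `as_base` are hypothetical, since the slide shows neither the contents of `S` nor any functions.

```c
/* Hypothetical base type S; its contents are not given on the slide. */
struct S { int Tag; };

/* class T : S { int X; } lowered to { S, int }:
   the base-class part becomes the first field of the structure. */
struct T { struct S Base; int X; };

/* A C++ reference parameter (T&) lowered to a pointer (T*). */
int get_x(const struct T *t) { return t->X; }

/* The type system "broken" with a cast: treat a T* as an S*.
   In C this is valid because Base is the first member of struct T. */
struct S *as_base(struct T *t) { return (struct S *)t; }
```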

Full-Featured Language:
  • Should contain all info about the code:
      - Functions, globals, inline asm, etc.
      - Should be possible to serialize and deserialize a program at any time
  • Language has binary and text formats:
      - Both directly correspond to in-memory IR
      - Text is for humans, binary is faster to parse
      - Makes debugging and understanding easier!

Talk Outline:
  • High-Level Compiler Architecture
      - How does the proposed GCC work?
  • Code Representation Details
      - What does the representation look like?
  • LLVM: An Implementation
      - Implementation status and experiences
  • Conclusion

LLVM: Low-Level Virtual Machine
  • A research compiler infrastructure:
      - Provides a solid foundation for research
      - In use both inside and outside of UIUC:
          - Compilers, architecture, & dynamic compilation
          - Two advanced compilers courses
  • Development progress:
      - 2.5 years old, ~130K lines of C++ code
      - First public release is coming soon:
          - 1.0 release this summer, prereleases via email

LLVM Implementation Status:
  • Most of this proposal is implemented:
      - Tree → LLVM expander (for C and C++)
      - Linker, optimizer, textual & bytecode formats
      - Mid-level optimizer is a sequence of 22 passes
  • All sorts of analyses & optimizations:
      - Scalar: ADCE, SCCP, register promotion, …
      - CFG: dominators, natural loops, profiling, …
      - IP: alias analysis, automatic pool allocation, interprocedural mod/ref, safety verification, …

Other LLVM Infrastructure:
  • Direct execution of LLVM bytecode:
      - A portable interpreter, a Just-In-Time compiler
  • Several custom (non-GCC) backends:
      - Sparc-V9, IA-32, C backend
  • The LLVM "Pass Manager":
      - Declarative system for tracking analysis and optimizer pass dependencies
      - Assists building tools out of a series of passes
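
One way to picture "declarative" dependency tracking: each pass states which analyses it requires and which it leaves valid, and the manager recomputes an analysis only when it is stale. The sketch below is a toy model of that idea, not LLVM's actual interface. The `Pass` and `PassManager` types, the bit masks, and `run_passes` are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical analyses, used as bit positions in the masks below. */
enum { DOMINATORS, LOOPS, NUM_ANALYSES };

typedef struct {
    const char *name;
    unsigned requires_mask;    /* analyses that must be valid before running */
    unsigned preserves_mask;   /* analyses still valid after running */
} Pass;

typedef struct {
    unsigned valid_mask;       /* analyses currently up to date */
    size_t analyses_run;       /* number of (re)computations performed */
} PassManager;

/* Runs the passes in order, computing a required analysis only if it
   is not already valid, then dropping whatever the pass did not preserve. */
void run_passes(PassManager *pm, const Pass *passes, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        for (unsigned a = 0; a < NUM_ANALYSES; ++a) {
            unsigned bit = 1u << a;
            if ((passes[i].requires_mask & bit) && !(pm->valid_mask & bit)) {
                pm->valid_mask |= bit;   /* stand-in for computing analysis a */
                pm->analyses_run++;
            }
        }
        /* The transformation itself would run here. */
        pm->valid_mask &= passes[i].preserves_mask;
    }
}
```

With three passes that require dominators, where only the first preserves them and the second also requires loop information, this model performs three analysis computations instead of four, because the second pass reuses the dominators computed for the first.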

LLVM Development Tools:
  • Invariant checking:
      - Automatic IR memory leak detection
      - A verifier pass which checks for consistency
          - Definitions dominate all uses, etc.
  • Bugpoint, an automatic test-case reducer:
      - Automatically reduces test cases to a small example which still causes a problem
      - Can debug miscompilations or pass crashes
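
The slide does not describe how Bugpoint reduces a test case, so the sketch below shows only the general idea with a generic greedy reduction: keep dropping single items as long as the problem still reproduces. It is not Bugpoint's algorithm. The `StillFails` callback, `reduce`, and the example predicate `contains_3_and_7` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical predicate type: nonzero if the candidate still
   triggers the problem being debugged. */
typedef int (*StillFails)(const int *items, size_t count);

/* Greedy reduction: try removing each item in turn and keep the removal
   whenever the failure persists. Edits items in place, returns new count. */
size_t reduce(int *items, size_t count, StillFails still_fails) {
    int changed = 1;
    while (changed) {
        changed = 0;
        size_t i = 0;
        while (i < count) {
            int removed = items[i];
            /* Tentatively remove items[i] by shifting the tail left. */
            for (size_t j = i; j + 1 < count; ++j) items[j] = items[j + 1];
            if (still_fails(items, count - 1)) {
                --count;            /* removal kept; slot i holds a new item */
                changed = 1;
            } else {
                /* Undo: shift the tail right and restore the item. */
                for (size_t j = count - 1; j > i; --j) items[j] = items[j - 1];
                items[i] = removed;
                ++i;
            }
        }
    }
    return count;
}

/* Example predicate: the "bug" needs both 3 and 7 to be present. */
int contains_3_and_7(const int *items, size_t count) {
    int has3 = 0, has7 = 0;
    for (size_t i = 0; i < count; ++i) {
        if (items[i] == 3) has3 = 1;
        if (items[i] == 7) has7 = 1;
    }
    return has3 && has7;
}
```

In a compiler setting the "items" could be functions, instructions, or passes in a pipeline, and the predicate would rerun the failing compile.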

Conclusion:
  • Contributions:
      - A realistic architecture for an aggressive link-time optimizer
      - A representation for efficient and powerful analyses and transformations
  • LLVM is available…
      - …and we appreciate your feedback!

http://llvm.cs.uiuc.edu