The Phoenix Compiler and Tools Framework Andy Ayers

Andy Ayers, Microsoft Phoenix, AndyA@microsoft.com

What is Phoenix?
• Phoenix is a codename for Microsoft’s next-generation, state-of-the-art infrastructure for program analysis and transformation

Why Phoenix? VS

Phoenix Goals
• An industry-leading compilation and tools framework
• A rich ecosystem for academic research and industrial users
• An infrastructure that is robust, retargetable, extensible, configurable, and scalable

Overview
Phoenix Infrastructure, with the following clients and deliverables around it:
• AST Tools
  ● Static Analysis Tools
  ● Next Gen Front-Ends
  ● R/W Global Program Views
• .Net CodeGen
  ● Runtime JITs
  ● Pre-JIT
  ● OO and .Net optimizations
• Native CodeGen
  ● Advanced C++/OO Optimizations
  ● FP optimizations
  ● OpenMP
• MSR Adv Lang
  ● Language Research
  ● Direct xfer to Phoenix
  ● Research Insulated from code generation
• MSR & Partner Tools
  ● Built on Phoenix APIs
  ● Both HL and LL APIs
  ● Managed APIs
  ● Program Analysis
  ● Program Rewrite
• Retargetable
  ● “Machine Models”
  ● ~3 months: -Od
  ● ~3 months: -O2
• Chip Vendor CDK
  ● ~6 month ports
  ● Sample port + docs
• Academic RDK
  ● Managed APIs
  ● IP as DLLs
  ● Docs

[Figure: Phoenix platform diagram. Labels: Compilers (Code Gen, LL Opts, HL Opts); Tools (Browser, Visualizer, Lint, Formatter, Obfuscator, Refactor, Xlator, Profiler, Security Checker); Phx APIs; Phoenix Core (IR, Syms, Types, CFG, SSA); inputs (AST, Native Image, assembly, C#, VB, C++, Delphi, Cobol, Eiffel, C++ IL, PreFast, Phx AST, Lex/Yacc, Tiger, Profile)]

Phoenix Architecture
• Core set of extensible classes to represent
  ● IR, Symbols, Types, Graphs, Trees, Regions
• Layered set of analysis and transformation components
  ● Data Flow Analysis, Loops, Aliasing, Dead Code, Redundant Code, Inlining
• Common input/output library for binary formats
  ● PE, LIB, OBJ, CIL, MSIL, PDB

Demo 1: Code Generation
• Microsoft C++ compiler
  ● Input: program text
  ● Output: COFF object file
• We’ll demo a Phoenix-based C2
[Figure: Driver (CL) runs C++ Source -> Frontend (C1) -> Backend (C2) -> Obj File]

IR States
[Figure: AST, HIR, MIR, LIR, EIR, ordered from abstract to concrete. Lowering moves toward the concrete end; raising moves toward the abstract end.]
• Phases transform IR, either within a state or from one state to another.
• For instance, Lower transforms MIR into LIR.

View inside Phoenix-Based C2
[Figure: SOURCE -> C1 (AST) -> CIL -> C2 -> OBJECT. IR states shown inside C2: HIR, MIR, LIR, EIR. Phases shown: CIL Reader, MIR Lower, Type Checker, SSA Const, SSA Dest, Canon, Addr Modes, Lower, Encode, Reg Alloc, Lister, EH Lower, Stack Alloc, Frame Gen, Switch Lower, Block Layout, Flow Opts]

Extending Phoenix
• All Phoenix clients can host plug-ins
• Plug-ins can
  ● Add new components
  ● Extend existing components
  ● Reconfigure clients
• Extensibility relies on
  ● Reflection
  ● Events & Delegates
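One way to picture the events-and-delegates mechanism is sketched below in plain C++ (not C++/CLI, and not the Phoenix API). Every type and function name here (Phase, PhaseList, Host, on_phase_list_built, register_uninitialized_local_plugin) is hypothetical. Only the phase names are borrowed from these slides, and inserting the new phase before MIR Lower is an arbitrary choice for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical host-side types; none of these names come from the Phoenix API.
struct Phase {
    std::string name;
    std::function<void()> run;  // work done by the phase (left empty in this sketch)
};

struct PhaseList {
    std::vector<Phase> phases;

    // Insert `phase` immediately before the phase named `anchor`.
    // Returns false, and inserts nothing, if no such phase exists.
    bool insert_before(const std::string& anchor, Phase phase) {
        auto it = std::find_if(phases.begin(), phases.end(),
                               [&](const Phase& p) { return p.name == anchor; });
        if (it == phases.end()) return false;
        phases.insert(it, std::move(phase));
        return true;
    }
};

// The host exposes an event; plug-ins subscribe delegates (callbacks) to it.
struct Host {
    std::vector<std::function<void(PhaseList&)>> on_phase_list_built;

    PhaseList build_phase_list() {
        PhaseList list;
        list.phases.push_back(Phase{"CIL Reader", {}});
        list.phases.push_back(Phase{"MIR Lower", {}});
        list.phases.push_back(Phase{"Encode", {}});
        // Raise the event: every subscribed plug-in may reconfigure the list.
        for (auto& handler : on_phase_list_built) handler(list);
        return list;
    }
};

// A plug-in extends the client by adding a new phase to its phase list.
void register_uninitialized_local_plugin(Host& host) {
    host.on_phase_list_built.push_back([](PhaseList& list) {
        const bool inserted =
            list.insert_before("MIR Lower", Phase{"Uninitialized Local", {}});
        assert(inserted);  // the anchor phase is expected to exist in this sketch
        (void)inserted;
    });
}
```

The host never names the plug-in: it only raises the event, which is what lets any client host plug-ins it was not compiled against. In the real system, reflection would take the place of the explicit registration call.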

Example: Uninitialized Local Detection
• Would like to warn the user that ‘x’ is not initialized before use
• To do this we need to perform a dataflow analysis within the compiler
• We’ll add a phase to C2 to do this, via a plug-in

    int foo()
    {
        int x;
        return x;
    }

Detecting an Uninitialized Use
• For each local variable v
  ● Examine all paths from the entry of the method to each use of v
  ● If on every path v is not initialized before the use: v must be used before it is defined
  ● If there is some path where v is not initialized before the use: v may be used before it is defined
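The path-based definition above can be checked directly on a small example by enumerating paths. The sketch below does that for a toy, loop-free control flow graph. The Block, Verdict, and classify_use names are hypothetical and unrelated to the Phoenix API. Brute-force enumeration is shown only to make the must/may distinction concrete, since it does not terminate on loops and does not scale.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy CFG (not the Phoenix API). Assumed to be acyclic.
struct Block {
    bool defines_v = false;          // block contains "v = ..."
    bool uses_v = false;             // block reads v before any definition of its own
    std::vector<std::size_t> succs;  // indices of successor blocks
};

enum class Verdict { NoWarning, MayBeUninitialized, MustBeUninitialized };

struct PathCounts {
    int defined = 0;    // paths on which v was defined before reaching the use
    int undefined = 0;  // paths on which it was not
};

// Follow every path from `node` until it first reaches `use_block`.
void count_paths(const std::vector<Block>& cfg, std::size_t node,
                 std::size_t use_block, bool defined, PathCounts& counts) {
    if (node == use_block) {
        if (defined) ++counts.defined; else ++counts.undefined;
        return;
    }
    const bool defined_after = defined || cfg[node].defines_v;
    for (std::size_t succ : cfg[node].succs)
        count_paths(cfg, succ, use_block, defined_after, counts);
}

// Classify the use of v in `use_block` according to the two rules above.
Verdict classify_use(const std::vector<Block>& cfg, std::size_t entry,
                     std::size_t use_block) {
    assert(cfg[use_block].uses_v);
    PathCounts counts;
    count_paths(cfg, entry, use_block, false, counts);
    if (counts.undefined == 0) return Verdict::NoWarning;  // also covers "unreachable"
    if (counts.defined == 0) return Verdict::MustBeUninitialized;
    return Verdict::MayBeUninitialized;
}
```

For the foo() example, the graph is an entry block followed by the block containing return x, with no definition on the only path, so the use is classified as a must case. A diamond whose two arms disagree about defining v yields a may case.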

Classic Solution
• Build control flow graph, solve data flow problem
• Unknown is the “state of v” at start of each block: Undefined, Defined, or Mixed
• Transfer function relates output of block to input:
  ● If block contains v=, output = Defined
  ● Else output = input
• Meet combines outputs from predecessor blocks
[Figure: example flow graphs from start containing definitions (v=) and uses (=v), one use labeled must and one labeled may]
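The three ingredients above (state, transfer function, meet) fit in a few lines of portable C++. This is a minimal sketch for a single variable v on a toy graph, not Phoenix code. The Block type, the solve function, and the extra Unknown state (meaning "no information yet", used as the identity for meet) are assumptions of the sketch. Block 0 is assumed to be the entry, where v starts out Undefined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// State of v at a program point. Unknown is an assumption of this sketch:
// it marks a block output that has not been computed yet.
enum class State { Unknown, Undefined, Defined, Mixed };

// Meet: agreeing predecessors keep their state, disagreeing ones give Mixed.
State meet(State a, State b) {
    if (a == State::Unknown) return b;
    if (b == State::Unknown) return a;
    return a == b ? a : State::Mixed;
}

// Transfer: a block that assigns v outputs Defined; otherwise output = input.
State transfer(State in, bool block_defines_v) {
    return block_defines_v ? State::Defined : in;
}

// Hypothetical toy CFG node (not the Phoenix API).
struct Block {
    bool defines_v = false;
    std::vector<std::size_t> preds;  // indices of predecessor blocks
};

// Iterate to a fixed point and return the input state of every block.
std::vector<State> solve(const std::vector<Block>& cfg) {
    std::vector<State> in(cfg.size(), State::Unknown);
    std::vector<State> out(cfg.size(), State::Unknown);
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t b = 0; b < cfg.size(); ++b) {
            // The entry block starts from Undefined; others start with no information.
            State s = (b == 0) ? State::Undefined : State::Unknown;
            for (std::size_t p : cfg[b].preds) {
                assert(p < cfg.size());
                s = meet(s, out[p]);
            }
            in[b] = s;
            const State o = transfer(s, cfg[b].defines_v);
            if (o != out[b]) {
                out[b] = o;
                changed = true;
            }
        }
    }
    return in;
}
```

Reading the result: a use of v at the start of a block whose input state is Undefined is a must case, Mixed is a may case, and Defined needs no warning. Because states only ever move from Unknown toward Mixed, the loop terminates even when the graph contains cycles.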

Code sketch using dataflow

    bool changed = true;
    while (changed)
    {
        // Reset each pass; without this the loop never terminates.
        changed = false;
        for each (Phx::Graphs::BasicBlock ^ block in func)
        {
            // Update input state: meet over all predecessor outputs
            STATE ^ inState = inStates[block];
            for each (Phx::Graphs::BasicBlock ^ predBlock in block->Predecessors)
            {
                STATE ^ predState = outStates[predBlock];
                inState = meet(inState, predState);
            }
            inStates[block] = inState;

            // Compute output state: every destination operand defines its local
            STATE ^ newOutState = gcnew STATE(inState);
            for each (Phx::IR::Instr ^ instr in block->Instrs)
            {
                for each (Phx::IR::Opnd ^ opnd in instr->DstOpnds)
                {
                    Phx::Syms::LocalVarSym ^ localSym = opnd->Sym->AsLocalVarSym;
                    newOutState[localSym] = dst(newOutState[localSym]);
                }
            }

            // Check for convergence
            STATE ^ outState = outStates[block];
            bool blockChanged = !equals(newOutState, outState);
            if (blockChanged)
            {
                changed = true;
                outStates[block] = newOutState;
            }
        }
    }

Demo: Uninitialized Local Plug-In
[Figure: UninitializedLocal.cpp -> C++/CLI -> UninitializedLocal.dll; Test.cpp -> C1 -> Phx-C2 (with UninitializedLocal.dll) -> Test.obj]
To Run: cl -d2plugin:UninitializedLocal.dll -c Test.cpp

Demo 3: Phoenix PE Explorer
• Phoenix can also read and write PE files directly
  ● Implement your own compiler or linker
  ● Create post-link tools for analysis, instrumentation or optimization
• Phx-Explorer is only ~800 LOC of client code on top of the Phoenix core library

Demo 4: Binary Rewriting • mtrace injects tracing code into managed applications

Phoenix IR vs MSIL
• Phoenix IR makes everything explicit:
  ● Operands
  ● Control flow
  ● Exception handling
  ● Side effects
  ● Memory model
• Better format for analysis and transformation
• Identical model for .Net and native code
  ● Many analyses don’t need to make a distinction

Current Status
• RDKs released every 6 months (May 06) with regular updates.
• Phoenix is now building Vista
• ~15 universities engaged via academic program
• Code quality, code size, features, compile times not yet on par with the retail product (but closing ground fast).

Recap
• Phoenix is a powerful and flexible framework for compilers & tools
  ● C2 backend
  ● PE file read/write
  ● JIT & PreJIT (not shown)
  ● Universal plugins on a common IR
• You can use the same components we use in your own work.
  ● Download available now
  ● Prerequisite: VS 2005 (VC++ Express will work, mostly)
  ● Evaluation license prohibits redist or commercial use

More Info
• http://research.microsoft.com/phoenix

Summary
• Phoenix is Microsoft’s next-generation tools and code generation framework
• It’s written entirely in C++/CLI
• It’s available for you to experiment with now…

Questions?
http://research.microsoft.com/phoenix
andya@microsoft.com
