Where’s My Compiler?
Developer tools: past, present, and future
Jim Miller, Software Architect, Developer Frameworks, Microsoft Corporation
(with help from Carol Eidt, Phoenix Project, Microsoft Corporation)

Outline
  • What Is A Compiler?
  • A Brief History of Developer Tools
  • My First Compiler
  • Compilers, compilers, everywhere
18-Sep-20

What Is A Compiler?
  • A converter from one representation (source code) to another (executable code)
      • Preserves (most of) the meaning of the source
  • One part of a modern “tool chain” used to produce executable artifacts (applications)

A Compiler
Source Code -> Compiler -> Executable Code
  • Source code: describes desired behavior
  • Executable code: has desired behavior, but
      • May have different internal structure
      • May execute in different (unobservable) order

Figures of Merit
  • Code Quality: how efficient is the generated code?
      • Speed and Space: these aren’t independent, but they aren’t the same either
  • Throughput: how fast is the code generated?
  • Footprint: how large is the compiler?

1950s: Just a Compiler, Please
  • The compiler references a runtime, but the runtime is supplied by the OS at a fixed location in memory
      • FORTRAN runtime: input/output formatting
      • COBOL runtime: also search and sort
  • OS loader loads the compiler output into memory, transfers control
  • Address space is small (< 8K words), CPU is slow (< 1,000 instructions/sec.)
  • Figure of merit: Code Quality
      • Compiler must optimize code for space
      • Compiler must optimize code for speed

Inside the Compiler (in concept)
Source Code -> Compiler [Front End -> Back End] -> Executable Code

Inside the Compiler (in concept)
Source Code -> Compiler [Front End -> Back End] -> Executable Code
Front End
  • Parse source code
  • Produce abstract syntax tree (AST)
  • Produce symbol table
  • Generate errors
      • Syntax errors
      • Type errors
      • Unbound references
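The front-end duties listed on this slide can be sketched for a toy language. The example below is only an illustration, not anything from the talk: the language (lines of the form `name = expr` with `+`), the tuple-based AST, and the function names `tokenize`, `parse_expr`, and `front_end` are all assumptions made for the sketch. It parses source, builds an AST and a symbol table, and reports syntax errors and unbound references; type errors are left out because the toy language has only integers.

```python
import re

# Hypothetical toy language: each line is "name = expr", expr is numbers/names joined by '+'.
TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")

def tokenize(text):
    """Split one source line into (kind, value) tokens."""
    tokens = []
    for number, name, op in TOKEN.findall(text):
        if number:
            tokens.append(("num", int(number)))
        elif name:
            tokens.append(("name", name))
        elif op.strip():
            tokens.append(("op", op))
    return tokens

def parse_term(tokens, pos):
    """term := number | name; returns (ast, next_pos)."""
    if pos >= len(tokens) or tokens[pos][0] == "op":
        raise SyntaxError("expected a number or a name")
    return tokens[pos], pos + 1

def parse_expr(tokens, pos):
    """expr := term ('+' term)*; returns (ast, next_pos)."""
    node, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == ("op", "+"):
        right, pos = parse_term(tokens, pos + 1)
        node = ("add", node, right)
    return node, pos

def names_in(node):
    """Collect every name referenced inside an expression tree."""
    if node[0] == "name":
        return [node[1]]
    if node[0] == "add":
        return names_in(node[1]) + names_in(node[2])
    return []

def front_end(source):
    """Return (asts, symbol_table, errors) for a multi-line program."""
    asts, symbols, errors = [], {}, []
    for lineno, line in enumerate(source.splitlines(), 1):
        tokens = tokenize(line)
        if not tokens:
            continue
        try:
            if len(tokens) < 3 or tokens[0][0] != "name" or tokens[1] != ("op", "="):
                raise SyntaxError("expected 'name = expression'")
            expr, pos = parse_expr(tokens, 2)
            if pos != len(tokens):
                raise SyntaxError("unexpected token after expression")
        except SyntaxError as err:
            errors.append(f"line {lineno}: syntax error: {err}")
            continue
        for name in names_in(expr):
            if name not in symbols:
                errors.append(f"line {lineno}: unbound reference: {name}")
        symbols[tokens[0][1]] = lineno  # symbol table: name -> defining line
        asts.append(("assign", tokens[0][1], expr))
    return asts, symbols, errors

asts, symbols, errors = front_end("x = 1 + 2\ny = x + z\nw = +")
```

Here the second line yields an unbound-reference error for `z` and the third a syntax error, while the symbol table still records `x` and `y`.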

Inside the Compiler (in concept)
Source Code -> Compiler [Front End -> Back End] -> Executable Code
Back End
  • Linearize parse tree
  • Code Analysis
      • Basic block analysis
      • Control- and data-flow graph analysis
  • Optimize (machine-independent)
      • Redundant and dead code elimination
      • Code restructuring
  • Convert to executable code
      • Register allocation
      • Peephole optimization
      • Branch prediction and tensioning
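Two of the back-end steps above, linearizing the tree and peephole optimization, can be shown in a few lines. This is a minimal sketch under assumptions that are not in the slides: the tree is a nested tuple (`"num"`, `"name"`, `"add"`, `"mul"`), the target is an invented stack machine, and the only peephole rule is constant folding. The small `run` interpreter is there to check that the optimized code keeps the meaning of the original.

```python
def linearize(node, code=None):
    """Post-order walk: flatten a tree into a list of stack-machine instructions."""
    if code is None:
        code = []
    kind = node[0]
    if kind == "num":
        code.append(("push", node[1]))
    elif kind == "name":
        code.append(("load", node[1]))
    elif kind in ("add", "mul"):
        linearize(node[1], code)
        linearize(node[2], code)
        code.append((kind,))
    else:
        raise ValueError(f"unknown node kind: {kind}")
    return code

def peephole(code):
    """Fold each 'push a; push b; add|mul' window into a single push."""
    out = []
    for instr in code:
        out.append(instr)
        if (len(out) >= 3 and out[-1][0] in ("add", "mul")
                and out[-2][0] == "push" and out[-3][0] == "push"):
            op = out.pop()[0]
            b = out.pop()[1]
            a = out.pop()[1]
            out.append(("push", a + b if op == "add" else a * b))
    return out

def run(code, env):
    """Execute stack-machine code; env maps variable names to values."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        elif instr[0] == "load":
            stack.append(env[instr[1]])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "add" else a * b)
    return stack.pop()

tree = ("add", ("name", "x"), ("mul", ("num", 2), ("num", 3)))  # x + 2 * 3
code = linearize(tree)
optimized = peephole(code)
```

For this tree the five linear instructions shrink to three, and both versions compute the same value for any `x`.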

1960s: Linkers
  • Programs are growing in size
  • Programs are built with libraries
      • Libraries provide reusable code fragments
  • Virtual memory systems are invented
  • Tool chain is in two stages
      • Compile independent modules
      • Combine the modules using a linker
  • Figure of merit: Code quality (speed)

Tools: Compiler + Linker
Source Code -> Compiler [Front End -> Back End] -> Object Code (includes external references) -> Linker -> Executable Code
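What it means for object code to "include external references" that the linker resolves can be sketched as follows. The object-module layout here (a dict with `code`, `defs`, and `refs`) is invented for illustration and does not correspond to any real object file format.

```python
def link(modules):
    """Concatenate module code and patch every external reference with its final address."""
    image, symbols, fixups = [], {}, []
    for module in modules:
        base = len(image)  # where this module lands in the final image
        for name, offset in module["defs"].items():
            if name in symbols:
                raise ValueError(f"duplicate symbol: {name}")
            symbols[name] = base + offset
        fixups += [(base + offset, name) for offset, name in module["refs"]]
        image += module["code"]
    for address, name in fixups:
        if name not in symbols:
            raise ValueError(f"undefined symbol: {name}")
        image[address] = symbols[name]  # replace the placeholder word
    return image, symbols

# Hypothetical modules: word 1 of main_obj is a placeholder for the address of "helper".
main_obj = {"code": [10, 0, 99], "defs": {"main": 0}, "refs": [(1, "helper")]}
lib_obj = {"code": [20, 21], "defs": {"helper": 0}, "refs": []}
image, symbols = link([main_obj, lib_obj])
```

Resolution is deferred until every module has been placed, so a module may refer to symbols defined in a later one.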

1970s: Symbolic Debugger
  • OS written in high-level language
      • Compilers provide sufficient code performance and low-level access
  • High-level languages provide large runtime libraries in multiple units
      • Static linker pulls only required units into a given program image
  • Compiler exports symbol table for use by debugger, not just internal to front-/back-end
  • Figure of merit: Code quality (speed)

Compiler, Linker, Debugger
Source Code -> Compiler [Front End -> Back End] -> Object Code -> Linker -> Running Program
Compiler -> Symbol table(s) -> Debugger -> Running Program

1980s: Dynamic Loading, Threading
  • To improve OS performance, by reducing physical memory pressure, read-only parts of libraries are shared between applications
      • Loaded on first reference
  • OS loader fixes up references to shared libraries, just like the static linkers
      • Not all libraries are loaded into the same virtual address
  • Concurrency issues addressed in programming languages
      • Locks, monitors, events, polling
      • Order of operations visible across thread boundaries
      • Memory model semantics become an issue
      • Ada™ introduces rendez-vous, other languages have other constructs
  • Tool chain
      • Compiler(s)
      • Linker
      • Loader
      • Symbolic debugger
  • Figure of merit: Code quality (speed, but this is related to space)

OS Dynamic Loader
Source Code -> Compiler [Front End -> Back End] -> Object Code (includes fixups for shared code) -> Static Linker -> Image File -> OS Loader -> Running Program
Symbol table(s) -> Debugger -> Running Program

1990s: JITs and Managed Runtimes
  • Garbage Collection goes mainstream
      • Previously: LISP, APL, Smalltalk
      • 1990s: Java, JScript, C#, VB
  • Verification requires runtime to analyze code
      • Verification is similar to front-end compiler work
      • Can be done to native code, but much simpler with an intermediate language
  • Just-in-time (JIT) compilation increases performance over pure interpretation
      • Typically by a factor of 5 to 15
  • Tool chain: split the compiler in two!
      • Linearize the AST to create Intermediate Language (IL)
      • Save symbol table as “metadata”
      • Reorder the chain
  • Figures of merit: Throughput first, code quality second

Managed Runtime
Source Code -> Compiler [Front End] -> Metadata + Intermediate Language -> Image File -> OS Loader -> Runtime [Dynamic Linker -> Back End] -> Running Program
Debugger -> Running Program

2000s: Reflection-based Computation
  • Reflection: ability of a program to observe and possibly modify its structure and behavior
      • Compilers “preserve meaning” but runtime reflection makes more information visible, so optimizations are more limited
      • Metadata (symbol table) or equivalent needed at runtime, not just compile/link time
  • Interactive Development Environments (IDEs)
      • Intellisense™
      • Refactoring
      • Interactive syntax analysis
  • Query Integration
      • Builds expression trees (ASTs) at compile time
      • Runtime operations to combine and manipulate them
  • Figures of merit:
      • “Compiler” and “JIT compiler”: throughput
      • “Pre-JIT” compiler: balance of throughput and code quality

Runtime Reflection
Source Code -> Development Environment [Front End] -> Metadata + Intermediate Language -> Image File -> OS Loader -> Dynamic Linker -> Back End -> Running Program
Metadata (symbol table) -> Debugger -> Running Program

1970: Numbles
  • “Number puzzles for Nimble minds”
  • Column in “Computers and Automation”
  • Numble verifier written by Stuart Nelson
  • Input language:
          SEND
        + MORE
        ======
         MONEY
  • Output: a program to try all possible values for letter assignments to digits
  • Handled +, -, *, and =
  • Hand coded in PDP-9 assembly language
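The shape of such a puzzle compiler can be sketched in a modern language. The version below is not a reconstruction of the original PDP-9 program; it is a hedged illustration in which the input is passed as four arguments rather than parsed from the column's layout, and the names `compile_puzzle` and `run_puzzle` are invented. The important point is that the output is a program (here, Python source text) that tries every assignment of distinct digits to letters.

```python
from itertools import permutations

def compile_puzzle(left, op, right, result):
    """Translate a puzzle such as SEND + MORE = MONEY into Python source for a solver."""
    words = [left, right, result]
    if op not in ("+", "-", "*") or not all(w.isalpha() for w in words):
        raise ValueError("expected alphabetic words and one of +, -, *")
    letters = sorted(set("".join(words)))
    leading = sorted({word[0] for word in words})

    def value(word):
        # Horner form, e.g. "TO" becomes "((T)*10+O)"
        expr = word[0]
        for letter in word[1:]:
            expr = f"({expr})*10+{letter}"
        return f"({expr})"

    mapping = ", ".join(f"{letter!r}: {letter}" for letter in letters)
    lines = [
        "def solve():",
        # trailing comma keeps the loop target a tuple even for a single letter
        f"    for {', '.join(letters)}, in permutations(range(10), {len(letters)}):",
        f"        if 0 in ({', '.join(leading)},):",
        "            continue  # a word may not start with zero",
        f"        if {value(left)} {op} {value(right)} == {value(result)}:",
        "            yield {" + mapping + "}",
    ]
    return "\n".join(lines)

def run_puzzle(left, op, right, result):
    """Compile the puzzle, execute the generated program, and collect all solutions."""
    namespace = {"permutations": permutations}
    exec(compile_puzzle(left, op, right, result), namespace)
    return list(namespace["solve"]())

# A small puzzle keeps the search short: TO + GO = OUT
solutions = run_puzzle("TO", "+", "GO", "OUT")
```

The same call with `"SEND", "+", "MORE", "MONEY"` works unchanged but searches about 1.8 million assignments, so it takes noticeably longer.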

Outline
  • What Is A Compiler?
  • A Brief History of Developer Tools
  • My First Compiler
  • Compilers, compilers, everywhere
      • Free-standing compilers
      • Under the hood
      • Inside applications
      • In the tool chain
      • Inside libraries

Special-Purpose Compilers
  • Compile-to-hardware
  • Aspect-Oriented Programming (AOP) weaver
      • Parser finds new syntax to mark insertion points
      • Back-end inserts code snippets for different aspects
      • More generally: “assembly rewriting”
  • Work-flow and object design languages
      • Input may be textual or graphic layouts
      • Output may be code or graphic designs

Mark-up Compilers
  • XML schema (or DTD)
      • Output: parser
      • Output: deserializer
  • Web-services Description (WSDL)
      • Output: proxy that parses input and dispatches
      • Output: code to convert data structure to XML (“serializer”)
  • XAML (Windows Presentation Foundation)
      • Output: parser
      • Output: executable code
  • XSL

Modern Hardware: CPU
  • Compile “machine code” to “micro code”
      • CPU architecture is the abstraction boundary
      • RISC vs CISC is an old debate
      • x86 and x64 are CISC on the outside, RISC on the inside
  • Part of the instruction cache
      • Engineering note: an icache miss now often means a pause to compile in addition to a memory fetch!
  • Allows innovation in actual hardware while still running existing code
      • Chips optimized for specific usage scenarios
      • Chips take advantage of materials science advances
      • Chips take advantage of new internal architectures (multicore)

Modern Hardware: Graphics
  • Graphics memory isn’t just for data
  • Very sophisticated compilation steps
  • Parallel execution with CPU
  • Adapts to changing hardware organization
      • Raster scan vs vector
      • Resolution, speed, synchronization
  • Adapts to predominant usage pattern
      • Animation
      • 3D
      • Shading

Databases
  • SQL is a full programming language
      • Compiled to intermediate form on client
      • Intermediate form is passed to server for execution
      • Server optimizes the intermediate form to produce an “execution plan”
  • Query optimization
      • Additional inputs include
          • Size of tables
          • Frequency of query types
          • Indexing information
      • Outputs include
          • Executable code
          • Temporary indexes
          • Background indexing requests
          • Updated frequency information

Hardware Emulators
  • Object code translation at runtime
      • HP 3000 to PA-RISC in 1983
      • VAX to Alpha in 1990s
      • 32-bit programs on 64-bit hardware
  • Alternate hardware emulation
      • Device emulators for everything from smart cards to cell phones to iPod to pocket PCs
  • JIT compilation trades start-up time for high performance execution
      • Often, but not always, a good trade-off

Code Analysis Tools
  • Analyzing API surface
      • Simple to do with front end ASTs
  • “Remodularizing” implementation
      • Requires static and dynamic dependency analysis, normal compiler back end work
      • Requires rebuilding the program, easily done using front end ASTs
  • Race detection
      • Instrument code at compile time
      • Gather data as it runs under high stress

“Tree Shakers”
  • Start with the AST and an appropriate dependency graph
  • Pull AST nodes found starting at a given graph node, recursively
  • Convert resulting set of AST nodes to appropriate output format
  • Example uses:
      • Subset library based on initial set of types
      • Statically link subset of library for a given application
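The "pull nodes reachable from a starting point" step is a graph reachability walk. The sketch below assumes the dependency graph is already available as a dict from each type name to the names it depends on; the type names and the function name `shake` are made up for the example, and an explicit work list replaces recursion to avoid deep call stacks.

```python
def shake(dependencies, roots):
    """Return every node reachable from the roots in the dependency graph."""
    keep, pending = set(), list(roots)
    while pending:
        node = pending.pop()
        if node in keep:
            continue  # already pulled in, possibly via another path
        keep.add(node)
        pending.extend(dependencies.get(node, ()))
    return keep

# Hypothetical library: each type lists the types it depends on.
library = {
    "List": ["Array", "Object"],
    "Array": ["Object"],
    "Regex": ["String", "Object"],
    "String": ["Object"],
    "Object": [],
}
subset = shake(library, ["List"])
```

Starting from `List`, the subset keeps `List`, `Array`, and `Object` and drops `Regex` and `String`, which is the "subset library based on initial set of types" use from the slide.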

A Modern Interactive Development Environment (IDE)
  • Code editor
      • Knows the programming language, provides syntax support and context-sensitive name lookup
  • Project system
      • Tracks the public shape of components
      • Tracks dependencies between components
  • Build system
      • Orders clean-up, compile, and link operations
  • Debugger
      • Allows inspection and modification of values at runtime
      • Allows control operations (e.g., breakpoint, continue, restart)
  • Dynamic Support
      • Allows program modification interwoven with execution (“edit and continue”)
      • Global interaction space (“read-eval-print loop”)

Compilers in the IDE (I)
  • In the code editor
      • Incrementally parses the code as it is being entered. Note: must deal with incorrect syntax and partial programs.
      • Suggests possible completions based on a symbol table. Note: symbol table must include external references maintained by the project system.
      • Refactoring operations require both syntactic and semantic analysis. Note: refactoring requires information maintained by the project system.
  • In the debugger
      • Expression evaluation

Compilers in the IDE (II)
  • Dynamic support
      • Edit-and-continue
          • Requires a full, incremental compiler
          • For efficiency, it also requires the ability to compress the output as a “diff” between the original and the new code
      • Interactive workspace
          • Like LISP, APL, Smalltalk, Python, etc.
          • Requires a compiler or an interpreter (really, a compiler front end to generate an AST combined with a tree walker to execute the tree).
          • The compiler must be capable of generating code that uses code and objects resident in the evaluation environment, which generally means a reliance on reflection.
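The "front end plus tree walker" arrangement for an interactive workspace can be sketched with Python's standard `ast` module acting as the front end. The supported subset (arithmetic on constants and names, plus simple assignment) and the function names `evaluate` and `read_eval` are choices made for this illustration; the `env` dict stands in for the objects resident in the evaluation environment.

```python
import ast
import operator

# Map AST operator node types to the functions that implement them.
BINARY_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
              ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate(node, env):
    """Tree walker: compute the value of an expression node."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]  # look the name up in the live environment
    if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        return BINARY_OPS[type(node.op)](evaluate(node.left, env),
                                         evaluate(node.right, env))
    raise ValueError(f"unsupported syntax: {type(node).__name__}")

def read_eval(line, env):
    """One read-eval step: parse the line to an AST, then walk the tree."""
    statement = ast.parse(line).body[0]
    if (isinstance(statement, ast.Assign) and len(statement.targets) == 1
            and isinstance(statement.targets[0], ast.Name)):
        env[statement.targets[0].id] = evaluate(statement.value, env)
        return None
    if isinstance(statement, ast.Expr):
        return evaluate(statement.value, env)
    raise ValueError(f"unsupported statement: {type(statement).__name__}")

env = {}
read_eval("x = 2 * 3", env)
value = read_eval("x + 1", env)
```

Each line is parsed independently and evaluated against the shared `env`, so definitions made on one line are visible to the next, as in a read-eval-print loop.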

Compilers in the Linker
  • The linker sees “the whole program”, so it’s better positioned to do global analysis
  • Solution: write a compiler
      • Input language is object file format (native code or IL)
      • Output language is OS image file format
  • Optimizations:
      • Aggressive in-lining across module boundaries
      • Code motion across module boundaries
      • Full type system analysis (treat leaf types as sealed)
  • Issues:
      • These flow graphs are *big*
      • The linker doesn’t see the whole program (dynamic linking)
      • Reflection and dynamic linking reduce permitted optimizations
          • Or require the ability to back out or recompute optimizations at runtime

Profile-Guided Optimization
  • Idea: Instrument the program, run it with typical loads, then re-optimize using this profiling data. (Similar to “Hotspot”)
  • Optimizations:
      • Optimize only “hot” code fragments
          • So you can spend more time on them
      • Method and basic block reordering to increase code density
      • Code reordering to optimize branch prediction and minimize “long” references
      • Cache locality optimizations for data and code

For the Developer
  • “Regular expression” parsing
      • Grammar is usually more powerful than regular expressions
  • Serialization and Deserialization
      • Reflects on data type to be marshalled
      • Generates specialized code to convert to stream format (serialization) or parse into in-memory format (deserialization)
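The serialization bullets describe a small compiler hidden inside a library: reflect on a type once, then generate code specialized to it. A minimal Python sketch of that idea follows; it assumes dataclasses as the reflected types and JSON as the stream format, and `make_serializer` and `Point` are hypothetical names rather than any real library's API.

```python
import json
from dataclasses import dataclass, fields

def make_serializer(cls):
    """Reflect on a dataclass once and generate a specialized to-JSON function for it."""
    names = [field.name for field in fields(cls)]
    body = ", ".join(f"{name!r}: obj.{name}" for name in names)
    # The generated source reads each field directly, with no per-call reflection.
    source = f"def serialize(obj):\n    return dumps({{{body}}})\n"
    namespace = {"dumps": json.dumps}
    exec(source, namespace)
    return namespace["serialize"]

@dataclass
class Point:
    x: int
    y: int

serialize_point = make_serializer(Point)
text = serialize_point(Point(1, 2))
```

The reflection cost is paid once in `make_serializer`; every later call to `serialize_point` runs the generated code only. A matching deserializer could be generated the same way.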

For the Compiler Writer
  • Parser-generators
      • lex
      • yacc
  • AST tool kits
      • Microsoft is investing in this area
      • Provides integration into many aspects of the IDE
  • Executable file format tool kits
      • Queensland University of Technology PERWAPI
  • Optimization tool kits
      • Microsoft’s Phoenix project

Questions?