Compilers in Real Life
Dave Mandelin
2 Dec 2004

Software Development Time
From The Mythical Man-Month by Fred Brooks
  [Chart: Design 1/3, Code 1/6, Test 1/2]
  • Can we do more error checking and less testing?
  • Better yet, can we avoid writing bugs in the first place?

Software Maintenance
  • Maintenance is
    ◦ Fixing bugs
    ◦ Enhancing functionality
    ◦ Improving performance
    ◦ Refactoring
  • 60/60 Rule
    ◦ 60% of project cost is maintenance
    ◦ 60% of maintenance is enhancements
  • 30% of maintenance cost is reading existing code
From Facts and Fallacies of Software Engineering by Robert Glass

Lessons from Real Life
  • Software needs to be
    ◦ Reliable
    ◦ Maintainable
    ◦ Understandable
  …especially if it’s any good.

Solutions for Real Life
  • How can we write reliable, maintainable, understandable software?
  • Design a new language!
    ◦ A language specially designed to handle your problem
    ◦ Program is short, focused on task
    ◦ “Junk” implementation details hidden
      ▪ And maintainable in one place
    ◦ Error checking
    ◦ Error avoidance

Celebrity Endorsements

Compilers are Software
  • Programming language tools need to be maintainable, understandable too
    ◦ Compilers, code analyzers, debuggers
  • We could design special languages to help implement our languages
    ◦ Too much for most projects
      ▪ Can be done, though (PA3, yacc)
      ▪ Focus on simplicity instead

Case Study 1: Search Results
  [Screenshots: Project Search, Department Search]

The Problem
  • Many search types
  • Want same look and feel for all
    ◦ Easy to learn, use, and understand
  • Need different result format
    ◦ Different titles, links

Solution 1: Spaghetti code

    if (type == PROJECT) {
        link1 = "project.asp?" + name;
        link2 = "grant.asp?" + id;
    } else {
        link1 = "dept.asp?" + id;
        link2 = null;
    }
    …
    System.out.println(link1);
    if (type == PROJECT) {
        System.out.println(link2);
    }

  • Maybe it works, maybe you get fired
  • Unmaintainable

Solution 2: Write it over and over
  • Write each search page as a separate class
    ◦ Maybe Alice does departments, Bob does projects, …
  • Hard to keep consistent look and feel

Solution 3: Recipes
  • Write each search page as a separate class
    ◦ Follow a fixed recipe each time
    ◦ Example: recursive descent parsing
      ▪ Follow a fixed recipe for each production
  • Good strategy
    ◦ But not the best!

Recipes
  • What’s good about recipes?
    ◦ Figure out how to do it only once
    ◦ Avoid bugs if the recipe is correct
  • What’s wrong with recipes?
    ◦ Type it in many times
    ◦ Can type in bugs each time
    ◦ Boring

A Better Way
  • Factor out the repetition
    ◦ Describe the differences with a notation
      ▪ PA3: grammar file
      ▪ Search: describe result format
    ◦ Implement the repeated parts with interpreters, compilers, and libraries
      ▪ PA3: parsing engine, table generator
      ▪ Search: interpreter

RFL: Result Format Language

    Column  title="Dept"  source_data="dept"         type=STRING  link="dept.asp?deptid={ID}"
    Column  title="Name"  source_data="description"  type=STRING

Report Format Language
  • A configuration language for reports
  • Syntactic sugar for the recipe code
  • Raises level of abstraction
    ◦ Java has abstraction features, too
      ▪ methods, classes
  • Sometimes Java is not good enough
    ◦ PA3: parsing table is unreadable
  • Need a new language

RFL Interpreter
  • Search results come from database
  • RFL program is an AST
    ◦ Created programmatically; no front end
  • Run RFL program on each result tuple
  [Diagram: the result tuples (340200, "Admin") and (340300, "Outreach"), together with the RFL program (the "Dept" and "Name" Column definitions), feed the RFL Interpreter, which emits HTML: <tr><td><a href=…]
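The interpreter idea on this slide can be sketched in a few lines of Java. This is a minimal illustration, not the original RFL code: the class and method names, the "id" field used to fill {ID}, and the exact HTML layout are all assumptions, and the sketch skips HTML escaping and the header row that would use each column's title.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch; class, field, and method names are illustrative assumptions.
final class RflInterpreterSketch {
    /** One column description; link is null when the column is not a hyperlink. */
    static final class Column {
        final String title;
        final String sourceData;
        final String link;

        Column(String title, String sourceData, String link) {
            this.title = title;
            this.sourceData = sourceData;
            this.link = link;
        }
    }

    /** Interprets the column list against one result tuple and returns one HTML table row. */
    static String renderRow(List<Column> columns, Map<String, String> row) {
        StringBuilder html = new StringBuilder("<tr>");
        for (Column col : columns) { // the per-column decisions are repeated for every row
            html.append("<td>");
            if (col.link != null) {
                // Assumption: {ID} is filled from the tuple's "id" field.
                String href = col.link.replace("{ID}", row.get("id"));
                html.append("<a href=\"").append(href).append("\">");
            }
            html.append(row.get(col.sourceData)); // no HTML escaping in this sketch
            if (col.link != null) {
                html.append("</a>");
            }
            html.append("</td>");
        }
        return html.append("</tr>").toString();
    }

    /** Runs the two-column program from the slide on the first example tuple. */
    static String demo() {
        List<Column> program = List.of(
            new Column("Dept", "dept", "dept.asp?deptid={ID}"),
            new Column("Name", "description", null));
        Map<String, String> row =
            Map.of("id", "340200", "dept", "340200", "description", "Admin");
        return renderRow(program, row);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The column list plays the role of the AST: adding a search page means building a new list, not writing a new rendering class.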

RFL Interpreter
  • Allowed rapid development of many search pages
  • One day, a user sends an email…
    ◦ Site is slow when displaying 5000 search results
      ▪ Don’t ask
    ◦ What can we do?

Running RFL
  • Interpreter

        for col in columns                  // Visit each column
            Object data = row.getData(col.name);
            String s = col.format(data);
            if (col.hasLink()) { col.writeLink(row); }
            print(s);
            if (col.hasLink()) { print("</a>"); }

  • Hand-written

        // First column
        data = row.getData("name");
        s = col.format(data);
        col.writeLink(row);
        print(s);
        print("</a>");
        // Second column
        data = row.getData("title");
        s = col.format(data);
        print(s);

RFL Compiler
  • a.k.a. code generator
  • Compile ASTs to HLL code (VBScript)
  • Performs easy optimizations
    ◦ Loop unrolling
    ◦ Constant propagation
      ▪ Easy because compiler knows which assignments it is generating
  • 10x speedup
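A rough sketch of what such a code generator might look like, under stated assumptions: the real compiler emitted VBScript, whereas this illustration emits Java-like statements as text, and the names Column, compile, col0, and col1 are hypothetical. The loop over columns and the link test run once at generation time, so the emitted code is straight-line with the column names inlined as constants.

```java
import java.util.List;

// Hypothetical sketch of a code generator; the emitted text mimics the
// "hand-written" form and is not the VBScript the original compiler produced.
final class RflCompilerSketch {
    static final class Column {
        final String sourceData;
        final boolean hasLink;

        Column(String sourceData, boolean hasLink) {
            this.sourceData = sourceData;
            this.hasLink = hasLink;
        }
    }

    /** Emits one straight-line block per column (loop unrolling at generation time). */
    static String compile(List<Column> columns) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < columns.size(); i++) { // this loop does not appear in the output
            Column col = columns.get(i);
            String var = "col" + i;
            // Constant propagation: the column name is known, so emit it as a literal.
            out.append("data = row.getData(\"").append(col.sourceData).append("\");\n");
            out.append("s = ").append(var).append(".format(data);\n");
            if (col.hasLink) { // decided now, so no runtime test is emitted
                out.append(var).append(".writeLink(row);\n");
            }
            out.append("print(s);\n");
            if (col.hasLink) {
                out.append("print(\"</a>\");\n");
            }
        }
        return out.toString();
    }

    /** Compiles a linked "name" column followed by a plain "title" column. */
    static String demo() {
        return compile(List.of(new Column("name", true), new Column("title", false)));
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Running the demo prints eight statements with no loop and no conditional, which is the shape of the hand-written version.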

Expressiveness
  [Spectrum, in order of increasing expressiveness and maintenance effort]
  • Configuration languages: RFL
  • Little languages: make, PA2 lexer spec
  • Domain-specific languages (DSLs): VHDL, PostScript, UnrealScript
  • General-purpose languages (GPLs): Java, Perl, Decaf

Implementation Performance
  [Spectrum, in order of increasing execution speed and development effort]
  • Interpreter: RFL Interpreter
  • Basic Compiler: PA4
  • Optimizing Compiler: javac, RFL Compiler
  • Fancy Optimizing Compiler: PA6, gcc

Usability
  [Spectrum by intended audience, in order of increasing usability and language design effort]
  • Author: one-off code generators
  • Hackers: RFL, X configuration scripts
  • Programmers: Java, Perl, Decaf
  • Users: UnrealScript

Evolution of RFL
  [Diagram: RFL plotted on two axes, language kind (Config Language, Little Language, DSL, GPL) and implementation (Interpreter, Compiler, Fancy Compiler); labeled points include "v0.001" and "Interpreter"]

Case Study 2: Little Reports

    Revenue             60,000
    Deferred Revenue    14,000
    Expenses            70,000
    Profit               4,000

CBL: Cash Balance Language
  • ‘Profit’=GROUP((REV + DEF) + EXP)
    ◦ ‘title’=GROUP(…) is CBL syntax
    ◦ REV
      ▪ Like a primitive zero-argument function
      ▪ Evaluated using a database query
  • What happens if we need values from a web site?
    ◦ Need extensibility
    ◦ CBL has an interface for implementing new primitives by writing a simple class

Error Checking in CBL
  • Debits and credits are confusing
    ◦ Which is right, REV - DEF or REV + DEF?
    ◦ “That’s like asking the square root of a million. No one will ever know.” (Nelson Muntz)
  • A type system
    ◦ Two types: UP and DOWN
    ◦ Same types must add, different types must subtract
    ◦ Can check this statically
  • Is there a better way?

Error Avoidance in CBL
  • Just type REV ± DEF
  • CBL figures out the right operation
  • Program is underconstrained
  • Language implementation uses inference to select operations
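The inference rule can be sketched directly from the UP/DOWN type system: when the two operand types match, "±" resolves to addition, otherwise to subtraction. The Java below is a minimal illustration, not CBL's implementation. Which accounts are UP or DOWN, and the choice that a result keeps the left operand's type, are assumptions made for the example; the amounts are those of the Little Reports slide.

```java
// Hypothetical sketch of operation inference; names and typing choices are assumptions.
final class CblInferenceSketch {
    enum Direction { UP, DOWN }

    /** A typed amount, e.g. the value of REV with its UP/DOWN type. */
    static final class Amount {
        final Direction dir;
        final long value;

        Amount(Direction dir, long value) {
            this.dir = dir;
            this.value = value;
        }
    }

    /** Resolves "a ± b": same types add, different types subtract. */
    static Amount combine(Amount a, Amount b) {
        long value = (a.dir == b.dir) ? a.value + b.value : a.value - b.value;
        // Assumption: the result keeps the left operand's type.
        return new Amount(a.dir, value);
    }

    /** Evaluates (REV ± DEF) ± EXP with the figures from the Little Reports slide. */
    static long profit() {
        Amount rev = new Amount(Direction.UP, 60_000);   // assumed UP
        Amount def = new Amount(Direction.UP, 14_000);   // assumed UP
        Amount exp = new Amount(Direction.DOWN, 70_000); // assumed DOWN
        return combine(combine(rev, def), exp).value;
    }

    public static void main(String[] args) {
        System.out.println(profit());
    }
}
```

Under these assumptions the author of a report never chooses between + and -, so the debit/credit mistake cannot be typed in.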

CBL Implementation
  • Like PA1-PA3, but simpler
    ◦ Hand-written DFA lexer
      ▪ I didn’t have a lexer generator
    ◦ Hand-written recursive descent parser
      ▪ Works well for little languages
    ◦ Interpreter
      ▪ Operation inference
      ▪ Expression evaluator
      ▪ Extension interface
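To give a feel for how small such an implementation can be, here is a sketch of a hand-written recursive descent evaluator with a pluggable primitive interface. It is not CBL itself: the grammar (names, parentheses, explicit + and -), the Primitive interface, and the constant-valued primitives are assumptions chosen for illustration. A real primitive would run a database query instead of returning a constant.

```java
import java.util.Map;

// Hypothetical sketch: a tiny expression language, not the actual CBL grammar.
final class CblParserSketch {
    /** Extension point: a primitive is a zero-argument function. */
    interface Primitive {
        long evaluate();
    }

    private final String src;
    private final Map<String, Primitive> primitives;
    private int pos = 0;

    private CblParserSketch(String src, Map<String, Primitive> primitives) {
        this.src = src;
        this.primitives = primitives;
    }

    private void skipSpaces() {
        while (pos < src.length() && src.charAt(pos) == ' ') pos++;
    }

    // expr := term (('+' | '-') term)*
    private long expr() {
        long value = term();
        skipSpaces();
        while (pos < src.length() && (src.charAt(pos) == '+' || src.charAt(pos) == '-')) {
            char op = src.charAt(pos++);
            long rhs = term();
            value = (op == '+') ? value + rhs : value - rhs;
            skipSpaces();
        }
        return value;
    }

    // term := NAME | '(' expr ')'
    private long term() {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == '(') {
            pos++; // consume '('
            long value = expr();
            skipSpaces();
            if (pos >= src.length() || src.charAt(pos) != ')') {
                throw new IllegalArgumentException("expected ')' at position " + pos);
            }
            pos++; // consume ')'
            return value;
        }
        int start = pos;
        while (pos < src.length() && Character.isLetter(src.charAt(pos))) pos++;
        Primitive p = primitives.get(src.substring(start, pos));
        if (p == null) {
            throw new IllegalArgumentException("unknown primitive at position " + start);
        }
        return p.evaluate();
    }

    /** Parses and evaluates the whole input, rejecting trailing characters. */
    static long evaluate(String src, Map<String, Primitive> primitives) {
        CblParserSketch parser = new CblParserSketch(src, primitives);
        long value = parser.expr();
        parser.skipSpaces();
        if (parser.pos != src.length()) {
            throw new IllegalArgumentException("unexpected input at position " + parser.pos);
        }
        return value;
    }

    static long demo() {
        Map<String, Primitive> prims = Map.of(
            "REV", () -> 60_000L,
            "DEF", () -> 14_000L,
            "EXP", () -> 70_000L);
        return evaluate("(REV + DEF) - EXP", prims);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Each grammar rule becomes one method, which is the "fixed recipe for each production" idea, and new data sources are added by registering another Primitive rather than changing the parser.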

CBL In Practice
  • I developed it in a few days
    ◦ It was easy after PA1-PA3
  • Gave the code to another CS 164 graduate, who
    ◦ Added some new features
    ◦ Started writing programs
  • Users ask for a new report
    ◦ It’s done in 60 seconds

CBL Evaluation
  • Much better than RFL
    ◦ Text-based language
    ◦ Error avoidance
    ◦ Maintainable implementation

Break
  • After the break…
    ◦ DSLs for game programming

Case Study 3: UnrealScript

The Unreal Engine
  • The Unreal engine is the game engine which powered Unreal, and many more since.
    ◦ Unreal, Unreal 2, UT2003, UT2004, Deep Space 9: The Fallen, Deus Ex: Invisible War, Postal 2, Duke Nukem Forever, …
  • Since it was as customizable as Quake and featured its own scripting language UnrealScript, it soon had a large community on the internet which added new modifications to change or enhance game play.
From http://en2.wikipedia.org/wiki/Unreal

Customizing Games
  • Unreal and similar games
    ◦ Multiplayer simulations on the Internet
      ▪ Unreal Tournament 2004, EverQuest 2, The Sims Online
    ◦ Customers expect to be able to download new characters, levels, game types, and to make their own
    ◦ Are customers going to write 10k lines of C to add a surprise birthday party to The Sims?
    ◦ In-house game designers don’t necessarily want to use C either

Customizing Games
  • Game-specific programming concepts
    ◦ Independent actors
      ▪ E.g., person, car, elevator
      ▪ Sounds like a Java class
      ▪ Or is it a thread? And can we have 10k threads?
    ◦ Have behavior
      ▪ Java methods, sounds OK
    ◦ Behavior depends on current state
      ▪ Class or methods change over time? Can’t do that!
    ◦ Events, duration, networking

UnrealScript
  • Design Goals (from http://unreal.epicgames.com/UnrealScript.htm)
    ◦ Directly support game concepts
      ▪ Actors, events, duration, networking
    ◦ High level of abstraction
      ▪ Objects and interactions, not bits and pixels
    ◦ Programming simplicity
      ▪ OO, error checking, GC, sandboxing
  • Several architectures were explored and discarded
    ◦ Java: too slow (Java 1.1)
    ◦ VB-based language: C programmers didn’t like it

UnrealScript
  • Looks like Java
    ◦ Java-like syntax
    ◦ Classes, methods, inheritance
  • Game-specific features
    ◦ States, networking
  • Runs in a framework
    ◦ Game engine sends events to objects
    ◦ Objects call game engine for services

Actor States

  Java style, with an explicit state variable:

    void spokenTo(Speaker s) {
        if (state == ANGRY) {
            shootAt(s);
        } else {
            sayHi(s);
        }
    }

    void bumpsInto(Object obj) {
        backUp();
        say("Raaaaaaargh!!!");
        state = ANGRY;
    }

    // And what about inheritance?

  UnrealScript style, with a state block:

    state angry {
    begin:
        say("Raaaaaaargh!!!");

        void spokenTo(Speaker s) {
            shootAt(s);
        }
    }

    void bumpsInto(Object obj) {
        backUp();
        GotoState('angry');
    }

    void spokenTo(Speaker s) {
        sayHi(s);
    }
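For comparison, one way to approximate state blocks in plain Java is to give each state its own object that overrides only the handlers it changes. The sketch below is an illustration of that idea, not how UnrealScript is implemented; the State interface, the onEnter hook standing in for the "begin:" label, and the string log are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: emulating state blocks with state objects in Java.
final class ActorStateSketch {
    /** Handlers a state may override; the defaults are the actor's normal behavior. */
    interface State {
        default void onEnter(Actor self) {}

        default void spokenTo(Actor self, String speaker) {
            self.log.add("hi " + speaker);
        }
    }

    static final State NORMAL = new State() {};

    static final State ANGRY = new State() {
        @Override
        public void onEnter(Actor self) { // plays the role of the "begin:" code
            self.log.add("Raaaaaaargh!!!");
        }

        @Override
        public void spokenTo(Actor self, String speaker) {
            self.log.add("shoot at " + speaker);
        }
    };

    static final class Actor {
        final List<String> log = new ArrayList<>();
        private State state = NORMAL;

        void gotoState(State next) {
            state = next;
            next.onEnter(this);
        }

        void spokenTo(String speaker) {
            state.spokenTo(this, speaker); // dispatch depends on the current state
        }

        void bumpsInto(Object obj) {
            log.add("back up");
            gotoState(ANGRY);
        }
    }

    static List<String> demo() {
        Actor guard = new Actor();
        guard.spokenTo("Alice");
        guard.bumpsInto(new Object());
        guard.spokenTo("Bob");
        return guard.log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

This works, but every actor class needs the same dispatch plumbing, which is the kind of repeated recipe a language-level state construct removes.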

Networking
  • Unreal network architecture
    ◦ Server maintains simulation objects
    ◦ Client also maintains simulation objects
    ◦ Server replicates simulation objects to client
      ▪ Sends copies of as many objects as bandwidth allows
    ◦ Client also predicts object changes
      ▪ Hides latency
  • Language Support
    ◦ simulated keyword
    ◦ Indicates a function that can run in client prediction

Errors in UnrealScript
  • Static checking
    ◦ UnrealScript supports traditional static checking
      ▪ Just like PA4, PA5, Java
    ◦ Name checking
    ◦ Type checking
  • Dynamic techniques

Dynamic Error Handling
  • Null pointer dereference
    ◦ Not a problem!
      ▪ Or not a bad problem, anyway
    ◦ Raise an exception, return to framework
    ◦ One event fails, the system survives
  • Infinite loops and infinite recursion
    ◦ Hard for game engine to recover from
    ◦ singular function declaration
      ▪ Means “don’t recur into me”
      ▪ Declare bugs out of existence
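Both ideas can be imitated in ordinary Java, which may help make them concrete. The sketch below is an assumption-laden illustration rather than the engine's code: dispatch stands in for the framework's event loop, and a boolean flag emulates what a singular declaration provides in the language.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of per-event failure isolation and a re-entrancy guard.
final class EventDispatchSketch {
    final List<String> log = new ArrayList<>();
    private boolean inBump = false; // emulates a "singular" function's guard

    /** Framework side: run one handler; a failure is logged and the engine carries on. */
    void dispatch(String name, Runnable handler) {
        try {
            handler.run();
            log.add(name + ": ok");
        } catch (RuntimeException e) {
            log.add(name + ": failed (" + e.getClass().getSimpleName() + ")");
        }
    }

    /** A handler that would recurse forever without the guard. */
    void bump() {
        if (inBump) {
            return; // "don't recur into me": the nested call is skipped
        }
        inBump = true;
        try {
            bump(); // re-entrant call, dropped by the guard above
        } finally {
            inBump = false;
        }
    }

    static List<String> demo() {
        EventDispatchSketch engine = new EventDispatchSketch();
        String missing = null;
        engine.dispatch("speak", () -> missing.length()); // null dereference in a handler
        engine.dispatch("bump", engine::bump);
        return engine.log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The first event fails and the second still runs, so one buggy script does not take down the simulation.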

Performance
  • Implementation
    ◦ Compiles to VM bytecode (like Java)
  • Performance
    ◦ 20x slower than C
      ▪ Ugh!
      ▪ Even Java is only 2-4x slower.
    ◦ But wait…
    ◦ Even with 100s of objects, the CPU spends only 5% of its time running UnrealScript
    ◦ Engine does most of the work
    ◦ Doesn’t need to be fast
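A back-of-the-envelope reading of these numbers shows why the slowdown hardly matters. Assume, as a simplification, that 5% of total frame time T is spent in script code and that rewriting that code in C would make it exactly 20 times faster while leaving the engine's 95% unchanged:

```latex
T_{\mathrm{C}} = 0.95\,T + \frac{0.05\,T}{20} = 0.9525\,T,
\qquad
\text{speedup} = \frac{T}{T_{\mathrm{C}}} = \frac{1}{0.9525} \approx 1.05
```

Under those assumptions, moving every script to C would buy only about a 5% overall improvement.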

Implementation Quality
  [Spectrum, in order of increasing execution speed and development effort]
  • Interpreter: PA1, CBL, RFL Interpreter
  • Bytecode Interpreter: UnrealScript, Java 1.0
  • Basic Compiler: PA5
  • Optimizing Compiler: RFL Compiler
  • Fancy Optimizing Compiler: Java 1.5 HotSpot VM, gcc, PA6

The Unreal Engine
  • Why was it so successful?
    ◦ Many reasons
  • From a language point of view
    ◦ Domain-specific concepts
      ▪ Easy to use
    ◦ Based on existing languages
      ▪ Easy to learn
    ◦ Runs slow
      ▪ Easy to implement

Language Flexibility
  [Spectrum, in order of increasing flexibility and maintenance effort]
  • Configuration languages: Report Format Language
  • Little languages: CBL (easy development and maintenance)
  • DSLs: UnrealScript (high abstraction, easy to use)
  • GPLs: Perl (high abstraction, years of development)

Implementation Quality
  [Spectrum, in order of increasing execution speed and development effort]
  • Interpreter: CBL (quick development)
  • Bytecode Interpreter: UnrealScript (slow language, fast library)
  • Basic Compiler: PA5
  • Optimizing Compiler: Report Format Language (a few simple optimizations)
  • Fancy Optimizing Compiler: Java 1.4 HotSpot VM, gcc

Usability
  [Spectrum by intended audience, in order of increasing usability and language design effort]
  • Author: sometimes appropriate (PA5 tools)
  • Hackers: X configuration scripts
  • Programmers: CBL
  • Users: UnrealScript (high level, based on popular languages)

Creating Your Own Language
  • CS 164
    ◦ Report Format Language == PA1
    ◦ CBL == PA1-PA3
    ◦ UnrealScript == PA1-PA5
    ◦ You have more than enough skills!
  • Hard part is language design
    ◦ Requires experience
    ◦ So create some languages!

Getting Started
  • Language Design
    ◦ Factor out differences from stereotypical code
    ◦ Base on existing languages
    ◦ Extensibility is good
  • Implementation
    ◦ Interpreter
    ◦ Compiler
      ▪ Compile to HLL: C, Java bytecodes, CLI
  • Libraries
    ◦ Easy to make fast
    ◦ Good libraries make a language popular
      ▪ Java, Perl, Python