Code Generation
Compiler
Baojian Hua
bjhua@ustc.edu.cn

Front End
source code -> lexical analyzer -> tokens -> parser -> abstract syntax tree -> semantic analyzer -> IR

Back End
IR -> instruction selector -> Assem -> register allocator -> TempMap -> instruction scheduler -> Assem

Code Generation
- Generating code for some ISA
  - this course uses x86
- Many components
  - instruction selection, register allocation, scheduling, …
- Many different strategies
  - for now, we concentrate on a simple one: the stack machine
  - later in this course, we will turn to more advanced (and sophisticated) ones

What's a stack machine?
- A stack machine has only an operand stack and no (or few) registers
  - all computation is performed on the operand stack
  - the architecture is very simple and uniform
- Long history:
  - dates back at least to the 1970s
  - renewed industry interest in recent years
    - Sun's JVM and Microsoft's CLR, etc.

Stack Machine ISA: s86
    prog  -> instr prog
          ->                // empty
    instr -> push v
          -> pop id
          -> add
          -> sub
          -> times
          -> divide
    v     -> num
          -> id

    // Sample Program
    push 8
    push 2
    push x
    times
    sub

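To make the ISA concrete, here is a minimal sketch of an s86 interpreter in Python. The tuple encoding of instructions, the name `run_s86`, and the choice of floor division for `divide` are assumptions made for illustration; the slides do not fix any of them.

```python
def run_s86(prog, env):
    """Execute s86 instructions; env maps variable names to integer values."""
    stack = []
    for instr in prog:
        op = instr[0]
        if op == "push":
            v = instr[1]
            # An operand is either a number literal or a variable name.
            stack.append(v if isinstance(v, int) else env[v])
        elif op == "pop":
            env[instr[1]] = stack.pop()
        elif op in ("add", "sub", "times", "divide"):
            right = stack.pop()  # pushed last
            left = stack.pop()   # pushed first
            if op == "add":
                stack.append(left + right)
            elif op == "sub":
                stack.append(left - right)
            elif op == "times":
                stack.append(left * right)
            else:
                # Assumption: integer division rounding toward negative infinity.
                stack.append(left // right)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


# The sample program from the slide, run with x = 3.
sample = [("push", 8), ("push", 2), ("push", "x"), ("times",), ("sub",)]
print(run_s86(sample, {"x": 3}))  # [2]
```

With x = 3 the stack evolves as [8], [8, 2], [8, 2, 3], [8, 6], [2], so the result left on the operand stack is 8 - 2*3 = 2.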
The simple expression lang'
    // recall our simple
    // expression language
    exp -> num
        -> id
        -> exp + exp
        -> exp - exp
        -> exp * exp
        -> exp / exp
        -> (exp)

    // or in ML
    datatype exp
      = Num of int
      | Id of string
      | Add of exp * exp
      | Sub of exp * exp
      | Times of exp * exp
      | Divide of exp * exp

    // Sample Program
    8 - 2*x

Code gen' from exp to s86
    C (num)     = push num
    C (id)      = push id
    C (e1 + e2) = C (e1); C (e2); add
    C (e1 - e2) = C (e1); C (e2); sub
    C (e1 * e2) = C (e1); C (e2); times
    C (e1 / e2) = C (e1); C (e2); divide

Code gen' from exp to s86
    // or in ML
    fun C (e) =
      case e of
        Num i => push i
      | Id s => push s
      | Add (e1, e2) => (C (e1); C (e2); add)
      | … => (* similar *)

Example
    C (8 - 2*x) = C (8); C (2*x); sub
                = push 8; C (2); C (x); times; sub
                = …

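The recursive rules for C can be sketched as a small Python function. The class names `Num`, `Id`, and `BinOp` and the tuple encoding of s86 instructions are hypothetical choices for this illustration, not something the slides prescribe.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class Id:
    name: str


@dataclass
class BinOp:
    op: str        # one of "add", "sub", "times", "divide"
    left: object
    right: object


def compile_exp(e):
    """C: translate an expression tree into a list of s86 instructions."""
    if isinstance(e, Num):
        return [("push", e.value)]
    if isinstance(e, Id):
        return [("push", e.name)]
    if isinstance(e, BinOp):
        # Left operand first, then right operand, then the operator.
        return compile_exp(e.left) + compile_exp(e.right) + [(e.op,)]
    raise TypeError(f"not an expression: {e!r}")


# 8 - 2*x
exp = BinOp("sub", Num(8), BinOp("times", Num(2), Id("x")))
code = compile_exp(exp)
for instr in code:
    print(*instr)
```

The function is a direct transcription of the equations: one case per production, and one recursive call per sub-expression.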
Moral
- Code generation for a stack machine is dirt simple
  - a recursive equation from the point of view of math
  - a recursive function from the point of view of CS
  - think before you hack!
- But we have more to say about:
  - variable storage
  - more language features
    - statements, declarations, functions, etc.

Address space
- The address space is the way programs use memory
  - highly architecture- and OS-dependent
  - the figure shows the typical layout on 32-bit x86/Linux
[Figure: address-space layout; labels: 0xffff, 0xc0000, stack, heap, data, text, BIOS, VGA, 0x08048000, 0x00100000, 0x0000]

Static Storage
- Static storage is an area in the data section
  - a typical use is to hold C/C++ file-scope variables (static) and extern variables (global)
- The exp lang' has only static variables, so all of them can be stored in the static section
  - this requires a pass to collect all variables

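The variable-collecting pass mentioned above could look like the following Python sketch. The tuple encoding of expressions (`("num", n)`, `("id", name)`, `(op, left, right)`) and the name `collect_vars` are assumptions for illustration only.

```python
def collect_vars(e):
    """Return the variable names used in e, in order of first occurrence."""
    found = []

    def walk(node):
        tag = node[0]
        if tag == "id":
            if node[1] not in found:
                found.append(node[1])
        elif tag != "num":
            # Binary operator: visit both operands.
            walk(node[1])
            walk(node[2])

    walk(e)
    return found


# 8 - 2*x
exp = ("sub", ("num", 8), ("times", ("num", 2), ("id", "x")))
print(collect_vars(exp))  # ['x']
```

Each collected name then needs one slot in the static section.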
Declarations
    // scale exp a bit
    prog -> decs exp
    decs -> int id; decs
         ->
    exp  -> …

    // or in ML
    datatype decs
      = T of {var: string, ty: tipe} list

    // Sample Program
    int x;
    8 - 2*x;

Code gen' rules
    D (int id; decs) = id: .int 0
                       D (decs)
    D ( )            =

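A minimal Python sketch of the D rule follows. It takes the declared names as a list and emits one zero-initialized `.int` per variable; the leading `.data` directive and the function name `compile_decs` are assumptions added here so that the output lands in the data section.

```python
def compile_decs(names):
    """D: emit a label and a zero-initialized .int for each declared variable."""
    lines = ["    .data"]  # assumption: variables live in the data section
    for name in names:
        lines.append(f"{name}:")
        lines.append("    .int 0")
    return lines


# int x;
print("\n".join(compile_decs(["x"])))
```

An empty declaration list produces only the section directive, matching the empty right-hand side of D ( ).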
Statement
    // scale the exp lang' a bit by adding the following:
    s -> id = e;
      -> if (e) s else s

    // compile:
    CS (id = e;) = C (e); pop id

Statement, cont'
    // s86 should also be modified (jz, jmp, and labels are new)!
    // compile:
    CS (if (e) s1 else s2) =
        C (e)
        jz .Lfalse
    .Ltrue:
        CS (s1)
        jmp .Lend
    .Lfalse:
        CS (s2)
    .Lend:
[Figure: control-flow diagram of e, s1, s2]

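The two CS rules can be sketched in Python as below. Several details are assumptions not stated on the slides: the tuple encoding of statements and expressions, the numbered labels (needed so that nested `if` statements do not reuse `.Lfalse` and `.Lend`), and the reading of `jz` as "pop the top of the operand stack and jump if it is zero".

```python
import itertools


def compile_exp(e):
    """C: expressions are ("num", n), ("id", name), or (op, left, right)."""
    tag = e[0]
    if tag in ("num", "id"):
        return [f"push {e[1]}"]
    return compile_exp(e[1]) + compile_exp(e[2]) + [tag]


def compile_stm(s, labels):
    """CS: statements are ("assign", id, e) or ("if", e, s1, s2)."""
    tag = s[0]
    if tag == "assign":
        _, name, e = s
        return compile_exp(e) + [f"pop {name}"]
    if tag == "if":
        _, e, s1, s2 = s
        n = next(labels)  # fresh number so labels are unique per if
        return (
            compile_exp(e)
            + [f"jz .Lfalse{n}", f".Ltrue{n}:"]
            + compile_stm(s1, labels)
            + [f"jmp .Lend{n}", f".Lfalse{n}:"]
            + compile_stm(s2, labels)
            + [f".Lend{n}:"]
        )
    raise ValueError(f"unknown statement: {tag}")


# if (x) y = 1; else y = 2;
stm = ("if", ("id", "x"), ("assign", "y", ("num", 1)), ("assign", "y", ("num", 2)))
code = compile_stm(stm, itertools.count())
print("\n".join(code))
```

Passing the label counter explicitly keeps the function free of global state, so each compilation starts numbering from zero.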
Moral
- It's also straightforward to translate other control structures in this style
  - while, for, switch, etc.
- This kind of code generation is called recursive descent
  - may be done at parsing time
  - adopted in many compilers
    - read the offered article on Borland Turbo Pascal 3.0
    - you may safely ignore the Pascal-specific features

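As an illustration of translating another control structure in the same style, here is a hedged Python sketch for a `while` loop. The `while` statement is not part of the grammar on these slides; the label names, the helper name `compile_while`, and the assumption that `jz` pops the condition and jumps when it is zero are all hypothetical.

```python
def compile_while(cond_code, body_code, n):
    """Sketch of CS (while (e) s), given already-compiled code for e and s.

    n is a number used to make the two labels unique.
    """
    loop, end = f".Lloop{n}", f".Lend{n}"
    return (
        [f"{loop}:"]
        + cond_code          # C (e)
        + [f"jz {end}"]      # leave the loop when the condition is zero
        + body_code          # CS (s)
        + [f"jmp {loop}", f"{end}:"]
    )


# while (x) x = x - 1;
code = compile_while(["push x"], ["push x", "push 1", "sub", "pop x"], 0)
print("\n".join(code))
```

The shape mirrors the if rule: compile the condition, test it with `jz`, and use `jmp` to close the loop.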
From s86 to x86
- How to run the generated s86 code?
  - design a virtual machine
    - as we did in lab #1
    - this is also the way of the JVM or CLR
  - translate to native code and then execute it
    - so-called just-in-time (JIT)
    - the dominant OO method today…
- Next, we discuss the 2nd method
  - by mapping s86 to x86

Operand Stack
    // Problem: x86 does not have a dedicated operand stack.
    // Solution 1: use the control stack: ebp, esp
    //   (left to you)
    // Solution 2: make a fake operand stack, as in:
        .set PAGE, 4096
        .data
    opStack:
        .space PAGE, 0xcc
    top:
        .int opStack+PAGE
    // "top" points to the stack top, and the stack grows
    // down toward lower addresses

Instructions
    // map fake s86 instructions to x86's:
    .macro s86push x
        sub dword ptr [top], 4    // make room: the stack grows down
        mov ebx, [top]            // ebx = address of the new top slot
        mov eax, \x               // load the operand (macro argument)
        mov [ebx], eax            // store it on the operand stack
    .endm
    // others are similar.
    // Care must be taken to account for the
    // machine constraints. For instance, a mem-to-mem
    // move is illegal on x86.

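The remaining instructions can be expanded in the same way. The Python sketch below generates Intel-syntax x86 text for each s86 instruction using the fake operand stack (`top`) from the previous slide. The exact instruction sequences for `pop` and the arithmetic operations are assumptions in the spirit of "others are similar", not the author's macros; they route every value through `eax` so that no memory-to-memory move is needed.

```python
def s86_to_x86(instr):
    """Expand one s86 instruction into a list of x86 lines (Intel syntax)."""
    op = instr[0]
    if op == "push":
        v = instr[1]
        # A number is an immediate; a variable is loaded from memory.
        src = str(v) if isinstance(v, int) else f"dword ptr [{v}]"
        return [
            "sub dword ptr [top], 4",
            "mov ebx, [top]",
            f"mov eax, {src}",
            "mov [ebx], eax",
        ]
    if op == "pop":
        return [
            "mov ebx, [top]",
            "mov eax, [ebx]",
            f"mov dword ptr [{instr[1]}], eax",
            "add dword ptr [top], 4",
        ]
    arith = {"add": "add eax, [ebx]", "sub": "sub eax, [ebx]", "times": "imul eax, [ebx]"}
    if op in arith:
        return [
            "mov ebx, [top]",          # [ebx] = right operand
            "mov eax, [ebx+4]",        # [ebx+4] = left operand
            arith[op],
            "mov [ebx+4], eax",        # result replaces the left operand
            "add dword ptr [top], 4",  # drop the right operand
        ]
    if op == "divide":
        return [
            "mov ebx, [top]",
            "mov eax, [ebx+4]",
            "cdq",                     # sign-extend eax into edx:eax
            "idiv dword ptr [ebx]",
            "mov [ebx+4], eax",
            "add dword ptr [top], 4",
        ]
    raise ValueError(f"unknown instruction: {op}")


for line in s86_to_x86(("push", 8)) + s86_to_x86(("push", "x")) + s86_to_x86(("times",)):
    print("    " + line)
```

Because the stack grows down, the operand pushed last sits at `[top]` and the one pushed before it at `[top]+4`, which is why the binary operations read the left operand from `[ebx+4]`.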