
1 Code Generation Part I
Chapter 8 (1st ed. Ch. 9)
COP 5621 Compiler Construction
Copyright Robert van Engelen, Florida State University, 2007-2017

2 Position of a Code Generator in the Compiler Model
[Diagram] Source program → Front-End → intermediate code → Code Optimizer → intermediate code → Code Generator → target program. The front end reports lexical, syntax, and semantic errors. The symbol table is connected to the compiler phases.

3 Code Generation
• Code produced by the compiler must be correct
  – The source-to-target program transformation should be semantics preserving
• Code produced by the compiler should be of high quality
  – Effective use of target machine resources
  – Heuristic techniques should be used to generate good but suboptimal code, because generating optimal code is undecidable

4 Target Program Code
• The back-end code generator of a compiler may generate different forms of code, depending on the requirements:
  – Absolute machine code (executable code)
  – Relocatable machine code (object files for the linker)
  – Assembly language (facilitates debugging)
  – Byte code forms for interpreters (e.g., JVM)

5 The Target Machine
• Implementing code generation requires a thorough understanding of the target machine architecture and its instruction set
• Our (hypothetical) machine:
  – Byte-addressable (word = 4 bytes)
  – Has n general-purpose registers R0, R1, …, Rn-1
  – Two-address instructions of the form op source, destination

6 The Target Machine: Op-codes and Address Modes
• Op-codes (op), for example
    MOV (move content of source to destination)
    ADD (add content of source to destination)
    SUB (subtract content of source from destination)
• Address modes

    Mode               Form    Memory Address              Added Cost
    Absolute           M       M                           1
    Register           R       N/A                         0
    Indexed            c(R)    c+contents(R)               1
    Indirect register  *R      contents(R)                 0
    Indirect indexed   *c(R)   contents(c+contents(R))     1
    Literal            #c      N/A                         1

7 Instruction Costs
• The machine is a simple, non-superscalar processor with fixed instruction costs
• Realistic machines have deep pipelines, I-cache, D-cache, etc.
• Define the cost of an instruction = 1 + cost(source-mode) + cost(destination-mode)

8 Examples

    Instruction          Operation                                                  Cost
    MOV R0,R1            Store content(R0) into register R1                         1
    MOV R0,M             Store content(R0) into memory location M                   2
    MOV M,R0             Store content(M) into register R0                          2
    MOV 4(R0),M          Store contents(4+contents(R0)) into M                      3
    MOV *4(R0),M         Store contents(contents(4+contents(R0))) into M            3
    MOV #1,R0            Store 1 into R0                                            2
    ADD 4(R0),*12(R1)    Add contents(4+contents(R0)) to the value at location
                         contents(12+contents(R1))                                  3
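The cost rule from slide 7 can be checked mechanically against the examples above. The following is a minimal sketch, assuming operands are written without internal spaces (R0, not R 0) and that registers are always named R followed by digits; the helper names mode_cost and instruction_cost are hypothetical and not part of the slides.

```python
import re

def mode_cost(operand):
    """Added cost of one operand's address mode (table on slide 6)."""
    operand = operand.strip()
    # Register (R0) and indirect register (*R0) modes add no cost.
    if re.fullmatch(r"\*?R\d+", operand):
        return 0
    # Absolute (M), indexed (c(R)), indirect indexed (*c(R)) and
    # literal (#c) modes each add one.
    return 1

def instruction_cost(instruction):
    """Cost = 1 + cost(source-mode) + cost(destination-mode)."""
    parts = instruction.split(None, 1)
    operands = parts[1].split(",") if len(parts) > 1 else []
    return 1 + sum(mode_cost(o) for o in operands)

EXAMPLES = [
    "MOV R0,R1",
    "MOV R0,M",
    "MOV M,R0",
    "MOV 4(R0),M",
    "MOV *4(R0),M",
    "MOV #1,R0",
    "ADD 4(R0),*12(R1)",
]

if __name__ == "__main__":
    for instr in EXAMPLES:
        print(f"{instr:20} cost {instruction_cost(instr)}")
```

Under these assumptions the computed costs agree with the Cost column of the table.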

9 Instruction Selection
• Instruction selection is important to obtain efficient code
• Suppose we translate three-address code x:=y+z to:
    MOV y,R0
    ADD z,R0
    MOV R0,x
• Then a:=a+1 is translated to:
    MOV a,R0
    ADD #1,R0
    MOV R0,a      Cost = 6
  Better:
    ADD #1,a      Cost = 3
  Best:
    INC a         Cost = 2

10 Instruction Selection: Utilizing Addressing Modes
• Suppose we translate a:=b+c into
    MOV b,R0
    ADD c,R0
    MOV R0,a
• Assuming the addresses of a, b, and c are stored in R0, R1, and R2
    MOV *R1,*R0
    ADD *R2,*R0
• Assuming R1 and R2 contain the values of b and c
    ADD R2,R1
    MOV R1,a

11 Need for Global Machine-Specific Code Optimizations
• Suppose we translate three-address code x:=y+z to:
    MOV y,R0
    ADD z,R0
    MOV R0,x
• Then, we translate
    a:=b+c
    d:=a+e
  to:
    MOV b,R0
    ADD c,R0
    MOV R0,a
    MOV a,R0      Redundant
    ADD e,R0
    MOV R0,d
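The redundant load above can be reproduced and removed with a few lines of code. The sketch below is illustrative only: translate applies the fixed three-instruction template to statements of the form x := y op z, and remove_redundant_loads is a simple peephole pass (a local technique swapped in here for illustration, not the global optimization the slide title refers to). Both function names are hypothetical, and the pass assumes the two MOV instructions are adjacent in the same basic block, with no jump target between them.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def translate(stmt):
    """Translate 'x := y op z' with the fixed template from the slide."""
    target, expr = [s.strip() for s in stmt.split(":=")]
    left, op, right = expr.split()
    return [f"MOV {left},R0", f"{OPCODES[op]} {right},R0", f"MOV R0,{target}"]

def remove_redundant_loads(code):
    """Drop 'MOV a,R' when it directly follows 'MOV R,a'."""
    result = []
    for instr in code:
        if result and instr.startswith("MOV ") and result[-1].startswith("MOV "):
            prev_src, prev_dst = result[-1][4:].split(",")
            src, dst = instr[4:].split(",")
            if src == prev_dst and dst == prev_src:
                continue  # the value is already in the destination
        result.append(instr)
    return result

if __name__ == "__main__":
    naive = translate("a := b + c") + translate("d := a + e")
    for instr in remove_redundant_loads(naive):
        print(instr)
```

For the two statements on the slide, the naive translation has six instructions and the pass leaves five, dropping only MOV a,R0.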

12 Register Allocation and Assignment
• Efficient utilization of the limited set of registers is important to generate good code
• Registers are assigned by
  – Register allocation to select the set of variables that will reside in registers at a point in the code
  – Register assignment to pick the specific register that a variable will reside in
• Finding an optimal register assignment in general is NP-complete

13 Example
    t:=a*b
    t:=t+a
    t:=t/d
{ R1=t }
    MOV a,R1
    MUL b,R1
    ADD a,R1
    DIV d,R1
    MOV R1,t
{ R0=a, R1=t }
    MOV a,R0
    MOV R0,R1
    MUL b,R1
    ADD R0,R1
    DIV d,R1
    MOV R1,t

14 Choice of Evaluation Order
• When instructions are independent, their evaluation order can be changed
• Expression: a+b-(c+d)*e
    t1:=a+b
    t2:=c+d
    t3:=e*t2
    t4:=t1-t3
  is translated to:
    MOV a,R0
    ADD b,R0
    MOV R0,t1
    MOV c,R1
    ADD d,R1
    MOV e,R0
    MUL R1,R0
    MOV t1,R1
    SUB R0,R1
    MOV R1,t4
• Reordered:
    t2:=c+d
    t3:=e*t2
    t1:=a+b
    t4:=t1-t3
  is translated to:
    MOV c,R0
    ADD d,R0
    MOV e,R1
    MUL R0,R1
    MOV a,R0
    ADD b,R0
    SUB R1,R0
    MOV R0,t4
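To see that the two instruction orders compute the same value, both sequences can be run on a toy interpreter. This is a rough sketch, assuming only the register and absolute address modes, integer operands, and a single dictionary holding both registers and memory variables; the run function and the sample input values are made up for illustration.

```python
def run(program, memory):
    """Execute two-address instructions 'OP source,destination'."""
    state = dict(memory)  # work on a copy of the initial variable values
    ops = {
        "ADD": lambda dst, src: dst + src,
        "SUB": lambda dst, src: dst - src,
        "MUL": lambda dst, src: dst * src,
    }
    for line in program:
        op, operands = line.split(None, 1)
        src, dst = [name.strip() for name in operands.split(",")]
        if op == "MOV":
            state[dst] = state[src]
        else:
            state[dst] = ops[op](state[dst], state[src])
    return state

ORIGINAL = [
    "MOV a,R0", "ADD b,R0", "MOV R0,t1",
    "MOV c,R1", "ADD d,R1",
    "MOV e,R0", "MUL R1,R0",
    "MOV t1,R1", "SUB R0,R1", "MOV R1,t4",
]
REORDERED = [
    "MOV c,R0", "ADD d,R0",
    "MOV e,R1", "MUL R0,R1",
    "MOV a,R0", "ADD b,R0",
    "SUB R1,R0", "MOV R0,t4",
]

if __name__ == "__main__":
    values = {"a": 7, "b": 5, "c": 2, "d": 3, "e": 4}  # arbitrary inputs
    print(run(ORIGINAL, values)["t4"], run(REORDERED, values)["t4"])
```

With these inputs both sequences leave a+b-(c+d)*e = -8 in t4. The reordered version needs eight instructions instead of ten because t1 never has to be stored to memory and reloaded.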

15 Generating Code for Stack Allocation of Activation Records
Three-address code:
    t1 := a + b
    param t1
    param c
    t2 := call foo,2
    …
    func foo
    …
    return t1
Generated code:
    100: ADD #16,SP        Push frame
    108: MOV a,R0
    116: ADD b,R0
    124: MOV R0,4(SP)      Store a+b
    132: MOV c,8(SP)       Store c
    140: MOV #156,*SP      Store return address
    148: GOTO 500          Jump to foo
    156: MOV 12(SP),R0     Get return value
    164: SUB #16,SP        Remove frame
    172: …

    500: …
    564: MOV R0,12(SP)     Store return value
    572: GOTO *SP          Return to caller
Note: Language and machine dependent. Here we assume a C-like implementation with SP and no FP.