Introduction to Compiling CSCI 327 Computer Science Theory












Why should I care?
- A compiler is a translator. Lots of other computing problems also involve translating complex data from one format to another.
- If you understand how a compiler works, then you can be a better compiler user.

Phases
source code -> lexical analyzer -> syntax analyzer -> semantic analyzer -> intermediate code gen -> code optimizer -> code generator -> machine code
Symbol Table: shared across the phases

Preprocessor
- Handles all the precompile tasks, such as #include.

Lexical Analysis (characters -> tokens)
- Converts characters into tokens.
- The input "A = B + 90;" is converted into:
  - id : A
  - assignment_operator
  - id : B
  - addition_operator
  - integer : 90
  - semicolon
- Based on regular expressions.
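
The conversion above can be sketched with Python's re module. This is a minimal illustration, not a full lexer: the token names come from the slide, while the regular expressions, the "skip" whitespace rule, and the function name tokenize are assumptions made for the example.

```python
import re

# Token names follow the slide; the patterns themselves are assumptions.
TOKEN_SPEC = [
    ("integer", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("assignment_operator", r"="),
    ("addition_operator", r"\+"),
    ("semicolon", r";"),
    ("skip", r"\s+"),  # whitespace is matched but never emitted
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Convert a string of characters into a list of (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for token in tokenize("A = B + 90; "):
    print(token)
```

Each token kind is one named group in a single combined pattern, so match.lastgroup reports which rule matched.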

Syntax Analysis (tokens -> parse tree)
- Converts tokens into a parse tree.
- Based on a context-free grammar.
- Attempts to recover from errors.
- Begins to build the symbol table: adds A and B as variables, but types are not yet known.
- There is no need to progress into semantic analysis if there were syntax errors.
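
As a rough sketch of this phase, the recursive-descent parser below handles one statement. The grammar (stmt -> id '=' expr ';' and expr -> term ('+' term)*), the tuple-based tree, and the name parse_statement are assumptions for illustration. Unlike a real compiler, it stops at the first syntax error instead of attempting recovery.

```python
def parse_statement(tokens):
    """Parse stmt -> id '=' expr ';' where expr -> term ('+' term)*.

    Returns (parse_tree, symbols). Types in the symbol table are left as
    None because they are not yet known in this phase.
    """
    symbols = {}
    pos = 0

    def expect(kind):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != kind:
            found = tokens[pos][0] if pos < len(tokens) else "end of input"
            raise SyntaxError(f"expected {kind}, found {found}")
        token = tokens[pos]
        pos += 1
        return token

    def term():
        if pos < len(tokens) and tokens[pos][0] == "integer":
            return ("num", int(expect("integer")[1]))
        name = expect("id")[1]
        symbols.setdefault(name, None)  # variable seen, type unknown
        return ("id", name)

    def expr():
        node = term()
        while pos < len(tokens) and tokens[pos][0] == "addition_operator":
            expect("addition_operator")
            node = ("add", node, term())
        return node

    target = expect("id")[1]
    symbols.setdefault(target, None)
    expect("assignment_operator")
    value = expr()
    expect("semicolon")
    return ("assign", ("id", target), value), symbols


tokens = [
    ("id", "A"), ("assignment_operator", "="), ("id", "B"),
    ("addition_operator", "+"), ("integer", "90"), ("semicolon", ";"),
]
tree, symbols = parse_statement(tokens)
print(tree)
print(symbols)
```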

Semantic Analysis (parse tree -> augmented parse tree, using the symbol table)
semantics = meaning
    int A, B;
    A = B + 90;
- Can B be added to 90?
  - Yes, if B is an int or float.
  - No, if B is a string, etc.
- What type of addition is that?
  - int add ≠ float add
- Can A be assigned the result of that operation?
- No need to proceed if there are semantic errors.
[Figure: augmented parse tree for A = B + 90, with nodes stmt, id, equal, expr, op (int-add), and num]
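
The checks above can be sketched as a small type-annotation pass. The tuple-based tree, the function name annotate, and the rule that an int may be assigned to a float but not the reverse are assumptions for this example; real languages define their own conversion rules.

```python
NUMERIC = {"int", "float"}


def annotate(node, symbols):
    """Return (annotated_node, type); raise TypeError on a semantic error."""
    kind = node[0]
    if kind == "num":
        return ("num", node[1], "int"), "int"
    if kind == "id":
        var_type = symbols[node[1]]
        return ("id", node[1], var_type), var_type
    if kind == "add":
        left, left_type = annotate(node[1], symbols)
        right, right_type = annotate(node[2], symbols)
        if left_type not in NUMERIC or right_type not in NUMERIC:
            raise TypeError(f"cannot add {left_type} and {right_type}")
        # int add and float add are different operations
        result = "float" if "float" in (left_type, right_type) else "int"
        return (f"{result}-add", left, right), result
    if kind == "assign":
        target_type = symbols[node[1][1]]
        value, value_type = annotate(node[2], symbols)
        widening = target_type == "float" and value_type == "int"  # assumed rule
        if target_type != value_type and not widening:
            raise TypeError(f"cannot assign {value_type} to {target_type}")
        return ("assign", node[1], value), target_type
    raise ValueError(f"unknown node kind {kind}")


tree = ("assign", ("id", "A"), ("add", ("id", "B"), ("num", 90)))
annotated, _ = annotate(tree, {"A": "int", "B": "int"})
print(annotated)
```

With A and B declared as int, the addition node is labeled int-add, matching the op node in the slide's augmented tree.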

Symbol Table
Name            Type     Other attributes
A               int
B               int
"Hello World"   string   constant, used once
sort            null     function

Intermediate Code Generation (augmented parse tree -> 3-address code)
- Three-address code is easier to optimize than assembly.
- Source code of
    float A, B, C;
    A = B + C * 90;
  yields three-address code of
    temp1 = (float) 90;
    temp2 = C * temp1;
    A = B + temp2;
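
One way to produce that output is a post-order walk of the expression tree. The sketch below rests on several assumptions: the expression is given as nested (op, left, right) tuples, variable types come from a plain dict, an int operand mixed with a float is widened through a temporary, and the root operation writes straight into the target. The function name generate_three_address is hypothetical.

```python
def generate_three_address(target, expr, types):
    """Emit three-address code for `target = expr`.

    expr is a nested (op, left, right) tuple whose leaves are variable
    names (str) or integer constants (int). The root must be an operation.
    """
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def widen(name, operand_type):
        # Convert an int operand to float through a fresh temporary.
        if operand_type == "int":
            temp = new_temp()
            code.append(f"{temp} = (float) {name};")
            return temp
        return name

    def operands(node):
        op, left, right = node
        left_name, left_type = emit(left)
        right_name, right_type = emit(right)
        if left_type != right_type:
            return op, widen(left_name, left_type), widen(right_name, right_type), "float"
        return op, left_name, right_name, left_type

    def emit(node):
        if isinstance(node, int):
            return str(node), "int"
        if isinstance(node, str):
            return node, types[node]
        op, left_name, right_name, result_type = operands(node)
        temp = new_temp()
        code.append(f"{temp} = {left_name} {op} {right_name};")
        return temp, result_type

    op, left_name, right_name, _ = operands(expr)
    code.append(f"{target} = {left_name} {op} {right_name};")
    return code


types = {"A": "float", "B": "float", "C": "float"}
for line in generate_three_address("A", ("+", "B", ("*", "C", 90)), types):
    print(line)
```

Every emitted line has at most one operator and at most three names, which is what keeps later passes simple.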

Optimization
Input of
    temp1 = (float) 90;
    temp2 = C * temp1;
    A = B + temp2;
- temp1 and temp2 should probably be in registers.
- If the 90.0 gets used again soon, then try to save the contents of that register.
- Move adjacent independent statements closer to other appearances of the variables they use.
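
The idea of keeping 90.0 around for reuse can be illustrated one level up, on the three-address code itself. The sketch below is a simple common-subexpression reuse, not register allocation: it drops a repeated "(float) constant" conversion and redirects later uses to the first temporary. It assumes straight-line code, one "dest = rhs;" statement per string, and converted operands that are constants; the function name reuse_conversions is hypothetical.

```python
def reuse_conversions(code):
    """Drop repeated 'tempN = (float) K;' lines and reuse the first temp."""
    holder = {}   # right-hand side -> temp that already holds that value
    renamed = {}  # dropped temp -> temp to use in its place
    result = []
    for line in code:
        dest, rhs = [part.strip() for part in line.rstrip(";").split("=", 1)]
        # Redirect any operand that referred to a dropped temp.
        rhs = " ".join(renamed.get(word, word) for word in rhs.split())
        if rhs.startswith("(float)"):
            if rhs in holder:
                renamed[dest] = holder[rhs]
                continue
            holder[rhs] = dest
        result.append(f"{dest} = {rhs};")
    return result


code = [
    "temp1 = (float) 90;",
    "temp2 = C * temp1;",
    "temp3 = (float) 90;",
    "temp4 = D * temp3;",
]
for line in reuse_conversions(code):
    print(line)
```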

Code Generation
- Output could be either relocatable machine code or assembly code.
- Plug the input A = B + temp2 into a template to generate
    MOV R1, B     ; move B into register R1
    ADD R1, R2    ; add temp2 to R1
    MOV A, R1     ; move result of operation to memory
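
Template-based generation of this kind can be sketched as string substitution. The example assumes that temp2 already lives in register R2 (which is what ADD R1, R2 implies), that R1 is free as a scratch register, and that only statements of the form x = y + z are handled. The names ADD_TEMPLATE and generate_add are hypothetical.

```python
import re

# Template for `dest = left + right`, using the slide's MOV/ADD mnemonics.
ADD_TEMPLATE = ["MOV {reg}, {left}", "ADD {reg}, {right}", "MOV {dest}, {reg}"]


def generate_add(statement, registers, scratch="R1"):
    """Fill ADD_TEMPLATE for a statement of the form `x = y + z`.

    registers maps a name to the register assumed to hold it already.
    """
    match = re.fullmatch(r"\s*(\w+)\s*=\s*(\w+)\s*\+\s*(\w+)\s*;?\s*", statement)
    if match is None:
        raise ValueError(f"not an addition statement: {statement!r}")
    dest, left, right = match.groups()
    values = {
        "reg": scratch,
        "dest": dest,
        "left": left,
        "right": registers.get(right, right),
    }
    return [line.format(**values) for line in ADD_TEMPLATE]


for instruction in generate_add("A = B + temp2", {"temp2": "R2"}):
    print(instruction)
```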