Compiler
TH 6 7 8, DTH 102
薛智文 (Chih-Wen Hsueh), cwhsueh@csie.ntu.edu.tw
http://www.csie.ntu.edu.tw/~cwhsueh/
96 Spring
Department of Computer Science and Information Engineering, National Taiwan University
Why Study Compilers?
1. Excellent software-engineering example: theory meets practice.
2. Essential software tool.
3. Influences hardware design, e.g., RISC, VLIW.
4. Tools (mostly “optimization”) for enhancing software reliability and security.
9/5/2021, NEWS Laboratory, Graduate Institute of Networking and Multimedia, Department of Computer Science and Information Engineering
Compilers & Architecture
- Modern architectures have very complex structures, especially opportunities for parallel execution.
- Sequential programs can only make effective use of these features via an optimizing compiler.
- Hardware question: if we implemented this, could a compiler use it?
Software Reliability
Optimization technology (data-flow analysis) is used to detect:
- Lock/unlock errors.
- Buffers that are not range-checked.
- Memory leaks.
- SQL injection bugs.
What Does This Course Offer?
- Compiler methodology for both compiler implementation and related applications.
- Theoretical framework.
- Key algorithms.
- Hands-on experience.
- Non-goal: building a complete optimizing compiler.
Course Outline
- Part 1: Introduction.
- Part 2: Scanner.
- Part 3: Parser.
- Part 4: Syntax-Directed Translation.
- Part 5: Symbol Table.
- Part 6: Intermediate Code Generation.
- Part 7: Run Time Storage Organization.
- Part 8: Optimization.
- Part 9: How to Write a Compiler.
- Part 10: A Simple Code Generation (PSEUDO) Example.
Introduction
A compiler is one kind of language processor.
source program -> Compiler -> target program
input -> target program -> output
What is a Compiler?
Definition:
- A software system that translates a description of computations into a program executable by a computer.
- Source and target must be equivalent!
Compiler writing spans:
- programming languages;
- machine architecture;
- language theory;
- algorithms and data structures;
- software engineering.
History:
- 1950s: the first FORTRAN compiler took 18 man-years;
- now: using software tools, a compiler can be written in a few months as a student’s project.
An Interpreter
source program, input -> Interpreter -> output
A Hybrid Compiler
source program -> Translator -> intermediate program
intermediate program, input -> Virtual Machine -> output
A Language-Processing System
source program
-> Preprocessor -> modified source program
-> Compiler -> target assembly program
-> Assembler -> relocatable machine code
-> Linker/Loader -> target machine code
The Linker/Loader also takes library files and relocatable object files as input.
Applications
- Computer language compilers.
- Translator: from one format to another.
  - query interpreter
  - text formatter
  - silicon compiler
  - infix notation to postfix notation: 3+5-6*6 becomes 35+66*-
  - pretty printers
  - ···
- Software productivity tools.
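The infix-to-postfix translation mentioned above can be sketched with the classic operator-stack (shunting-yard) method. This is only an illustration, not something the slides prescribe: the function name `infix_to_postfix` and the restriction to single-digit operands are assumptions made to keep the example short.

```python
# Hypothetical helper: converts an infix string of single-digit operands
# and binary operators into postfix, using an operator stack.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def infix_to_postfix(expr):
    output, stack = [], []
    for ch in expr.replace(" ", ""):
        if ch.isdigit():
            output.append(ch)
        elif ch in PRECEDENCE:
            # Pop operators of higher or equal precedence (left-associative).
            while (stack and stack[-1] in PRECEDENCE
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[ch]):
                output.append(stack.pop())
            stack.append(ch)
        elif ch == "(":
            stack.append(ch)
        elif ch == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced parenthesis")
            stack.pop()  # discard the matching "("
        else:
            raise ValueError(f"unexpected character {ch!r}")
    while stack:
        op = stack.pop()
        if op == "(":
            raise ValueError("unbalanced parenthesis")
        output.append(op)
    return "".join(output)

print(infix_to_postfix("3+5-6*6"))  # prints 35+66*-
```

Because `*` has higher precedence than `-`, it stays on the stack until both of its operands have been emitted, which is why it appears before `-` in the result.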
Relations with Computational Theory
Computational theory:
- a set of grammar rules ≡ the definition of a particular machine;
  - also equivalent to the set of languages recognized by this machine.
- a type of machines: a family of machines with a given set of operations, or capabilities;
- power of a type of machines ≡ the set of languages that can be recognized by this type of machines.
Phases of a Compiler
character stream
-> Lexical Analyzer (scanner) -> token stream
-> Syntax Analyzer (parser) -> abstract-syntax tree
-> Semantic Analyzer -> annotated abstract-syntax tree
-> Intermediate Code Generator -> intermediate representation
-> Machine-Independent Code Optimizer -> optimized intermediate representation
-> Code Generator -> relocatable machine code
-> Machine-Dependent Code Optimizer -> target-machine code
The Symbol Table and Error Handling are shared by all phases.
Lexical Analyzer (Scanner)
Actions:
- Reads characters from the source program.
- Groups characters into lexemes, i.e., sequences of characters that “go together”, following a given pattern.
- Each lexeme corresponds to a token; the scanner returns the next token, plus maybe some additional information, to the parser.
- The scanner may also discover lexical errors, i.e., erroneous characters.
What counts as a lexeme, a token, or a bad character depends on the definition of the source language.
Scanner Example for C
C statement: position = initial + rate * 60;

Lexeme:  position  =       initial  +     rate    *      60   ;
         <id,1>    <=>     <id,2>   <+>   <id,3>  <*>    <60> <;>
Token:   ID        ASSIGN  ID       PLUS  ID      TIMES  INT  SEMI-COL

Symbol Table:
1  position  …
2  initial   …
3  rate      …

- An arbitrary number of blanks (white spaces) may appear between lexemes.
- Erroneous sequences of characters, that are not parts of comments, for the C language:
  - control characters
  - @
  - 2abc
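A minimal sketch of such a scanner, assuming a regular-expression-driven design. The token names follow the slide (with SEMI-COL shortened to `SEMI`); the patterns, the function name `scan`, and the list-based symbol table are illustrative assumptions.

```python
import re

# Token names follow the slide; the regular expressions are assumptions.
TOKEN_SPEC = [
    ("INT", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
    ("TIMES", r"\*"),
    ("SEMI", r";"),
    ("SKIP", r"\s+"),   # arbitrary white space between lexemes
    ("ERROR", r"."),    # any other single character is a lexical error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(source):
    symbols = []  # symbol table: entry i (1-based) is the i-th distinct identifier
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(f"bad character {lexeme!r} at offset {match.start()}")
        if kind == "ID":
            if lexeme not in symbols:
                symbols.append(lexeme)
            tokens.append(("ID", symbols.index(lexeme) + 1))
        elif kind == "INT":
            tokens.append(("INT", int(lexeme)))
        else:
            tokens.append((kind, None))
    return tokens, symbols

tokens, symbols = scan("position = initial + rate * 60;")
print(tokens)
print(symbols)
```

This sketch only rejects stray characters such as `@`. Rejecting a sequence like `2abc` would need an extra pattern, which is left out here.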
Syntax Analyzer (Parser)
Actions:
- Groups tokens into grammatical phrases, to discover the underlying structure of the source.
- Finds syntax errors, e.g., in the following C source line:
  (Lexeme) index = 12 * ;
  (Token)  ID ASSIGN INT TIMES SEMI-COL
  Every token is legal, but the sequence is erroneous!
- May find some static semantic errors, e.g., use of undeclared variables or multiply declared variables.
- May generate code, or build some intermediate representation of the source program, such as an abstract-syntax tree.
Parser Example for C
Source code: position = initial + rate * 60
Tokens: <id,1> <=> <id,2> <+> <id,3> <*> <60>

Abstract-syntax tree:
=
  <id,1>
  +
    <id,2>
    *
      <id,3>
      60

Symbol Table:
1  position  …
2  initial   …
3  rate      …

- Interior nodes of the tree are OPERATORS.
- A node’s children are its OPERANDS.
- Each subtree forms a logical unit.
- The subtree with * at its root shows that * has higher precedence than +: the operation “rate * 60” must be performed as a unit, not “initial + rate”.
- Where is “;”?
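One way to obtain such a tree is a small recursive-descent parser in which each precedence level has its own function. The sketch below is an assumption-laden illustration: the function name `parse_assignment`, the nested-tuple tree form, and the use of raw lexeme strings as tokens are not from the slides.

```python
def parse_assignment(tokens):
    """Parse `ID = expr [;]` where expr uses + and * over simple operands."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = peek()
        if tok is None or tok in ("=", "+", "*", ";"):
            raise SyntaxError(f"operand expected, found {tok!r}")
        return advance()

    def term():
        # '*' binds tighter than '+', so it is handled one level deeper.
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = factor()
    if peek() != "=":
        raise SyntaxError("'=' expected")
    advance()
    tree = ("=", target, expr())
    if peek() == ";":
        advance()  # in this sketch ';' only ends the statement and adds no node
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse_assignment("position = initial + rate * 60 ;".split()))
```

The same parser rejects the erroneous line `index = 12 * ;`, because `factor` finds `;` where an operand is required.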
Semantic Analyzer
Actions:
- Check for more static semantic errors, e.g., type errors.
- May annotate and/or change the abstract-syntax tree.

Before:
=
  <id,1>
  +
    <id,2>
    *
      <id,3>
      60

After:
=
  <id,1>
  +
    <id,2>
    *
      <id,3>
      int_to_float
        60

Symbol Table:
1  position  …
2  initial   …
3  rate      …
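The tree rewrite above can be sketched as a recursive type-annotation pass. The slide leaves the symbol-table contents elided, so the type table below is purely an assumption (all three variables taken to be float); the nested-tuple tree form and the name `annotate` are likewise illustrative.

```python
# Assumed declarations; the slide's symbol table does not show the types.
SYMBOL_TYPES = {"position": "float", "initial": "float", "rate": "float"}

def annotate(node):
    """Return (new_node, type), inserting int_to_float where int meets float."""
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, float):
        return node, "float"
    if isinstance(node, str):
        if node not in SYMBOL_TYPES:
            raise NameError(f"undeclared variable {node!r}")
        return node, SYMBOL_TYPES[node]
    op, left, right = node
    left, ltype = annotate(left)
    right, rtype = annotate(right)
    if op == "=":
        if ltype == "float" and rtype == "int":
            right = ("int_to_float", right)
        elif ltype != rtype:
            raise TypeError(f"cannot assign {rtype} to {ltype}")
        return (op, left, right), ltype
    if ltype != rtype:
        # Widen the int operand so that both operands are float.
        if ltype == "int":
            left = ("int_to_float", left)
        else:
            right = ("int_to_float", right)
        return (op, left, right), "float"
    return (op, left, right), ltype

tree = ("=", "position", ("+", "initial", ("*", "rate", 60)))
print(annotate(tree)[0])
```

Under the assumed types, the integer constant 60 is the only operand that needs a conversion, matching the annotated tree on the slide.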
Intermediate Code Generator
Actions: translate from abstract-syntax trees to intermediate code.
One choice for intermediate code is 3-address code:
- Each statement contains at most 3 operands; in addition to “=”, i.e., assignment, there is at most one operator.
- An “easy” and “universal” format that can be translated into most assembly languages.

Annotated tree:
=
  <id,1>
  +
    <id,2>
    *
      <id,3>
      int_to_float
        60

3-address code:
t1 = int_to_float(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
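A sketch of this translation as a post-order walk that allocates a fresh temporary for every interior node. The nested-tuple tree form and the name `gen_three_address` are assumptions for illustration only.

```python
def gen_three_address(tree):
    """Flatten an assignment tree into a list of 3-address statements."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def walk(node):
        if not isinstance(node, tuple):
            return str(node)  # identifier or constant: no code needed
        if node[0] == "int_to_float":
            arg = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} = int_to_float({arg})")
            return temp
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        temp = new_temp()
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, expr = tree
    code.append(f"{target} = {walk(expr)}")
    return code

tree = ("=", "id1", ("+", "id2", ("*", "id3", ("int_to_float", 60))))
for line in gen_three_address(tree):
    print(line)
```

Each statement has one operator besides the assignment, because every interior node of the tree becomes exactly one statement.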
Optimizer
Improve the efficiency of intermediate code.
The goal may be to make code run faster, and/or to use the fewest registers, ···

Before:
t1 = int_to_float(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3

After:
t1 = id3 * 60.0
id1 = id2 + t1

Current trends:
- to obtain smaller, but maybe slower, equivalent code for embedded systems;
- to reduce power consumption.
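The before/after pair above can be reproduced with two simple passes: folding the conversion of a constant at compile time, and merging a computation with the copy that immediately follows it. This is a sketch of those two textbook techniques, not the optimizer the slides have in mind. The quadruple format `(dest, op, arg1, arg2)` and the helper names are assumptions, and the copy-merging step assumes each temporary is used exactly once, which holds for code generated from an expression tree.

```python
def is_temp(name):
    """Assumed naming convention: temporaries are t1, t2, ..."""
    return isinstance(name, str) and name[:1] == "t" and name[1:].isdigit()

def optimize(code):
    """code: list of (dest, op, arg1, arg2); op is '*', '+', 'int_to_float' or 'copy'."""
    constants = {}  # temporaries known to hold a compile-time constant
    folded = []
    for dest, op, a, b in code:
        a = constants.get(a, a)
        b = constants.get(b, b)
        if op == "int_to_float" and isinstance(a, int):
            constants[dest] = float(a)  # fold the conversion at compile time
            continue
        folded.append((dest, op, a, b))

    # Merge "t = x op y; d = t" into "d = x op y".
    merged = []
    for dest, op, a, b in folded:
        if op == "copy" and merged and is_temp(a) and merged[-1][0] == a:
            _, prev_op, prev_a, prev_b = merged.pop()
            merged.append((dest, prev_op, prev_a, prev_b))
        else:
            merged.append((dest, op, a, b))

    # Renumber the surviving temporaries from t1.
    rename = {}
    def fresh(name):
        if is_temp(name):
            return rename.setdefault(name, f"t{len(rename) + 1}")
        return name
    return [(fresh(d), op, fresh(a), fresh(b)) for d, op, a, b in merged]

before = [
    ("t1", "int_to_float", 60, None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]
for quad in optimize(before):
    print(quad)
```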
Code Generation
A compiler may generate:
- pure machine code (machine-dependent assembly language) directly, which is rare now;
- virtual machine code.
Example: PASCAL
- compiler -> P-code -> interpreter -> execution
- Speed is roughly 4 times slower than running directly generated machine code.
Advantages:
- simplifies the job of a compiler;
- decreases the size of the generated code: 1/3 for P-code;
- can be run easily on a variety of platforms:
  - the P-machine is an ideal general machine whose interpreter can be written easily;
  - divide and conquer;
  - recent example: JAVA and byte-code.
Code Generation Example
Intermediate code:
t1 = id3 * 60.0
id1 = id2 + t1

Generated code:
LDF  R2, id3
MULF R2, #60.0
LDF  R1, id2
ADDF R1, R2
STF  id1, R1
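A rough sketch of how the instruction sequence above could be produced from the two intermediate statements. The mnemonics come from the slide; everything else is an assumption made for illustration: the quadruple input format, the names `gen_code` and `is_temp`, and the trivial register pool (handed out from the end so that the numbering happens to match the slide). There is no spilling or real register allocation here.

```python
OPCODES = {"*": "MULF", "+": "ADDF"}  # mnemonics taken from the slide

def is_temp(name):
    """Assumed naming convention: temporaries are t1, t2, ..."""
    return isinstance(name, str) and name[:1] == "t" and name[1:].isdigit()

def gen_code(statements, registers=("R1", "R2")):
    free = list(registers)  # registers are handed out from the end of this list
    location = {}           # temporary name -> register currently holding it
    asm = []

    def operand(x):
        if isinstance(x, float):
            return f"#{x}"             # immediate constant
        return location.get(x, x)      # register for a temporary, else a memory name

    for dest, op, a, b in statements:
        if a in location:
            reg = location.pop(a)      # first operand is already in a register
        else:
            if not free:
                raise RuntimeError("out of registers (no spilling in this sketch)")
            reg = free.pop()
            asm.append(f"LDF {reg}, {operand(a)}")
        asm.append(f"{OPCODES[op]} {reg}, {operand(b)}")
        if b in location:
            free.append(location.pop(b))  # temporary consumed: release its register
        if is_temp(dest):
            location[dest] = reg          # keep the result in the register
        else:
            asm.append(f"STF {dest}, {reg}")
            free.append(reg)
    return asm

program = [("t1", "*", "id3", 60.0), ("id1", "+", "id2", "t1")]
print("\n".join(gen_code(program)))
```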
Practical Considerations (1/2)
Preprocessing phase:
- macro substitution:
  #define MAXC 10
- rational preprocessing: add new features for old languages.
  - BASIC
  - C -> C++
- compiler directives:
  #include <stdio.h>
- non-standard language extensions:
  - adding parallel primitives
Practical Considerations (2/2)
Passes of compiling:
- The first pass reads the text file once.
- The compiler may need to read the text one more time for any forward-addressed objects, i.e., anything that is used before its declaration.
- Example: C language
  goto error_handling;
  ···
  error_handling:
  ···
Reduce Number of Passes
- Each pass takes I/O time.
- Back-patching: leave a blank slot for missing information, and fill in the empty slot when the information becomes available.
- Example: C language
  - when a label is used:
    - if it is not defined before, save a trace into the to-be-processed table;
    - the label name corresponds to LABEL TABLE[i];
    - code generated: GOTO LABEL TABLE[i]
  - when a label is defined:
    - check known labels for redefined labels;
    - if it is not used before, save a trace into the to-be-processed table;
    - if it is used before, then find its trace and fill the current address into the trace.
- Time and space trade-off!
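The back-patching idea can be sketched as a small label table that records every jump emitted before its label is known and fills in the address once the label is defined. The class and method names are hypothetical, and the sketch patches the jump instruction directly rather than going through an indirect LABEL TABLE[i] slot as on the slide.

```python
class LabelTable:
    def __init__(self):
        self.code = []     # generated instructions; a jump is ["GOTO", target]
        self.defined = {}  # label -> address
        self.pending = {}  # label -> indexes of jumps still waiting for an address

    def emit(self, instruction):
        self.code.append(instruction)

    def use_label(self, name):
        """Emit a jump; leave the target blank if the label is not defined yet."""
        jump = ["GOTO", self.defined.get(name)]
        if jump[1] is None:
            self.pending.setdefault(name, []).append(len(self.code))
        self.code.append(jump)

    def define_label(self, name):
        if name in self.defined:
            raise SyntaxError(f"label {name!r} redefined")
        address = len(self.code)
        self.defined[name] = address
        for index in self.pending.pop(name, []):
            self.code[index][1] = address  # back-patch the blank slot

    def finish(self):
        if self.pending:
            raise SyntaxError(f"undefined labels: {sorted(self.pending)}")
        return self.code

table = LabelTable()
table.use_label("error_handling")     # forward reference: target unknown
table.emit(["NOP"])
table.define_label("error_handling")  # address 2 is filled into the earlier jump
table.emit(["HALT"])
print(table.finish())
```

The table costs extra memory but saves a second pass over the source text, which is the time and space trade-off the slide refers to.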