Modern Compilers

Compilers have not changed a great deal since the days of Backus. They still consist of two main components:
  • The front end reads in the program in the source language, makes sense of it, and stores it in an internal representation…

  • The back end converts the internal representation into the target language, perhaps with optimizations.

The target language is typically an assembly language, but it is often easier to use a more established, higher-level language.

Structure of a Compiler
Source Language -> ? -> Target Language

Structure of a Compiler
Source Language -> Front End -> Intermediate Code -> Back End -> Target Language

Structure of a Compiler
Front End: Source Language -> Lexical Analyzer -> Syntax Analyzer -> Semantic Analyzer -> Int. Code Generator -> Intermediate Code
Then: Intermediate Code -> Back End -> Target Language

Structure of a Compiler
Front End: Source Language -> Lexical Analyzer -> Syntax Analyzer -> Semantic Analyzer -> Int. Code Generator
Back End: Intermediate Code Optimizer -> Target Code Generator -> Target Language

Lexical Analysis
In a compiler, linear analysis is called lexical analysis or scanning. For example, take the characters in the assignment statement

    Position = Initial + Rate * 60

Lexical Analysis (cont’d)
In lexical analysis, these characters would be grouped into the following tokens:
  1) The identifier Position
  2) The assignment symbol =
  3) The identifier Initial
  4) The plus sign
  5) The identifier Rate
  6) The multiplication sign
  7) The number 60

Lexical Analysis (cont’d)
The blanks separating the characters of these tokens would normally be eliminated during lexical analysis.
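The grouping of characters into tokens, with blanks discarded along the way, can be sketched with a small regular-expression scanner. This is a minimal illustration rather than the slides' own method: the token names (ID, ASSIGN, ADD, MULT, INT) are borrowed from the example compilation later in the deck, and the function name tokenize and the exact patterns are assumptions.

```python
import re

# Assumed token categories; the patterns are illustrative, not a language spec.
TOKEN_SPEC = [
    ("INT",    r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("ADD",    r"\+"),
    ("MULT",   r"\*"),
    ("SKIP",   r"\s+"),   # blanks are matched, then dropped
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (kind, lexeme) pairs, skipping blanks."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

tokens = tokenize("Position = Initial + Rate * 60")
```

Here tokens holds the seven tokens listed above, in order, and the same list results whether or not the statement contains blanks.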

Syntax Analysis
Hierarchical analysis is called parsing or syntax analysis. It involves grouping the tokens of the source program into grammatical phrases that are used by the compiler to synthesize the output. Usually the grammatical phrases of the source program are represented by a parse tree.
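As a rough illustration of grouping tokens into phrases, the sketch below uses recursive descent to build a tree for a single assignment. The grammar (assignment, sum, product), the tuple-based tree shape, and the function name parse_assignment are all assumptions made for this example. Strictly speaking it builds a condensed syntax tree rather than a full parse tree.

```python
def parse_assignment(tokens):
    """Parse ID = expr, where tokens is a list of (kind, lexeme) pairs."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def take(kind):
        nonlocal pos
        if peek() != kind:
            raise SyntaxError(f"expected {kind}, found {peek()}")
        lexeme = tokens[pos][1]
        pos += 1
        return lexeme

    def factor():                      # factor -> ID | INT
        if peek() == "ID":
            return ("ID", take("ID"))
        return ("INT", int(take("INT")))

    def term():                        # term -> factor { '*' factor }
        node = factor()
        while peek() == "MULT":
            take("MULT")
            node = ("MULT", node, factor())
        return node

    def expr():                        # expr -> term { '+' term }
        node = term()
        while peek() == "ADD":
            take("ADD")
            node = ("ADD", node, term())
        return node

    target = ("ID", take("ID"))
    take("ASSIGN")
    tree = ("ASSIGN", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()}")
    return tree

tokens = [("ID", "Position"), ("ASSIGN", "="), ("ID", "Initial"),
          ("ADD", "+"), ("ID", "Rate"), ("MULT", "*"), ("INT", "60")]
tree = parse_assignment(tokens)
```

Because term is called from inside expr, the multiplication ends up deeper in the tree than the addition, which is how the grammar encodes operator precedence.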

Semantic Analysis
The semantic analysis phase checks the source program for semantic errors and gathers type information for the subsequent code generation phase. It uses the hierarchical structure determined by the syntax analysis phase to identify the operators and operands of expressions and statements.

Semantic Analysis (cont’d)
An important component of semantic analysis is type checking: the compiler checks that each operator has operands that are permitted by the source language specification. For example, many programming language definitions require a compiler to report an error every time a real number is used to index an array.
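A type checker of this kind can be sketched as a recursive walk over the expression tree. The code below is a simplified illustration under several assumptions: trees are nested tuples, the symbol table types is a plain dictionary, an array's entry gives its element type, and the node kinds INDEX and REAL are hypothetical. The int2real conversion node mirrors the one shown in the example compilation later in the deck.

```python
def check(node, types):
    """Return (typed_node, type) for an expression tree of nested tuples."""
    kind = node[0]
    if kind == "INT":
        return node, "int"
    if kind == "REAL":
        return node, "real"
    if kind == "ID":
        return node, types[node[1]]
    if kind == "INDEX":                  # ("INDEX", array_name, index_expr)
        index, index_type = check(node[2], types)
        if index_type != "int":
            raise TypeError("array index must be an integer, not " + index_type)
        return ("INDEX", node[1], index), types[node[1]]
    if kind in ("ADD", "MULT"):
        left, lt = check(node[1], types)
        right, rt = check(node[2], types)
        if lt != rt:                     # mixed int/real: convert the int side
            if lt == "int":
                left = ("int2real", left)
            else:
                right = ("int2real", right)
            lt = "real"
        return (kind, left, right), lt
    raise ValueError(f"unknown node kind {kind}")

types = {"Initial": "real", "Rate": "real"}
expr = ("ADD", ("ID", "Initial"), ("MULT", ("ID", "Rate"), ("INT", 60)))
typed, result_type = check(expr, types)
```

With Rate declared real, the integer 60 is wrapped in an int2real node, while a call such as check(("INDEX", "a", ("REAL", 1.5)), {"a": "real"}) raises TypeError.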

Intermediate Code Generation
After syntax and semantic analysis, some compilers generate an explicit intermediate representation of the source program. This intermediate representation should have two important properties: it should be easy to produce, and easy to translate into the target program.

Intermediate Code Generation (cont’d)
We consider an intermediate form called three-address code, which is like the assembly language for a machine in which every memory location can act like a register. Three-address code consists of a sequence of instructions, each of which has at most three operands. The source program in three-address code is:

Intermediate Code Generation (cont’d)
    temp1 = int2real(60)
    temp2 = id3 * temp1
    temp3 = id2 + temp2
    id1 = temp3

Intermediate Code Generation (cont’d)
This intermediate form has several properties. Each three-address instruction has at most one operator in addition to the assignment. Thus, when generating these instructions, the compiler has to decide on the order in which operations are to be done; here the multiplication precedes the addition, as in the source program.

Intermediate Code Generation (cont’d)
The compiler must generate a temporary name to hold the value computed by each instruction. Some three-address instructions have fewer than three operands, e.g. the first and last instructions.
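The properties above (one operator per instruction, a fresh temporary per computed value, operations emitted in evaluation order) can be seen in a short tree-flattening sketch. The tuple tree shape and the function name to_three_address are assumptions for illustration; the input is the type-checked tree of the running example, with identifiers already replaced by id1, id2 and id3.

```python
import itertools

OPERATORS = {"ADD": "+", "MULT": "*"}

def to_three_address(tree):
    """Flatten an ("ASSIGN", ("ID", name), expr) tree into three-address code."""
    code = []
    counter = itertools.count(1)

    def emit(rhs):
        temp = f"temp{next(counter)}"    # fresh temporary for each instruction
        code.append(f"{temp} = {rhs}")
        return temp

    def walk(node):
        kind = node[0]
        if kind == "ID":
            return node[1]
        if kind == "INT":
            return str(node[1])
        if kind == "int2real":
            return emit(f"int2real({walk(node[1])})")
        left = walk(node[1])             # operands first, so inner operations
        right = walk(node[2])            # are emitted before outer ones
        return emit(f"{left} {OPERATORS[kind]} {right}")

    _, target, expr = tree
    code.append(f"{target[1]} = {walk(expr)}")
    return code

tree = ("ASSIGN", ("ID", "id1"),
        ("ADD", ("ID", "id2"),
         ("MULT", ("ID", "id3"), ("int2real", ("INT", 60)))))
code = to_three_address(tree)
```

For this tree, code contains the same four instructions as the listing above, with the multiplication emitted before the addition because it sits deeper in the tree.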

Code Optimization
The code optimization phase attempts to improve the intermediate code so that faster-running machine code will result. Some optimizations are trivial. For example, a natural algorithm generates the intermediate code using an instruction for each operator in the tree representation after semantic analysis, even though there is a better way to perform the same calculation using two instructions:

Code Optimization (cont’d)
    temp1 = id3 * 60.0
    id1 = id2 + temp1
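One way to arrive at a two-instruction form is to combine three small rewrites: evaluate the conversion of a constant at compile time, substitute the resulting constant into later instructions, and remove a final copy from a temporary. The sketch below is only an illustration of that idea, not a general optimizer. The (dest, op, arg1, arg2) tuple format and the function name optimize are assumptions, and the copy rewrite assumes the copied temporary is not used again afterwards.

```python
def optimize(code):
    """Apply three small rewrites to a list of (dest, op, arg1, arg2) tuples."""
    constants = {}      # temporary name -> literal known at compile time
    out = []
    for dest, op, a, b in code:
        a = constants.get(a, a)              # constant propagation
        b = constants.get(b, b)
        if op == "int2real" and a.isdigit(): # constant folding
            constants[dest] = a + ".0"
            continue
        if (op == "copy" and out and out[-1][0] == a
                and a.startswith("temp")):
            # The temporary was computed by the previous instruction, so that
            # instruction can write to dest directly (assumes no later use).
            _, prev_op, pa, pb = out.pop()
            out.append((dest, prev_op, pa, pb))
            continue
        out.append((dest, op, a, b))
    return out

code = [
    ("temp1", "int2real", "60", None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
optimized = optimize(code)
```

The result has two instructions, temp2 = id3 * 60.0 and id1 = id2 + temp2. The sketch does not renumber temporaries, so the surviving temporary keeps the name temp2 rather than temp1.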

Code Generation
The final phase of the compiler is the generation of target code, consisting normally of relocatable machine code or assembly code. Memory locations are selected for each of the variables used by the program. A crucial aspect is the assignment of variables to registers.

Code Generation (cont’d)
For example, using registers 1 and 2, the translation of the code might become
    MOVF R2, id3
    MULF R2, #60.0
    MOVF R1, id2
    ADDF R1, R2
    MOVF id1, R1

Code Generation (cont’d)
The first and second operands of each instruction specify a destination and a source, respectively: MOVF R2, id3 loads id3 into R2, and MOVF id1, R1 stores R1 into id1. The F in each instruction tells us that the instruction deals with floating-point numbers.
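The translation of the two optimized instructions into register code can be sketched as follows. This is a deliberately naive illustration under stated assumptions: instructions are (dest, op, arg1, arg2) tuples, the function name generate is hypothetical, registers are handed out R2 first and then R1 only to reproduce the listing, and there is no spilling if registers run out. It follows the operand order of the listing, where the location being written comes first.

```python
OPCODES = {"*": "MULF", "+": "ADDF"}

def generate(code, registers=("R1", "R2")):
    """Translate (dest, op, arg1, arg2) tuples into MOVF/MULF/ADDF lines."""
    free = list(registers)      # pop() hands out R2 first, then R1
    home = {}                   # temporary name -> register holding its value
    asm = []

    def operand(x):
        if x in home:
            return home[x]      # value already lives in a register
        if x[0].isdigit():
            return "#" + x      # immediate constant
        return x                # memory location of a variable

    for dest, op, a, b in code:
        reg = free.pop()
        asm.append(f"MOVF {reg}, {operand(a)}")
        asm.append(f"{OPCODES[op]} {reg}, {operand(b)}")
        if dest.startswith("temp"):
            home[dest] = reg    # keep the temporary in its register
        else:
            asm.append(f"MOVF {dest}, {reg}")
            free.append(reg)    # value stored to memory, register is free
    return asm

optimized = [("temp1", "*", "id3", "60.0"), ("id1", "+", "id2", "temp1")]
asm = generate(optimized)
```

For the running example, asm matches the five instructions in the listing above.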

Example Compilation
Phases: Source Language -> Lexical Analyzer -> Syntax Analyzer -> Semantic Analyzer -> Int. Code Generator -> Intermediate Code Optimizer -> Target Code Generator -> Target Language
Source Code: Position = Initial + Rate * 60

Example Compilation (cont’d)
Source Code: Position = Initial + Rate * 60
Lexical Analysis: ID(1) ASSIGN ID(2) ADD ID(3) MULT INT(60)

Example Compilation (cont’d)
Syntax Analysis:
    ASSIGN
      ID(1)
      ADD
        ID(2)
        MULT
          ID(3)
          INT(60)

Example Compilation (cont’d)
Semantic Analysis:
    ASSIGN
      ID(1)
      ADD
        ID(2)
        MULT
          ID(3)
          int2real
            INT(60)

Example Compilation (cont’d)
Intermediate Code:
    temp1 = int2real(60)
    temp2 = id3 * temp1
    temp3 = id2 + temp2
    id1 = temp3

Example Compilation (cont’d)
Optimized Code (step 0):
    temp1 = int2real(60)
    temp2 = id3 * temp1
    temp3 = id2 + temp2
    id1 = temp3

Example Compilation (cont’d)
Optimized Code (step 1), after evaluating int2real(60) at compile time:
    temp1 = 60.0
    temp2 = id3 * temp1
    temp3 = id2 + temp2
    id1 = temp3

Example Compilation (cont’d)
Optimized Code (step 2), after substituting the constant 60.0 for temp1:
    temp2 = id3 * 60.0
    temp3 = id2 + temp2
    id1 = temp3

Example Compilation (cont’d)
Optimized Code (step 3), after assigning the sum directly to id1, which removes temp3:
    temp2 = id3 * 60.0
    id1 = id2 + temp2

Example Compilation (cont’d)
Optimized Code, after renaming temp2 to temp1:
    temp1 = id3 * 60.0
    id1 = id2 + temp1

Example Compilation (cont’d)
Optimized Code:
    temp1 = id3 * 60.0
    id1 = id2 + temp1
Target Code:
    MOVF R2, id3
    MULF R2, #60.0
    MOVF R1, id2
    ADDF R1, R2
    MOVF id1, R1