System Software, Unit 1 (Language Processors): A Toy Compiler
Prepared by: Bhavin Dalsaniya, MEFGI-MCA student
The Front End
- The front end performs lexical analysis, syntax analysis and semantic analysis of the source program.
- Each kind of analysis involves the following functions:
  I. Determine the validity of a source statement.
  II. Determine the 'content' of a source statement.
  III. Construct the IC (intermediate code) of a source statement for use by subsequent analysis functions.
The word 'content'
- The word 'content' has a different meaning in lexical, syntax and semantic analysis.
- In lexical analysis, the content is the lexical class to which each lexical unit belongs.
- In syntax analysis, it is the syntactic structure of a source statement.
- In semantic analysis, the content is the meaning of a statement.
After analysis of 'content'
- Each analysis generates information in the form of:
  - tables of information
  - a description of the source statement
- Subsequent analysis uses this information for its own purposes and either adds information to these tables and descriptions or constructs its own.
- For example, syntax analysis uses the information generated by lexical analysis and constructs a representation of the syntactic structure of the source statement.
- Semantic analysis uses the information generated by syntax analysis and constructs a representation of the meaning of the statement.
- The tables and descriptions at the end of semantic analysis form the IR (intermediate representation) of the front end.
- The following diagram makes this clearer.
Diagram of the front end of the toy compiler

  Source program
        |
        v
  Lexical analysis (scanning)   --> lexical errors
        |  tokens
        v
  Syntax analysis (parsing)     --> syntax errors
        |  trees
        v
  Semantic analysis             --> semantic errors
        |
        v
  IR: IC, symbol table, constant table, other tables…
1. Lexical Analysis (Scanning)
- Lexical analysis identifies the lexical units in a source statement.
- It then classifies the units into different lexical classes, e.g. ids, constants, reserved ids, etc., and enters them into different tables.
- Lexical analysis builds a descriptor, called a token, for each lexical unit.
- A token contains two fields: class code and number in class.
- The class code identifies the class to which a lexical unit belongs.
- The number in class is the entry number of the lexical unit in the relevant table.
- We depict a token as Code #no.
Example
- Declarations:
    i : integer
    a, b : real
- The statement a := b + i; is represented by the token string
    Id #2   Op #5   Id #3   Op #3   Id #1   Op #10

Symbol table
  No  Symbol  Type  Length  Address
  1   i       int
  2   a       real
  3   b       real
  4   i*      real
  5   temp    real

Intermediate code
  1. Convert (Id, #1) to real, giving (Id, #4)
  2. Add (Id, #4) to (Id, #3), giving (Id, #5)
  3. Store (Id, #5) in (Id, #2)
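The token string in the example can be reproduced with a small scanner. This is only a sketch under stated assumptions: the regular expression, the `tokenize` name and the table layout are illustrative, and the operator table holds just the three entries whose numbers appear in the example (`+` as 3, `:=` as 5, `;` as 10).

```python
import re

# Symbol table after the declarations (entry numbers as in the example).
SYMBOLS = {"i": 1, "a": 2, "b": 3}
# Operator table: only the entries the example uses; numbers taken from its tokens.
OPERATORS = {"+": 3, ":=": 5, ";": 10}

# One lexical unit, optionally preceded by blanks: ':=', an identifier, '+' or ';'.
TOKEN_RE = re.compile(r"\s*(:=|[A-Za-z_]\w*|[+;])")


def tokenize(statement):
    """Return a list of (class code, number in class) tokens."""
    statement = statement.rstrip()
    tokens = []
    pos = 0
    while pos < len(statement):
        match = TOKEN_RE.match(statement, pos)
        if match is None:
            raise ValueError(f"lexical error at position {pos}")
        lexeme = match.group(1)
        if lexeme in OPERATORS:
            tokens.append(("Op", OPERATORS[lexeme]))
        else:
            # Enter an identifier seen for the first time into the symbol table.
            if lexeme not in SYMBOLS:
                SYMBOLS[lexeme] = len(SYMBOLS) + 1
            tokens.append(("Id", SYMBOLS[lexeme]))
        pos = match.end()
    return tokens


print(tokenize("a := b + i;"))
# [('Id', 2), ('Op', 5), ('Id', 3), ('Op', 3), ('Id', 1), ('Op', 10)]
```

A real scanner would also classify constants and reserved ids; they are left out here to keep the sketch close to the example.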
2. Syntax Analysis (Parsing)
- Syntax analysis processes the string of tokens built by lexical analysis to determine the statement class, e.g. assignment statement, if statement, etc.
- It then builds an IC which represents the structure of the statement.
- The IC is passed to semantic analysis to determine the meaning of the statement.
- A tree form is chosen for the IC because a tree can represent the hierarchical structure of a PL statement appropriately.

Tree for the declaration a, b : real

      real
     /    \
    a      b

Tree for the statement a := b + i;

      :=
     /  \
    a    +
        / \
       b   i
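As a rough illustration of how such a tree could be built, the sketch below parses one statement class only, an assignment of the form `<id> := <id> { + <id> } ;`, and returns the tree as nested tuples. The function name, the tuple layout and the restriction to `+` are assumptions made for this example, not part of the slides.

```python
def parse_assignment(lexemes):
    """Parse  <id> := <id> { + <id> } ;  and return (':=', target, expression)."""
    pos = 0

    def expect_id():
        nonlocal pos
        if pos >= len(lexemes) or not lexemes[pos].isidentifier():
            raise SyntaxError(f"identifier expected at lexeme {pos}")
        name = lexemes[pos]
        pos += 1
        return name

    def expect(symbol):
        nonlocal pos
        if pos >= len(lexemes) or lexemes[pos] != symbol:
            raise SyntaxError(f"'{symbol}' expected at lexeme {pos}")
        pos += 1

    target = expect_id()
    expect(":=")
    tree = expect_id()
    # Each further '+ <id>' becomes a new '+' node above the tree built so far.
    while pos < len(lexemes) and lexemes[pos] == "+":
        pos += 1
        tree = ("+", tree, expect_id())
    expect(";")
    if pos != len(lexemes):
        raise SyntaxError("unexpected text after ';'")
    return (":=", target, tree)


print(parse_assignment(["a", ":=", "b", "+", "i", ";"]))
# (':=', 'a', ('+', 'b', 'i'))
```

The nested tuple mirrors the tree for a := b + i: the root is `:=`, its left child is the target and its right child is the `+` subtree.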
3. Semantic Analysis
- Semantic analysis identifies the sequence of actions necessary to implement the meaning of a source statement.
- When semantic analysis determines the meaning of a subtree in the IC, it adds information to a table or adds an action to the sequence of actions.
- It then modifies the IC to enable further semantic analysis.
- The analysis ends when the tree has been completely processed.
Example of Semantic Analysis
- Source statement: a := b + i;
- Analysis steps:
  I. Add type information to the nodes of the tree (e.g. a, real).
  II. In an assignment, the right-hand-side expression is evaluated first.
  III. Before the addition, convert i from int to real.
  IV. Perform the addition and store the result in temp.
  V. Store temp into a.
- The trees below show these steps.

A)        :=
         /  \
   a, real   +
            / \
      b, real  i, int

B)        :=
         /  \
   a, real   +
            / \
      b, real  i*, real

C)        :=
         /  \
   a, real   temp, real
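The analysis steps can be sketched as a walk over the tree that collects actions. This is a minimal sketch with assumed conventions: the tree is a nested tuple such as `(":=", "a", ("+", "b", "i"))`, only `int` and `real` exist, only `+` is handled, a converted operand is named by appending `*`, and temporaries are named `temp1`, `temp2`, and so on (the example's `temp` corresponds to `temp1`).

```python
def analyze(tree, types):
    """Return the sequence of actions for an assignment tree (':=', target, expr)."""
    actions = []
    temps = 0

    def to_real(name, typ):
        # Record a conversion action when an int value must become real.
        if typ == "int":
            actions.append(("convert", name, name + "*"))
            return name + "*", "real"
        return name, typ

    def evaluate(node):
        nonlocal temps
        if isinstance(node, str):          # leaf: look up its type
            return node, types[node]
        _op, left, right = node            # only '+' occurs in this sketch
        lname, ltype = evaluate(left)
        rname, rtype = evaluate(right)
        if ltype != rtype:                 # mixed int/real: convert the int operand
            lname, ltype = to_real(lname, ltype)
            rname, rtype = to_real(rname, rtype)
        temps += 1
        temp = f"temp{temps}"
        actions.append(("add", lname, rname, temp))
        return temp, ltype

    _assign, target, expr = tree
    value, vtype = evaluate(expr)          # right-hand side first
    if types[target] == "real":
        value, vtype = to_real(value, vtype)
    elif vtype != types[target]:
        raise TypeError(f"cannot store {vtype} value in {types[target]} variable")
    actions.append(("store", value, target))
    return actions


types = {"a": "real", "b": "real", "i": "int"}
for action in analyze((":=", "a", ("+", "b", "i")), types):
    print(action)
# ('convert', 'i', 'i*')
# ('add', 'b', 'i*', 'temp1')
# ('store', 'temp1', 'a')
```

The three actions correspond to the convert, add and store steps of the intermediate code in the earlier example.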
The Back End
- The back end performs two tasks:
  - memory allocation
  - code generation
- Memory allocation: this is a simple task given the presence of the symbol table.
- The memory requirement of an identifier is computed from its type, length and dimensionality, and memory is allocated to it.
- The address of the memory area is entered in the symbol table.
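Memory allocation over a symbol table can be sketched as below. The element sizes (4 bytes for `int`, 8 for `real`), the `dims` field and the start address 0 are assumptions chosen for illustration; the slides do not specify them, and alignment is ignored.

```python
# Assumed element sizes in bytes; a real compiler takes these from the target machine.
TYPE_SIZE = {"int": 4, "real": 8}


def allocate(symbols, base=0):
    """Fill in 'length' and 'address' for every entry; return the next free address."""
    address = base
    for entry in symbols:
        elements = 1
        for extent in entry.get("dims", ()):   # dimensionality: product of the extents
            elements *= extent
        entry["length"] = TYPE_SIZE[entry["type"]] * elements
        entry["address"] = address             # entered in the symbol table
        address += entry["length"]
    return address


table = [
    {"symbol": "i", "type": "int"},
    {"symbol": "a", "type": "real"},
    {"symbol": "b", "type": "real"},
    {"symbol": "i*", "type": "real"},
    {"symbol": "temp", "type": "real"},
]
print(allocate(table))                          # 36
print([(e["symbol"], e["address"]) for e in table])
# [('i', 0), ('a', 4), ('b', 12), ('i*', 20), ('temp', 28)]
```

This fills the Length and Address columns that the symbol table in the example leaves empty.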
Code Generation
- Code generation uses knowledge of the target architecture, i.e. knowledge of the instructions and addressing modes of the target computer, to select the appropriate instructions.
- The important issues in code generation are:
  - Determine where the intermediate results should be kept, i.e. in a memory location or in a machine register.
  - Determine which instructions should be used for type conversion operations.
  - Determine which addressing modes should be used for accessing variables.
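The first two issues can be illustrated with a toy generator for a one-register machine. Everything about the target here is hypothetical: the register `AREG` and the mnemonics `MOVER` (load), `MOVEM` (store), `ADD_R` (real addition) and `CONV_R` (convert the register to real) are invented for this sketch, as is the tuple format of the intermediate code. Intermediate results are kept in the register whenever possible and spilled to memory only when the register is needed for another value.

```python
def generate(actions):
    """Translate ('convert' | 'add' | 'store') actions into hypothetical assembly."""
    code = []
    in_reg = None                                  # name of the value held in AREG

    def load(name):
        nonlocal in_reg
        if in_reg == name:
            return                                 # already in the register
        if in_reg is not None:
            code.append(f"MOVEM AREG, {in_reg}")   # spill the current value to memory
        code.append(f"MOVER AREG, {name}")
        in_reg = name

    for action in actions:
        kind = action[0]
        if kind == "convert":
            _, source, result = action
            load(source)
            code.append("CONV_R AREG")             # type conversion instruction
            in_reg = result
        elif kind == "add":
            _, left, right, result = action
            if in_reg == right:
                left, right = right, left          # addition is commutative
            load(left)
            code.append(f"ADD_R AREG, {right}")
            in_reg = result                        # keep the result in the register
        elif kind == "store":
            _, value, target = action
            load(value)
            code.append(f"MOVEM AREG, {target}")
        else:
            raise ValueError(f"unknown action {kind!r}")
    return code


ic = [("convert", "i", "i*"), ("add", "b", "i*", "temp1"), ("store", "temp1", "a")]
print("\n".join(generate(ic)))
# MOVER AREG, i
# CONV_R AREG
# ADD_R AREG, b
# MOVEM AREG, a
```

In this run neither `i*` nor `temp1` is ever written to memory, because each is consumed while still in the register. The spill in `load` is deliberately conservative and may store a value that is no longer needed.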
Programming Language Grammar
- A language L can be considered to be a collection of valid sentences.
- Each sentence can be looked upon as a sequence of words, and each word as a sequence of letters or graphic symbols acceptable in L.
- A language specified in this manner is known as a "formal language".

Terminal symbols
- The alphabet of L, denoted by the Greek symbol Σ, is the collection of symbols in its character set.
- We will use lower case letters a, b, c, etc. to denote symbols in Σ.
- A symbol in the alphabet is known as a terminal symbol (T) of L.
- The alphabet can be represented using the mathematical notation of a set, e.g.
    Σ = { a, b, c, …, z, 0, 1, 2, …, 9 }
- Here the symbols {, ',' and } are part of the notation. We call them metasymbols to differentiate them from the terminal symbols.

Strings
- A string is a finite sequence of symbols. We will represent strings by Greek symbols α, β, γ, etc.
- For example, α = axy is a string over Σ.
- The length of a string is the number of symbols in it.
- Note that the absence of any symbol is also a string, the null string ε.
- The concatenation operation combines two strings into a single string.
Nonterminal symbols
- A nonterminal symbol (NT) is the name of a syntax category of a language, e.g. noun, verb, etc.
- An NT is written as a single capital letter or as a name enclosed between <…>, e.g. A or <Noun>.
- During grammatical analysis, a nonterminal symbol represents an instance of the category. Thus, <Noun> represents a noun.

Productions
- A production, also called a rewriting rule, is a rule of the grammar.
- A production has the form
    <nonterminal symbol> ::= string of Ts and NTs
- Each grammar G defines a language L_G. G contains an NT called the distinguished symbol or start NT of G. Unless otherwise specified, we use the symbol S as the distinguished symbol of G.
- A valid string α of L_G is obtained by using the following procedure:
  1. Let α = 'S'.
  2. While α is not a string of terminal symbols:
     a. Select an NT appearing in α, say X.
     b. Replace X by a string appearing on the RHS of a production of X.
- Further topics: grammar, derivation, reduction, parse tree.
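The derivation procedure above can be sketched as follows. The grammar S ::= a S b | a b, the `derive` name and the rule of always replacing the leftmost NT are assumptions made for this example; the procedure itself allows any NT to be selected.

```python
def derive(grammar, choices, start="S"):
    """Apply the chosen productions to the leftmost NT until only terminals remain.

    grammar maps each NT to a list of right-hand sides (lists of symbols);
    choices gives, for each step, the index of the production to apply.
    """
    form = [start]                                   # let alpha = 'S'
    steps = [" ".join(form)]
    picks = iter(choices)
    while any(symbol in grammar for symbol in form): # alpha still contains an NT
        index = next(k for k, symbol in enumerate(form) if symbol in grammar)
        pick = next(picks, None)
        if pick is None:
            raise ValueError("ran out of choices before reaching a terminal string")
        form[index:index + 1] = grammar[form[index]][pick]   # replace X by an RHS
        steps.append(" ".join(form))
    return steps


# Hypothetical grammar:  S ::= a S b | a b
grammar = {"S": [["a", "S", "b"], ["a", "b"]]}
for step in derive(grammar, [0, 0, 1]):
    print(step)
# S
# a S b
# a a S b b
# a a a b b b
```

Each printed line is one intermediate string of the derivation; the last line consists of terminal symbols only and is therefore a valid string of the language.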