Chapter 3 Context-Free Grammars and Parsing
- Slides: 39
Parsing: Syntax Analysis
- Parsing decides which parts of the incoming token stream should be grouped together.
- The output of parsing is some representation of a parse tree.
- The intermediate code generator transforms the parse tree into an intermediate language.
Comparisons between r.e. (regular expressions) and c.f.g. (context-free grammars)
- An r.e. describes tokens; a finite automaton (F.A.) tests whether a token is valid.
- A c.f.g. describes programming language constructs; a pushdown automaton (P.D.A.) tests whether a program (sentence) is valid.
Features of programming languages
- contents:
  - declarations
  - sequential statements
  - iterative statements
  - conditional statements
- features:
  - constructs are declared/stated recursively and repeatedly
  - hierarchical specification, e.g., compound statement -> expression -> id
  - nested structures
  - similarity
Description of programming languages
- Syntax Diagrams (see Sec. 3.5.2)
- Context-Free Grammars (CFG)
Context-Free Grammar (in BNF)
exp -> exp addop term | term
addop -> + | -
term -> term mulop factor | factor
mulop -> *
factor -> ( exp ) | number
History
- In 1956, Chomsky introduced context-free grammars for the description of natural language.
- Algol 60 used BNF (Backus-Naur Form) to describe its syntax.
- The syntactic specification of programming languages: a CFG (a BNF description).
Capabilities of context-free grammars
- They give a precise syntactic specification of programming languages.
- A parser can be constructed automatically from a CFG.
- The syntactic entities specified in a CFG can be used for translation into object code.
- They are useful for describing nested structures such as balanced parentheses, matching begin-end's, corresponding if-then-else's, etc.
Def. of context-free grammars
- A CFG is a 4-tuple $(V, T, P, S)$, where
  - $V$: a finite set of variables (non-terminals)
  - $T$: a finite set of terminal symbols (tokens)
  - $P$: a finite set of productions (or grammar rules)
  - $S$: a start symbol
  and $V \cap T = \emptyset$, $S \in V$.
- Productions are of the form $A \to \alpha$, where $A \in V$ and $\alpha \in (V + T)^*$.
- A CFG generates a CFL (context-free language).
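The 4-tuple definition can be written down almost literally as a data structure. The following is a minimal sketch, not a standard API: the class name `CFG`, its field names, and the method `is_well_formed` are assumptions made for illustration. It encodes the exp/op expression grammar used as an example in this chapter and checks the side conditions of the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """Hypothetical container for a grammar (V, T, P, S)."""
    variables: frozenset   # V
    terminals: frozenset   # T
    productions: tuple     # P: (head, body) pairs; body is a tuple of symbols
    start: str             # S

    def is_well_formed(self) -> bool:
        # V and T must be disjoint, and S must be a variable.
        if self.variables & self.terminals:
            return False
        if self.start not in self.variables:
            return False
        # Every production A -> alpha needs A in V and alpha in (V + T)*.
        symbols = self.variables | self.terminals
        return all(
            head in self.variables and all(sym in symbols for sym in body)
            for head, body in self.productions
        )


# The expression grammar written out as (V, T, P, S).
expr_grammar = CFG(
    variables=frozenset({"exp", "op"}),
    terminals=frozenset({"+", "-", "*", "(", ")", "number"}),
    productions=(
        ("exp", ("exp", "op", "exp")),
        ("exp", ("(", "exp", ")")),
        ("exp", ("number",)),
        ("op", ("+",)),
        ("op", ("-",)),
        ("op", ("*",)),
    ),
    start="exp",
)

print(expr_grammar.is_well_formed())
```

An empty body `()` would stand for an ε-production in this representation.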
Rules from F.A. (r.e.) to CFG
1. For each state there is a nonterminal symbol.
2. If state A has a transition to state B on symbol a, introduce A -> aB.
3. If A goes to B on input ε, introduce A -> B.
4. If A is an accepting state, introduce A -> ε.
5. Make the start state of the NFA the start symbol of the grammar.
Examples
(1) r.e.: (a|b)(a|b|0|1)*
    c.f.g.: S -> aA | bA
            A -> aA | bA | 0A | 1A | ε
(2) r.e.: (a|b)*abb
    c.f.g.: S -> aS | bS | aA
            A -> bB
            B -> bC
            C -> ε
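The five rules for turning an automaton into a grammar are mechanical enough to sketch in a few lines. The function names `nfa_to_grammar` and `format_grammar` and the transition-triple encoding are assumptions made for this illustration; the NFA below is one possible automaton for (a|b)*abb, with states named after the nonterminals of example (2).

```python
def nfa_to_grammar(transitions, accepting, start):
    """Build productions from an NFA.

    transitions: iterable of (state, symbol, next_state); symbol "" means ε.
    Returns (start_symbol, {head: [body, ...]}) with bodies as tuples.
    """
    productions = {}
    for state, symbol, nxt in transitions:
        # Rule 2: A -> aB for a transition on a; rule 3: A -> B for ε.
        body = (symbol, nxt) if symbol else (nxt,)
        productions.setdefault(state, []).append(body)
    for state in accepting:
        # Rule 4: an accepting state A gets A -> ε (empty body).
        productions.setdefault(state, []).append(())
    # Rules 1 and 5: states are the nonterminals, the start state is S.
    return start, productions


def format_grammar(productions):
    """Render each head with its alternatives on one line."""
    lines = []
    for head, bodies in productions.items():
        alts = ["".join(body) if body else "ε" for body in bodies]
        lines.append(f"{head} -> {' | '.join(alts)}")
    return lines


# Assumed NFA for (a|b)*abb: S loops on a and b, then a, b, b leads to C.
nfa = [
    ("S", "a", "S"), ("S", "b", "S"), ("S", "a", "A"),
    ("A", "b", "B"),
    ("B", "b", "C"),
]
start_symbol, grammar = nfa_to_grammar(nfa, accepting=["C"], start="S")
for line in format_grammar(grammar):
    print(line)
```

The printed grammar has one line per state, which matches rule 1: states and nonterminals correspond one to one.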
Why don't we use c.f.g. to replace r.e.?
- r.e. => an easy and clear description of tokens.
- r.e. => an efficient token recognizer.
- Using both modularizes the components (scanner and parser).
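Derivations (How does a CFG define a language?)
Definitions:
- directly derive: $\alpha \Rightarrow \beta$, with $\alpha, \beta \in (V + T)^*$
- derive in zero or more steps: $\alpha \Rightarrow^{*} \beta$
- derive in one or more steps: $\alpha \Rightarrow^{+} \beta$
- derive in $i$ steps: $\alpha \Rightarrow^{i} \beta$
- sentential form: $\alpha \in (V + T)^*$ such that $S \Rightarrow^{*} \alpha$
- sentence: $w \in T^*$ such that $S \Rightarrow^{+} w$
- language: $\{\, w \mid S \Rightarrow^{+} w,\ w \in T^* \,\}$
- leftmost derivations
- rightmost derivations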
Example:
G = ( {exp, op}, {+, -, *, (, ), number}, P, exp )
P: { exp -> exp op exp | ( exp ) | number
     op -> + | - | * }
Sentence: (number - number) * number
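To make the idea of a leftmost derivation concrete, here is a small sketch that replays one for the sentence (number - number) * number in the grammar G above. The helper name `leftmost_derivation` and the list-of-productions encoding are assumptions for this illustration; the function only checks that each chosen production rewrites the leftmost nonterminal.

```python
def leftmost_derivation(start, nonterminals, steps):
    """Apply each (head, body) production to the leftmost nonterminal.

    Returns the list of sentential forms, starting with [start].
    """
    form = [start]
    forms = [list(form)]
    for head, body in steps:
        # Locate the leftmost nonterminal in the current sentential form.
        i = next(k for k, sym in enumerate(form) if sym in nonterminals)
        if form[i] != head:
            raise ValueError(f"leftmost nonterminal is {form[i]}, not {head}")
        form[i:i + 1] = body
        forms.append(list(form))
    return forms


steps = [
    ("exp", ["exp", "op", "exp"]),
    ("exp", ["(", "exp", ")"]),
    ("exp", ["exp", "op", "exp"]),
    ("exp", ["number"]),
    ("op", ["-"]),
    ("exp", ["number"]),
    ("op", ["*"]),
    ("exp", ["number"]),
]
forms = leftmost_derivation("exp", {"exp", "op"}, steps)
for form in forms:
    print(" ".join(form))
```

Each printed line is a sentential form; the last one contains only terminals and is therefore a sentence of the language.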
Parse trees
- A graphical representation for derivations. (Note the difference between a parse tree and a syntax tree.)
- Often the parse tree is produced in only a figurative sense; in reality, the parse tree exists only as a sequence of actions made by stepping through the tree construction process.
Ambiguity
Ambiguous grammars
- Def.: A context-free grammar that can produce more than one parse tree for some sentence.
- Ways to disambiguate a grammar:
  (1) specify the intention (e.g., associativity and precedence for arithmetic operators);
  (2) rewrite the grammar to incorporate the intention into the grammar itself.
For (1)
Precedence: negate > exponent (^) > * / > + -
Associativity: exponent ==> right associative; others ==> left associative
In yacc, a "specification rule" is used to solve the problem of (1), e.g., the order of declarations, special syntax, and default values (refer to the yacc manual for the disambiguating rules).
For (2)
1. Introduce one nonterminal for each precedence level.
Example 1
E -> E + E | E - E | E * E | E / E | E ^ E | ( E ) | - E | id
is ambiguous (^ is the exponent operator, with right associativity).
[Figure: two parse trees for id + id * id; one has E * E at the root, the other has E + E at the root.]
More than one parse tree for the sentence id + id * id
[Figure: two syntax trees for id + id * id; one rooted at *, the other rooted at +.]
More than one syntax tree for the sentence id + id * id
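Ambiguity can also be observed by counting parse trees. The sketch below restricts the grammar of Example 1 to E -> E + E | E * E | id (an assumption made to keep the example short) and counts the trees by choosing each operator in turn as the root. The function name `count_parse_trees` is hypothetical.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees for E -> E + E | E * E | id over `tokens`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of distinct ways E can derive tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                # The operator at position k is the root of the tree;
                # the two subtrees cover the tokens on either side.
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("id + id * id".split()))  # 2
```

A count greater than one for any sentence is exactly what the definition of an ambiguous grammar requires.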
The corresponding grammar shown below is unambiguous:
element -> ( expression ) | id   /* what is inside the parentheses is evaluated first */
primary -> - primary | element
factor -> primary ^ factor | primary   /* has right associativity */
term -> term * factor | term / factor | factor
expression -> expression + term | expression - term | term
Ex: id + id * id
[Figure: parse tree for id + id * id under the unambiguous grammar; the root expands to expression + term, and id * id is grouped under the term.]
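One way to see how the layered grammar fixes precedence and associativity is a small recursive-descent parser with one function per nonterminal. This is a sketch under stated assumptions, not the method of the slides: `^` is assumed as the exponent symbol, the left-recursive rules for term and expression are replaced by loops (which keeps left associativity), and the result is a nested-tuple syntax tree rather than a full parse tree. The name `parse_expression` is hypothetical.

```python
def parse_expression(tokens):
    """Parse a token list and return a syntax tree as nested tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def element():      # element -> ( expression ) | id
        if peek() == "(":
            advance()
            node = expression()
            if peek() != ")":
                raise SyntaxError("expected )")
            advance()
            return node
        if peek() == "id":
            return advance()
        raise SyntaxError(f"unexpected token {peek()!r}")

    def primary():      # primary -> - primary | element
        if peek() == "-":
            advance()
            return ("neg", primary())
        return element()

    def factor():       # factor -> primary ^ factor | primary
        node = primary()
        if peek() == "^":
            advance()
            # Recursing on the right operand gives right associativity.
            return ("^", node, factor())
        return node

    def term():         # term -> term * factor | term / factor | factor
        node = factor()
        while peek() in ("*", "/"):
            # The loop stands in for left recursion: left associativity.
            op = advance()
            node = (op, node, factor())
        return node

    def expression():   # expression -> expression + term | expression - term | term
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    tree = expression()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse_expression("id + id * id".split()))
# ('+', 'id', ('*', 'id', 'id'))
```

Because each precedence level has its own function, only one tree is ever produced for id + id * id, with the multiplication nested under the addition.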
Example 2
stat -> IF cond THEN stat
      | IF cond THEN stat ELSE stat
      | other-stat
is an ambiguous grammar.
Dangling else problem
Sentence: if c1 then if c2 then s2 else s3
[Figure: two parse trees for this sentence. In one, the ELSE belongs to the outer IF (the root is stat -> IF cond THEN stat ELSE stat); in the other, the ELSE belongs to the inner IF.]
The corresponding grammar shown below is unambiguous.
stat -> matched-stat | unmatched-stat
matched-stat -> if cond then matched-stat else matched-stat | other-stat
unmatched-stat -> if cond then stat | if cond then matched-stat else unmatched-stat
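The unambiguous grammar attaches each else to the closest unmatched if. A recursive-descent parser gets the same tree by consuming an else greedily, as the sketch below illustrates. It is not a literal implementation of matched-stat and unmatched-stat as separate procedures; the name `parse_stat`, the tuple shape ("if", cond, then, else), and the treatment of any single token as a cond or an other-stat are assumptions for this example.

```python
def parse_stat(tokens):
    """Parse if/then/else statements into ("if", cond, then, else) tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance(expected=None):
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        if expected is not None and tok != expected:
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def stat():
        if peek() != "if":
            return advance()          # other-stat: any single token here
        advance("if")
        cond = advance()
        advance("then")
        then_part = stat()
        else_part = None
        if peek() == "else":
            # Taken by the innermost if that is still unfinished.
            advance("else")
            else_part = stat()
        return ("if", cond, then_part, else_part)

    tree = stat()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse_stat("if c1 then if c2 then s2 else s3".split()))
# ('if', 'c1', ('if', 'c2', 's2', 's3'), None)
```

The outer if ends up with no else part, which is the reading where the else belongs to the inner if.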
Non-context-free language constructs
- $L = \{\, wcw \mid w \in (a|b)^* \,\}$
- $L = \{\, a^n b^m c^n d^m \mid n \ge 1 \text{ and } m \ge 1 \,\}$
- $L = \{\, a^n b^n c^n \mid n \ge 0 \,\}$
Basic Parsing Techniques
1. How do we check whether an input string is a sentence of a given grammar? (Syntax checking is not only used in programming languages.)
2. How do we construct a parse tree for the input string, if desired?
Methods
1. top-down
   - classic approach: recursive descent
   - modern approach: LL parsing (produces a leftmost derivation)
2. bottom-up
   - classic approach: operator precedence
   - modern approach: LR parsing (shift-reduce parsing; produces a rightmost derivation in reverse order)
An Example (for LR Parsing)
S -> aABe
A -> Abc | b
B -> d
w = abbcde
Rightmost derivation: $S \Rightarrow_{rm} aABe \Rightarrow_{rm} aAde \Rightarrow_{rm} aAbcde \Rightarrow_{rm} abbcde$
LR parsing: abbcde ==> aAbcde ==> aAde ==> aABe ==> S
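The reduction sequence for abbcde can be reproduced with a brute-force search that keeps replacing a production body by its head until only S remains. This is a sketch of the idea of reducing to the start symbol, not a table-driven LR parser: it uses backtracking instead of parser states, and the function name `reduce_to_start` is an assumption. For this grammar and input, the first sequence it finds happens to be the rightmost derivation in reverse.

```python
def reduce_to_start(symbols, productions, start):
    """Search for reductions taking `symbols` to [start].

    Returns the list of sentential forms visited, or None if none exists.
    """
    if symbols == [start]:
        return [symbols]
    for i in range(len(symbols)):
        for head, body in productions:
            if symbols[i:i + len(body)] == list(body):
                # Replace the matched body by its head and keep searching.
                reduced = symbols[:i] + [head] + symbols[i + len(body):]
                rest = reduce_to_start(reduced, productions, start)
                if rest is not None:
                    return [symbols] + rest
    return None


# Each symbol is a single character, so bodies can be written as strings.
productions = [("S", "aABe"), ("A", "Abc"), ("A", "b"), ("B", "d")]
forms = reduce_to_start(list("abbcde"), productions, "S")
print(" ==> ".join("".join(form) for form in forms))
# abbcde ==> aAbcde ==> aAde ==> aABe ==> S
```

A real LR parser avoids the search by using its state and the next input token to decide when to shift and which production to reduce by.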
Assignment #3a
- Do exercises 3.3, 3.5, 3.24, 3.25.
- Using the grammar in BNF of the TINY language in Fig. 3.6, derive step by step the sequence of tokens of the program in Fig. 3.8. (for practice only)