Lecture 4: LL Parsing CS 540 George Mason University
Parsing
Compiler pipeline: Source language → Scanner (lexical analysis) → tokens → Parser (syntax analysis) → syntactic structure → Semantic Analysis (IC generator) → syntactic/semantic structure → Code Optimizer → Code Generator → Target language
[The diagram also shows the Symbol Table alongside these phases.]
Role of the parser:
• Syntax described formally
• Tokens organized into syntax tree that describes structure
• Error checking
CS 540 Spring 2010 GMU
Top Down (LL) Parsing
Grammar:
  P → begin SS end
  SS → S ; SS
  SS → ε
  S → simplestmt
  S → begin SS end
Input: begin simplestmt ; simplestmt ; end
[Figure, six animation frames: the parse tree is grown from the root P downward, expanding one non-terminal at a time (P, then SS, then S, and so on) from left to right across the input.]
Top Down (LL) Parsing: the parse as a leftmost derivation
Grammar: P → begin SS end; SS → S ; SS | ε; S → simplestmt | begin SS end
  P ⇒ begin SS end
    ⇒ begin S ; SS end
    ⇒ begin simplestmt ; SS end
    ⇒ begin simplestmt ; ε end
    = begin simplestmt ; end
[Figure: the corresponding parse tree, with nodes numbered 1 to 6 in the order they are created.]
Grammar
  S → a B | b C
  B → b b C
  C → c c
Two strings in the language: abbcc and bcc. The parser can choose between the two S productions based on the first character of the input.
[Figure: parse trees for abbcc and bcc.]
LL(k) parsing
• Process input k symbols at a time; these k symbols are known as the lookahead.
• Initially, the ‘current’ non-terminal is the start symbol.
• Algorithm
  – Loop until no more input
    • Given the next k input tokens and the ‘current’ non-terminal T, choose a rule R (T → …)
    • For each element X in rule R, from left to right: if X is a non-terminal, ‘expand’ X; else if X is a terminal, check whether the next input symbol matches X and, if so, advance the input
• Typically, we consider LL(1)
Two Approaches
• Recursive Descent parsing
  – Code tailored to the grammar
• Table Driven – predictive parsing
  – Table tailored to the grammar
  – General Algorithm
Both algorithms are driven by the tokens coming from the lexer.
Writing a Recursive Descent Parser
• Generate a procedure for each non-terminal. Use the next token from yylex() (the lookahead) to choose (PREDICT) which production to ‘mimic’.
  – for a non-terminal X, call procedure X()
  – for a terminal X, call match(X)
Ex: B → b C D
  B() {
    if (lookahead == 'b') { match('b'); C(); D(); }
    else …
  }
Writing a Recursive Descent Parser
Also need the following:
  match(symbol) {
    if (symbol == lookahead) lookahead = yylex()
    else error()
  }
  main() {
    lookahead = yylex();
    S();   /* S is the start symbol */
    if (lookahead == EOF) then accept else reject
  }
  error() { … }
Back to grammar
  S() {   /* S → a B | b C */
    if (lookahead == a) { match(a); B(); }
    else if (lookahead == b) { match(b); C(); }
    else error("expecting a or b");
  }
  B() {   /* B → b b C: both b's must be matched */
    if (lookahead == b) { match(b); match(b); C(); }
    else error();
  }
  C() {   /* C → c c: both c's must be matched */
    if (lookahead == c) { match(c); match(c); }
    else error();
  }
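The three procedures above can be tried out with a minimal Python sketch. It assumes single-character tokens read directly from a string instead of yylex(), and the names parse, ParseError and the use of None as the end-of-input marker are illustrative choices, not part of the lecture.

```python
class ParseError(Exception):
    """Raised when the lookahead does not fit any production."""


def parse(text):
    """Return True if text is derivable from S -> aB | bC, B -> bbC, C -> cc."""
    pos = 0

    def lookahead():
        # None plays the role of EOF in the slides (an assumption of this sketch).
        return text[pos] if pos < len(text) else None

    def match(symbol):
        nonlocal pos
        if lookahead() == symbol:
            pos += 1          # consume the token, like lookahead = yylex()
        else:
            raise ParseError(f"expected {symbol!r}, saw {lookahead()!r}")

    def S():
        if lookahead() == "a":
            match("a"); B()
        elif lookahead() == "b":
            match("b"); C()
        else:
            raise ParseError("expecting a or b")

    def B():
        match("b"); match("b"); C()

    def C():
        match("c"); match("c")

    try:
        S()
    except ParseError:
        return False
    return pos == len(text)   # accept only if all input was consumed
```

Calling parse("abbcc") or parse("bcc") follows the same sequence of calls as the slides; any other string is rejected.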
Parsing abbcc
1. Remaining input: abbcc. main() calls S(). The lookahead is a, so S → a B is chosen: match(a), then call B().
2. Remaining input: bbcc. B() is called from S(). The lookahead is b, so B → b b C applies: match both b's, then call C().
3. Remaining input: cc. C() is called from B(). The lookahead is c, so C → c c applies: match both c's.
4. Remaining input: (empty). The parse tree S(a, B(b, b, C(c, c))) is complete and main() accepts.
How do we find the lookaheads?
• Can compute PREDICT sets from FIRST and FOLLOW for LL(1) parsing:
• PREDICT(A → α) = (FIRST(α) – {ε}) ∪ FOLLOW(A)   if ε in FIRST(α)
                 = FIRST(α)                       if ε not in FIRST(α)
NOTE: ε is never in a PREDICT set.
For LL(k) grammars, the PREDICT sets for the productions associated with a given non-terminal must be disjoint.
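As a rough illustration of the definition above, the following Python sketch computes FIRST, FOLLOW and PREDICT by fixed-point iteration. The grammar encoding (a list of (lhs, rhs) pairs with an empty list for ε) and the names first_of, compute_sets and predict are assumptions made for this example; the grammar is the expression grammar used later in the lecture.

```python
EPS = "ε"

# (lhs, rhs) pairs; an empty rhs is an ε-production.
GRAMMAR = [
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]),
    ("E'", []),
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]),
    ("T'", []),
    ("F", ["id"]),
    ("F", ["(", "E", ")"]),
]
NONTERMS = {lhs for lhs, _ in GRAMMAR}


def first_of(seq, first):
    """FIRST of a symbol sequence, given the FIRST sets of the non-terminals."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in NONTERMS else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)  # every symbol can derive ε (or seq is empty)
    return out


def compute_sets(start="E"):
    first = {n: set() for n in NONTERMS}
    follow = {n: set() for n in NONTERMS}
    follow[start].add("$")
    changed = True
    while changed:  # repeat until no set grows any more
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first_of(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
            for i, sym in enumerate(rhs):
                if sym in NONTERMS:
                    tail = first_of(rhs[i + 1:], first)
                    add = tail - {EPS}
                    if EPS in tail:
                        add |= follow[lhs]
                    if not add <= follow[sym]:
                        follow[sym] |= add
                        changed = True
    return first, follow


def predict(lhs, rhs, first, follow):
    """PREDICT(lhs -> rhs) as defined on the slide."""
    f = first_of(rhs, first)
    return ((f - {EPS}) | follow[lhs]) if EPS in f else f
```

For instance, after `first, follow = compute_sets()`, the call `predict("E'", [], first, follow)` gives the PREDICT set of the production E' → ε.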
Example (assume E is the start symbol)
  Production        Predict
  E → T E'          = FIRST(T) = {(, id}
  E' → + T E'       {+}
  E' → ε            = FOLLOW(E') = {$, )}
  T → F T'          = FIRST(F) = {(, id}
  T' → * F T'       {*}
  T' → ε            = FOLLOW(T') = {+, $, )}
  F → id            {id}
  F → ( E )         {(}
FIRST and FOLLOW sets:
  FIRST(F) = {(, id}      FOLLOW(E) = {$, )}
  FIRST(T) = {(, id}      FOLLOW(E') = {$, )}
  FIRST(E) = {(, id}      FOLLOW(T) = {+, $, )}
  FIRST(T') = {*, ε}      FOLLOW(T') = {+, $, )}
  FIRST(E') = {+, ε}      FOLLOW(F) = {*, +, $, )}
  E() {   /* E → T E' */
    if (lookahead in {(, id}) { T(); E_prime(); }
    else error("E expecting ( or identifier");
  }
  E_prime() {
    if (lookahead in {+}) { match(+); T(); E_prime(); }   /* E' → + T E' */
    else if (lookahead in {), end_of_file}) return;       /* E' → ε */
    else error("E_prime expecting +, ) or end of file");
  }
  T() {   /* T → F T' */
    if (lookahead in {(, id}) { F(); T_prime(); }
    else error("T expecting ( or identifier");
  }
  T_prime() {
    if (lookahead in {*}) { match(*); F(); T_prime(); }   /* T' → * F T' */
    else if (lookahead in {+, ), end_of_file}) return;    /* T' → ε */
    else error("T_prime expecting *, +, ) or end of file");
  }
  F() {
    if (lookahead in {id}) match(id);                                /* F → id */
    else if (lookahead in {(}) { match('('); E(); match(')'); }      /* F → ( E ) */
    else error("F expecting ( or identifier");
  }
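A runnable version of these five procedures might look like the Python sketch below. It assumes the input is already a list of token names such as "id", "+" or "(", uses None as the end-of-file marker, and the function name parse_expression is made up for the example.

```python
END = None  # stands in for end_of_file in the slides


def parse_expression(tokens):
    """Return True if the token list is a sentence of the expression grammar."""
    toks = list(tokens) + [END]
    pos = 0

    def look():
        return toks[pos]

    def match(symbol):
        nonlocal pos
        if symbol is not END and look() == symbol:
            pos += 1
        else:
            raise SyntaxError(f"expected {symbol!r}, saw {look()!r}")

    def E():                      # E -> T E'
        if look() in ("(", "id"):
            T(); E_prime()
        else:
            raise SyntaxError("E expecting ( or identifier")

    def E_prime():
        if look() == "+":         # E' -> + T E'
            match("+"); T(); E_prime()
        elif look() in (")", END):
            return                # E' -> ε
        else:
            raise SyntaxError("E_prime expecting +, ) or end of file")

    def T():                      # T -> F T'
        if look() in ("(", "id"):
            F(); T_prime()
        else:
            raise SyntaxError("T expecting ( or identifier")

    def T_prime():
        if look() == "*":         # T' -> * F T'
            match("*"); F(); T_prime()
        elif look() in ("+", ")", END):
            return                # T' -> ε
        else:
            raise SyntaxError("T_prime expecting *, +, ) or end of file")

    def F():
        if look() == "id":        # F -> id
            match("id")
        elif look() == "(":       # F -> ( E )
            match("("); E(); match(")")
        else:
            raise SyntaxError("F expecting ( or identifier")

    try:
        E()
    except SyntaxError:
        return False
    return pos == len(toks) - 1   # all real tokens consumed
```

The input a + b * c from the following slides corresponds to the token list ["id", "+", "id", "*", "id"].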
Parsing a + b * c
Each step shows the procedure that runs, the lookahead it sees, and what it does. The parse tree grows by one expansion per step.
   1. main(): remaining input a+b*c. Call E().
   2. E(): lookahead id (a) is in {(, id}, so E → T E': call T(), then E_prime().
   3. T(): lookahead id (a) is in {(, id}, so T → F T': call F(), then T_prime().
   4. F(): lookahead id (a), so F → id: match(id). Remaining input: +b*c
   5. T_prime(): lookahead + is in {+, ), end_of_file}, so T' → ε: return.
   6. E_prime(): lookahead +, so E' → + T E': match(+), call T(), then E_prime(). Remaining input: b*c
   7. T(): lookahead id (b), so T → F T': call F(), then T_prime().
   8. F(): lookahead id (b), so F → id: match(id). Remaining input: *c
   9. T_prime(): lookahead *, so T' → * F T': match(*), call F(), then T_prime(). Remaining input: c
  10. F(): lookahead id (c), so F → id: match(id). Remaining input: (empty)
  11. T_prime(): lookahead end_of_file, so T' → ε: return.
  12. E_prime(): lookahead end_of_file, so E' → ε: return. The parse is complete.
Stacks in Recursive Descent Parsing
• Runtime stack
• Procedure activations correspond to a path in the parse tree from the root to some interior node
[Figure: the active path E, E', T, F down to the leaf id (b).]
LL(1) Predictive Parse Tables
An LL(1) parse table is a mapping T: Vn × Vt → production P or error
  1. For all productions A → α do
       For each terminal t in Predict(A → α), T[A][t] = A → α
  2. Every undefined table entry is an error.
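The two construction rules can be sketched in a few lines of Python. The dictionary layout (production numbers as keys, a table keyed by (non-terminal, terminal) pairs) and the name build_table are assumptions of this sketch; the production numbers and PREDICT sets are the ones from the lecture's expression grammar.

```python
# Production number -> (left-hand side, right-hand side); [] is ε.
PRODUCTIONS = {
    1: ("E", ["T", "E'"]),
    2: ("E'", ["+", "T", "E'"]),
    3: ("E'", []),
    4: ("T", ["F", "T'"]),
    5: ("T'", ["*", "F", "T'"]),
    6: ("T'", []),
    7: ("F", ["id"]),
    8: ("F", ["(", "E", ")"]),
}

# Production number -> PREDICT set.
PREDICT = {
    1: {"(", "id"},
    2: {"+"},
    3: {"$", ")"},
    4: {"(", "id"},
    5: {"*"},
    6: {"+", "$", ")"},
    7: {"id"},
    8: {"("},
}


def build_table(productions, predict):
    """Rule 1: T[A][t] = production, for each t in its PREDICT set.
    Rule 2 is implicit: a missing key means 'error'."""
    table = {}
    for number, (lhs, _rhs) in productions.items():
        for terminal in predict[number]:
            if (lhs, terminal) in table:
                # Two productions claim the same cell: the grammar is not LL(1).
                raise ValueError(f"conflict at T[{lhs}][{terminal}]")
            table[(lhs, terminal)] = number
    return table
```

With this layout, `build_table(PRODUCTIONS, PREDICT).get(("E", "+"))` returns None, which corresponds to an error entry.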
Using LL(1) Parse Tables
ALGORITHM
INPUT: token sequence to be parsed, followed by ‘$’ (end of file)
DATA STRUCTURES:
• Parse stack: initialized by pushing ‘$’ and then pushing the start symbol
• Parse table T
Algorithm: Predictive Parsing
  push($); push(start_symbol);
  lookahead = yylex()
  repeat
    X = pop(stack)
    if X is a terminal symbol or $ then     /* similar to ‘match’ */
      if X = lookahead then lookahead = yylex()
      else error()
    else                                    /* X is non-terminal: similar to ‘mimic’ */
      if T[X][lookahead] = X → Y1 Y2 … Ym
        push(Ym) … push(Y1)
      else error()
  until X = $
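The loop above translates fairly directly into Python. The sketch below hard-codes the table and right-hand sides for the lecture's expression grammar, takes a list of token names instead of calling yylex(), and returns the list of production numbers it applied; the names ll1_parse, TABLE and RHS are illustrative only.

```python
# Right-hand side of each numbered production ([] is ε).
RHS = {
    1: ["T", "E'"], 2: ["+", "T", "E'"], 3: [],
    4: ["F", "T'"], 5: ["*", "F", "T'"], 6: [],
    7: ["id"], 8: ["(", "E", ")"],
}

# Parse table: (non-terminal, lookahead) -> production number.
TABLE = {
    ("E", "("): 1, ("E", "id"): 1,
    ("E'", "+"): 2, ("E'", ")"): 3, ("E'", "$"): 3,
    ("T", "("): 4, ("T", "id"): 4,
    ("T'", "+"): 6, ("T'", "*"): 5, ("T'", ")"): 6, ("T'", "$"): 6,
    ("F", "id"): 7, ("F", "("): 8,
}
NONTERMS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens, start="E"):
    """Return the production numbers applied, or raise SyntaxError."""
    toks = list(tokens) + ["$"]
    pos = 0
    stack = ["$", start]
    applied = []
    while True:
        top = stack.pop()
        look = toks[pos]
        if top in NONTERMS:
            number = TABLE.get((top, look))
            if number is None:
                raise SyntaxError(f"no entry for T[{top}][{look}]")
            applied.append(number)
            stack.extend(reversed(RHS[number]))  # push Ym ... Y1
        elif top == look:
            if top == "$":
                if pos != len(toks) - 1:         # a stray "$" inside the input
                    raise SyntaxError("unexpected $ before end of input")
                return applied                   # accept
            pos += 1                             # match: consume the token
        else:
            raise SyntaxError(f"expected {top!r}, saw {look!r}")
```

Running `ll1_parse(["id", "+", "id", "*", "id"])` applies the productions in the same order as the Action column of the stack trace for a+b*c.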
Example
  Production          Predict
  1: E → T E'         {(, id}
  2: E' → + T E'      {+}
  3: E' → ε           {$, )}
  4: T → F T'         {(, id}
  5: T' → * F T'      {*}
  6: T' → ε           {+, $, )}
  7: F → id           {id}
  8: F → ( E )        {(}
Parse table (entries are production numbers; blank entries are errors):
  NT/T    +    *    (    )    ID   $
  E                 1         1
  E'      2              3         3
  T                 4         4
  T'      6    5         6         6
  F                 8         7
Parsing a + b * c with the parse table (assume E is the start symbol)
  Stack      Input     Action
  $E         a+b*c$    E → T E'
  $E'T       a+b*c$    T → F T'
  $E'T'F     a+b*c$    F → id
  $E'T'id    a+b*c$    match
  $E'T'      +b*c$     T' → ε
  $E'        +b*c$     E' → + T E'
  $E'T+      +b*c$     match
  $E'T       b*c$      T → F T'
  $E'T'F     b*c$      F → id
  $E'T'id    b*c$      match
  $E'T'      *c$       T' → * F T'
  $E'T'F*    *c$       match
  $E'T'F     c$        F → id
  $E'T'id    c$        match
  $E'T'      $         T' → ε
  $E'        $         E' → ε
  $          $         accept
Stacks in Predictive Parsing
• Algorithm data structure
• Holds terminals and non-terminals from the grammar
  – terminals: still need to be matched from the input
  – non-terminals: still need to be expanded
Making a grammar LL(1)
• Not all context free languages have LL(1) grammars
• Can show a grammar is not LL(1) by looking at the predict sets
  – For LL(1) grammars, the PREDICT sets for the productions of a given non-terminal will be disjoint.
Example
  Production      Predict
  E → E + T       = FIRST(E) = {(, id}
  E → T           = FIRST(T) = {(, id}
  T → T * F       = FIRST(T) = {(, id}
  T → F           = FIRST(F) = {(, id}
  F → id          = {id}
  F → ( E )       = {(}
• FIRST(F) = {(, id}
• FIRST(T) = {(, id}
• FIRST(E) = {(, id}
• FIRST(T') = {*, ε}
• FIRST(E') = {+, ε}
• FOLLOW(E) = {$, )}
• FOLLOW(E') = {$, )}
• FOLLOW(T) = {+, $, )}
• FOLLOW(T') = {+, $, )}
• FOLLOW(F) = {*, +, $, )}
Two problems: E and T. For each of them, the PREDICT sets of its two productions are identical, so one token of lookahead cannot choose between them.
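The disjointness test is easy to mechanize. This Python sketch takes PREDICT sets keyed by (non-terminal, right-hand side) and reports every pair of productions of the same non-terminal whose sets overlap; the function name ll1_conflicts and the data layout are assumptions for illustration.

```python
from itertools import combinations

# PREDICT sets for the left-recursive expression grammar in the example.
PREDICT = {
    ("E", "E + T"): {"(", "id"},
    ("E", "T"): {"(", "id"},
    ("T", "T * F"): {"(", "id"},
    ("T", "F"): {"(", "id"},
    ("F", "id"): {"id"},
    ("F", "( E )"): {"("},
}


def ll1_conflicts(predict_sets):
    """Return (production1, production2, shared terminals) for each overlap."""
    conflicts = []
    for (p1, s1), (p2, s2) in combinations(predict_sets.items(), 2):
        same_nonterminal = p1[0] == p2[0]
        if same_nonterminal and s1 & s2:
            conflicts.append((p1, p2, s1 & s2))
    return conflicts
```

Applied to PREDICT above, the function reports one conflict for E and one for T, and none for F.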
Making a non-LL(1) grammar LL(1)
• Eliminate common prefixes
  Ex: A → B a C D | B a C E
• Transform left recursion to right recursion
  Ex: E → E + T | T
Eliminate Common Prefixes
• A → a b | a d
  Can become:
    A → a A'
    A' → b | d
Doesn’t always remove the problem. Why?
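One round of this transformation can be sketched in Python as follows. The representation (each alternative is a list of symbols, new non-terminals are named by appending a prime) and the names common_prefix and left_factor are choices made for this example, and the sketch performs a single round only.

```python
def common_prefix(sequences):
    """Longest prefix shared by all the given symbol lists."""
    prefix = []
    for column in zip(*sequences):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix


def left_factor(name, alternatives):
    """One round of left factoring for the non-terminal `name`.

    Returns {non-terminal: list of right-hand sides}; [] stands for ε.
    """
    groups = {}  # first symbol -> alternatives starting with it
    for alt in alternatives:
        groups.setdefault(alt[0] if alt else None, []).append(alt)

    result = {name: []}
    fresh = name
    for head, alts in groups.items():
        if head is None or len(alts) == 1:
            result[name].extend(alts)   # nothing to factor
            continue
        fresh += "'"                    # A', A'', ... (hypothetical naming)
        prefix = common_prefix(alts)
        result[name].append(prefix + [fresh])
        result[fresh] = [alt[len(prefix):] for alt in alts]
    return result
```

For example, `left_factor("A", [["a", "b"], ["a", "d"]])` yields A → a A' and A' → b | d, matching the slide.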
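Why is left recursion a problem?
With A → A a, a top-down parser expands A into A a, then expands the new leftmost A into A a again, and so on, without ever consuming input.
[Figure: parse tree growing down its left edge, each A expanded to A a.]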
Remove Left Recursion
  A → A α1 | A α2 | … | β1 | β2 | …
becomes
  A → β1 A' | β2 A' | …
  A' → α1 A' | α2 A' | … | ε
The left recursion becomes right recursion.
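The rule above is mechanical enough to write as a short Python sketch. Alternatives are lists of symbols, [] stands for ε, and the function name and the primed naming of the new non-terminal are assumptions of this example.

```python
def remove_immediate_left_recursion(name, alternatives):
    """Rewrite A -> A α1 | ... | β1 | ... as A -> βj A', A' -> αi A' | ε.

    Returns {non-terminal: list of right-hand sides}.
    """
    # The αi: what follows the leading A in each left-recursive alternative.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == name]
    # The βj: alternatives that do not start with A.
    betas = [alt for alt in alternatives if not alt or alt[0] != name]
    if not alphas:
        return {name: alternatives}     # no immediate left recursion
    new = name + "'"
    return {
        name: [beta + [new] for beta in betas],
        new: [alpha + [new] for alpha in alphas] + [[]],   # last one is ε
    }
```

Applied to E → E + T | T, it returns E → T E' and E' → + T E' | ε.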
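  A → A a | b
becomes
  A → b B
  B → a B | ε
[Figure: parse trees under both grammars. The original tree grows down its left edge (A over A a); the rewritten tree starts with b and grows down its right edge through B, ending with B → ε.]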
Expression Grammar
• E → E + T | T
  T → T * F | F
  F → id | ( E )
  NOT LL(1)
• Eliminate left recursion:
  E → T E'
  E' → + T E' | ε
  T → F T'
  T' → * F T' | ε
  F → id | ( E )
  E → E + T | T
becomes
  E → T E'
  E' → + T E' | ε
[Figure: parse trees before and after the transformation.]
Non-Immediate Left Recursion
• Ex: A1 → A2 a | b
      A2 → A1 c | A2 d
• Convert to immediate left recursion
  – Substitute A1 in the second set of productions by A1’s definition:
      A1 → A2 a | b
      A2 → A2 a c | b c | A2 d
• Eliminate recursion:
      A1 → A2 a | b
      A2 → b c A3
      A3 → a c A3 | d A3 | ε
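The substitution step can be illustrated with a small Python sketch. It replaces a leading occurrence of one non-terminal by each of that non-terminal's alternatives; the function name substitute_leading and the list-of-symbols representation are assumptions made for the example.

```python
def substitute_leading(alternatives, target, target_alternatives):
    """Replace a leading `target` in each alternative by its definitions."""
    result = []
    for alt in alternatives:
        if alt and alt[0] == target:
            # One new alternative per definition of the target.
            result.extend(t + alt[1:] for t in target_alternatives)
        else:
            result.append(alt)
    return result


# A1 -> A2 a | b   and   A2 -> A1 c | A2 d
a1 = [["A2", "a"], ["b"]]
a2 = [["A1", "c"], ["A2", "d"]]
a2_substituted = substitute_leading(a2, "A1", a1)
```

Here a2_substituted holds A2 → A2 a c | b c | A2 d, which is immediately left recursive and can then be handled by the usual rule.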
Example
• A → B c | d
  B → C f | B f
  C → A e | g
• Rewrite: replace C in B
  B → A e f | g f | B f
• Rewrite: replace A in B
  B → B c e f | d e f | g f | B f
[Figure: dependency graph among A, B and C.]
• Now grammar is:
  A → B c | d
  B → B c e f | d e f | g f | B f
  C → A e | g
• Get rid of left recursion (and C if A is start)
  A → B c | d
  B → d e f B' | g f B'
  B' → c e f B' | f B' | ε
Error Recovery in LL parsing
• Simple option: when you see an error, print a message and halt
• “Real” error recovery
  – Insert the “expected” token and continue: can have a problem with termination
  – Delete tokens: for an error in non-terminal F, keep deleting tokens until you see a token in follow(F).
For example:
  E() {   /* E → T E' */
    if (lookahead in {(, id}) { T(); E_prime(); }
    else {
      printf("E expecting ( or identifier");
      /* Follow(E) = { $, ) }: skip tokens until one of them appears */
      while (lookahead != ')' and lookahead != $) lookahead = yylex();
    }
  }
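The token-deleting strategy can be isolated into a helper, sketched here in Python. It assumes the token list ends with "$", and the names recover and errors are hypothetical; the extra check for "$" guards against running off the end when the FOLLOW set does not contain it.

```python
def recover(tokens, pos, follow, message, errors):
    """Record an error, then skip tokens until one in `follow` (or "$").

    Returns the new position in the token list.
    """
    errors.append(message)
    while tokens[pos] != "$" and tokens[pos] not in follow:
        pos += 1          # delete the offending token
    return pos


# Usage: E() was entered with lookahead "+", which cannot start an E.
tokens = ["+", "*", "id", ")", "$"]
errors = []
new_pos = recover(tokens, 0, {")", "$"}, "E expecting ( or identifier", errors)
```

After the call, new_pos points at the ")" token, so the caller of E() can continue as if E had been parsed.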
Real-World Compilers
http://cs.gmu.edu/~white/CS540/parser.cpp
// CParser::ParseSourceModule is the “main”