Top-Down Parsing: Recursive Descent Parsing
- Slides: 15
Top-down parsing:
- Build the tree from the root symbol
- Each production corresponds to one recursive procedure
- Each procedure recognizes an instance of a non-terminal and returns a tree fragment for that non-terminal
8 January 2004, Department of Software & Media Technology
General model
- Each right-hand side of a production provides the body for a function.
- Each non-terminal on the right-hand side is translated into a call to the function that recognizes that non-terminal.
- Each terminal on the right-hand side is translated into a call to the lexical scanner. If the resulting token is not the expected terminal, an error occurs.
- Each recognizing function returns a tree fragment.
Example: parsing a declaration
- FULL_TYPE_DECLARATION ::= type DEFINING_IDENTIFIER is TYPE_DEFINITION;
- Translates into:
  - get token type
  - find a defining_identifier -- function call
  - get token is
  - recognize a type_definition -- function call
  - get token semicolon
- In practice, we already know that the first token is type; that is why this routine was called in the first place.
- Predictive parsing is guided by the next token.
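A minimal Python sketch of the general model applied to this production. The `Parser` class, the `expect` helper, the tuple-shaped tree fragments, and the pre-split token list are assumptions made for illustration and are not part of the slides. The `type_definition` stub accepts a single identifier only.

```python
KEYWORDS = {"type", "is"}


class ParseError(Exception):
    pass


class Parser:
    """Hypothetical recursive-descent skeleton over a pre-split token list."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        # The next token guides predictive parsing; None marks end of input.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, terminal):
        # A terminal on the right-hand side: consume it or report an error.
        if self.peek() != terminal:
            raise ParseError(f"expected {terminal!r}, got {self.peek()!r}")
        self.pos += 1

    def identifier(self):
        tok = self.peek()
        if tok is None or tok in KEYWORDS or not tok.isidentifier():
            raise ParseError(f"expected identifier, got {tok!r}")
        self.pos += 1
        return ("id", tok)

    def defining_identifier(self):
        return self.identifier()

    def type_definition(self):
        # Stub (assumption): a real grammar has many alternatives here.
        return self.identifier()

    def full_type_declaration(self):
        # FULL_TYPE_DECLARATION ::= type DEFINING_IDENTIFIER is TYPE_DEFINITION ;
        self.expect("type")
        name = self.defining_identifier()    # non-terminal: function call
        self.expect("is")
        definition = self.type_definition()  # non-terminal: function call
        self.expect(";")
        return ("type_decl", name, definition)


tree = Parser(["type", "Counter", "is", "Integer", ";"]).full_type_declaration()
print(tree)
```

Each terminal becomes an `expect` call and each non-terminal becomes a method call, so the method body mirrors the right-hand side symbol by symbol.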
Example: parsing a loop
- FOR_STATEMENT ::= ITERATION_SCHEME loop STATEMENTS end loop;
    Node1 := find_iteration_scheme;      -- call function
    get token loop
    List1 := sequence of statements;     -- call function
    get token end
    get token loop
    get token semicolon
    Result := build loop_node with Node1 and List1
    return Result
Problem:
- If there are multiple productions for a non-terminal, a mechanism is required to determine which production to use:
    IF_STAT ::= if COND then STATS end if;
    IF_STAT ::= if COND then STATS ELSIF_PART end if;
- When the next token is if, which production should be used?
One solution: factorize the grammar
- If several productions have the same prefix, rewrite them as a single production:
    IF_STAT ::= if COND then STATS [ELSIF_PART] end if;
- The problem now reduces to recognizing whether an optional component (ELSIF_PART) is present.
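A rough Python sketch of how the factored production might be recognized with one token of lookahead. The function name `parse_if` and its helpers are hypothetical, and the input is simplified: COND and each statement are single tokens, and ELSIF_PART is assumed to be a sequence of `elsif COND then STATS` clauses.

```python
def parse_if(tokens):
    """IF_STAT ::= if COND then STATS [ELSIF_PART] end if ;"""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(terminal):
        nonlocal pos
        if peek() != terminal:
            raise SyntaxError(f"expected {terminal!r}, got {peek()!r}")
        pos += 1

    def take():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tok

    def stats():
        # Statements run until a token that can only follow STATS.
        result = []
        while peek() not in ("elsif", "end", None):
            result.append(take())
        return result

    def branch():
        cond = take()
        expect("then")
        return (cond, stats())

    expect("if")
    branches = [branch()]
    # The next token decides whether the optional component is present.
    while peek() == "elsif":
        expect("elsif")
        branches.append(branch())
    expect("end")
    expect("if")
    expect(";")
    return ("if", branches)


print(parse_if("if c1 then s1 elsif c2 then s2 end if ;".split()))
```

After the common prefix has been consumed, a single test on the next token (`elsif` or `end`) is enough to choose the path, which is what factoring buys.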
Second problem: recursion
- The grammar should not be left-recursive:
    E ::= E + T | T
- Problem: to find an E, start by finding an E...
  - The original scheme leads to an infinite loop
  - The grammar is inappropriate for recursive descent
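To make the infinite loop concrete, here is a small Python sketch that translates E ::= E + T | T literally into a procedure. The names `parse_e` and `parse_t` are illustrative assumptions. Python's recursion limit turns the endless descent into a `RecursionError`.

```python
def parse_e(tokens, pos=0):
    # E ::= E + T | T, translated literally: the first step of finding an E
    # is a call to parse_e at the same position, so no token is ever consumed.
    left, pos = parse_e(tokens, pos)
    if pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_t(tokens, pos + 1)
        return ("+", left, right), pos
    return left, pos


def parse_t(tokens, pos):
    return tokens[pos], pos + 1


try:
    parse_e(["a", "+", "b"])
    outcome = "parsed"
except RecursionError:
    # The interpreter aborts the descent once the call stack is too deep.
    outcome = "infinite recursion"
print(outcome)
```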
Solution to left recursion
- E ::= E + T | T means that eventually E expands into T + T + ... + T
- Rewrite as:
  - E ::= T E'
  - E' ::= + T E' | epsilon
- Informally: E' is a possibly empty sequence of terms, each preceded by an operator
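The rewrite can be expressed mechanically. Below is a minimal Python sketch for immediate left recursion only. The function name, the tuple encoding of right-hand sides, and the use of the empty tuple for epsilon are assumptions for illustration. It also assumes there is no production of the form A ::= A.

```python
EPSILON = ()  # hypothetical encoding: the empty tuple stands for epsilon


def remove_immediate_left_recursion(nonterminal, productions):
    """Rewrite A ::= A a | b as A ::= b A' and A' ::= a A' | epsilon.

    `productions` is a list of right-hand sides, each a tuple of symbols.
    Returns a dict mapping each non-terminal to its new right-hand sides.
    """
    # Split alternatives into left-recursive ones (minus the leading A)
    # and all the others.
    recursive = [rhs[1:] for rhs in productions if rhs[:1] == (nonterminal,)]
    others = [rhs for rhs in productions if rhs[:1] != (nonterminal,)]
    if not recursive:
        return {nonterminal: list(productions)}
    if not others:
        raise ValueError("no non-recursive alternative to start from")
    tail = nonterminal + "'"
    return {
        nonterminal: [rhs + (tail,) for rhs in others],
        tail: [alpha + (tail,) for alpha in recursive] + [EPSILON],
    }


grammar = remove_immediate_left_recursion("E", [("E", "+", "T"), ("T",)])
print(grammar)
```

For the expression grammar this yields E ::= T E' and E' ::= + T E' | epsilon, matching the rewrite on the slide.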
Recursion can involve multiple productions
- A ::= B C | D
- B ::= A E | F
  - Can be rewritten as: A ::= A E C | F C | D
  - Now apply the previous method
  - There is a general algorithm to detect and remove left recursion
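The detection half can be sketched in Python as a reachability check over "can begin with" edges. This is an illustrative simplification, not the full general algorithm: it assumes no non-terminal derives epsilon, treats any symbol without productions as a terminal, and the function name is hypothetical.

```python
def left_recursive_nonterminals(grammar):
    """Return the non-terminals A that can derive a string starting with A.

    `grammar` maps each non-terminal to a list of right-hand sides (tuples).
    Assumption: no non-terminal derives epsilon, so only the first symbol of
    each right-hand side matters.
    """
    # Edge A -> B when some production of A begins with non-terminal B.
    first_edges = {
        a: {rhs[0] for rhs in rhss if rhs and rhs[0] in grammar}
        for a, rhss in grammar.items()
    }

    def reachable(start):
        seen, stack = set(), list(first_edges[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(first_edges[node])
        return seen

    # A is left-recursive when it can reach itself through these edges.
    return {a for a in grammar if a in reachable(a)}


grammar = {
    "A": [("B", "C"), ("D",)],
    "B": [("A", "E"), ("F",)],
}
print(sorted(left_recursive_nonterminals(grammar)))
```

Here A reaches B and B reaches A, so both are reported even though neither production is directly left-recursive.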
Further problem
- The transformation does not preserve associativity:
  - E ::= E + T | T parses a + b + c as (a + b) + c
  - E ::= T E', E' ::= + T E' | epsilon parses a + b + c as a + (b + c)
  - This is incorrect for a - b - c: the tree must be rewritten
In practice: use a loop to find a sequence of terms
    Node1 := P_Term;                      -- call function that recognizes a term
    loop
       exit when Token not in Token_Class_Binary_Addop;
       Node2 := New_Node (P_Binary_Adding_Operator);
       Scan;                              -- past operator
       Set_Left_Opnd (Node2, Node1);
       Set_Right_Opnd (Node2, P_Term);    -- find next term
       Set_Op_Name (Node2);
       Node1 := Node2;                    -- operand for next operation
    end loop;
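A Python sketch of the same loop, set beside a literal right-recursive parser so the associativity difference is visible. All function names are hypothetical, terms are assumed to be single integer tokens, and trees are plain tuples.

```python
def parse_expr(tokens):
    """Parse TERM {(+|-) TERM} with a loop, producing a left-leaning tree."""
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("expected a term")
        tok = tokens[pos]
        pos += 1
        return tok

    node = term()
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        pos += 1                     # scan past the operator
        node = (op, node, term())    # previous result becomes the left operand
    return node


def parse_expr_right(tokens, pos=0):
    """E ::= T E' taken literally: the tree nests to the right."""
    left = tokens[pos]
    if pos + 1 < len(tokens) and tokens[pos + 1] in ("+", "-"):
        return (tokens[pos + 1], left, parse_expr_right(tokens, pos + 2))
    return left


def evaluate(node):
    if isinstance(node, tuple):
        op, left, right = node
        lhs, rhs = evaluate(left), evaluate(right)
        return lhs + rhs if op == "+" else lhs - rhs
    return int(node)


tokens = "10 - 4 - 3".split()
print(evaluate(parse_expr(tokens)))        # (10 - 4) - 3
print(evaluate(parse_expr_right(tokens)))  # 10 - (4 - 3)
```

The loop version groups to the left without any tree rewriting, because each new operator node takes the tree built so far as its left operand.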
LL(1) parsing
LL(1) grammars
- If table construction is successful, the grammar is LL(1): left-to-right scan, leftmost derivation, one-token lookahead.
- If construction fails, one can conceive of LL(2), etc.
- Ambiguous grammars are never LL(k).
- If a terminal is in First for two different productions of A, the grammar cannot be LL(1).
- Grammars with left recursion are never LL(k).
- Some useful constructs are not LL(k).
Building LL(1) parse tables
Table indexed by non-terminal and token. Each table entry is a production:
    for each production P: A ::= α loop
       for each terminal a in First (α) loop
          T (A, a) := P;
       end loop;
       if ε in First (α) then
          for each terminal b in Follow (A) loop
             T (A, b) := P;
          end loop;
       end if;
    end loop;
- All other entries are errors.
- If two assignments conflict, the parse table cannot be built.
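The table-building loop can be sketched in Python along these lines. The First and Follow computation by fixed-point iteration, the `"eps"` and `"$"` markers, and the dict-of-tuples grammar encoding are assumptions for illustration. An empty tuple stands for an epsilon production, and a symbol without productions is treated as a terminal.

```python
EPS = "eps"   # hypothetical marker for epsilon
END = "$"     # hypothetical end-of-input marker


def first_of(seq, first):
    """First set of a symbol sequence, given First sets of non-terminals."""
    result = set()
    for sym in seq:
        sym_first = first.get(sym, {sym})   # a terminal's First is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                         # every symbol can vanish
    return result


def build_table(grammar, start):
    first = {a: set() for a in grammar}
    follow = {a: set() for a in grammar}
    follow[start].add(END)
    changed = True
    while changed:                          # iterate to a fixed point
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                before = len(first[a])
                first[a] |= first_of(rhs, first)
                changed |= len(first[a]) != before
                for i, sym in enumerate(rhs):
                    if sym in grammar:
                        rest = first_of(rhs[i + 1:], first)
                        before = len(follow[sym])
                        follow[sym] |= rest - {EPS}
                        if EPS in rest:
                            follow[sym] |= follow[a]
                        changed |= len(follow[sym]) != before
    table = {}
    for a, rhss in grammar.items():
        for rhs in rhss:
            rhs_first = first_of(rhs, first)
            lookaheads = rhs_first - {EPS}
            if EPS in rhs_first:
                lookaheads |= follow[a]     # production may derive epsilon
            for tok in lookaheads:
                if (a, tok) in table:
                    raise ValueError(f"conflict at ({a}, {tok}): not LL(1)")
                table[a, tok] = rhs
    return table


grammar = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("id",)],
}
table = build_table(grammar, "E")
print(table)
```

Missing keys play the role of error entries, and a conflicting assignment raises an exception instead of silently overwriting the earlier production.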
Left Recursion Removal & Left Factoring
- Left recursion removal:
- Left factoring:
- Syntax Tree Construction in LL(1)
- First and Follow Sets
- LL(k) Parsers (Extending the Lookahead)
- Error Recovery in Top Down Parsers
- Error Recovery in LL(1) Parsers