More Recursive Descent Parsing

Today: RD evaluator for infix arithmetic expressions, e.g., (5 + 3) - (2 + (4 - 7))

Recall from lecture

A grammar consists of:
  • Terminal symbols (“terminals”)
  • Non-terminal symbols (“non-terminals”)
  • Productions
  • Start symbol (a non-terminal)

Infix expression parsing grammar

Productions
    Program  → Expr EOF
    Expr     → Term RestExpr
    RestExpr → + Expr | - Expr | null
    Term     → n | ( Expr )

Which symbols are terminal? Which are not? Which should be the start symbol?

Writing an RD parser

  • Write a method for each production
  • Methods are mutually recursive
  • Methods access tokens in two ways:
      getNextToken()       Get the next token from the tokenizer
      putTokenBack(token)  Return the token to the tokenizer
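The two token-access methods can be sketched as a small character-level tokenizer with one-token pushback. This is only an illustration: the class name `CharTokenizer`, the `EOF` sentinel, and the skipping of blanks are assumptions, not something the slides specify.

```java
// Hypothetical helper: a character-level tokenizer with one-token pushback.
class CharTokenizer {
    static final char EOF = '\0'; // assumed sentinel for end of input

    private final String input;
    private int pos = 0;

    CharTokenizer(String input) {
        this.input = input;
    }

    // Return the next non-blank character, or EOF when the input is exhausted.
    char getNextToken() {
        while (pos < input.length() && input.charAt(pos) == ' ') {
            pos++;
        }
        return pos < input.length() ? input.charAt(pos++) : EOF;
    }

    // Undo the most recent getNextToken() call so the same token is read again.
    // Only the most recently returned token can be put back.
    void putTokenBack(char token) {
        if (token != EOF) {
            pos--;
        }
    }
}
```

Pushback is what lets a production method peek at a token, decide it belongs to the caller, and leave it in place.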

Writing an RD parser

How to write this in Java?
  • For simplicity, our example uses characters as tokens
      • Only handles single-digit numbers
  • Evaluate an expression recursively
      • Our production methods return integers
  • Parsers often build syntax trees instead
      • Methods return nodes in a tree
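To make the syntax-tree alternative concrete, here is a minimal sketch of the node types a tree-building parser might return. The names `Node`, `Num`, and `BinOp` are hypothetical and do not come from the lecture.

```java
// Hypothetical node types for a syntax tree of + and - expressions.
interface Node {
    int eval();
}

// Leaf node holding a number.
class Num implements Node {
    final int value;

    Num(int value) {
        this.value = value;
    }

    public int eval() {
        return value;
    }
}

// Interior node: an operator applied to two subtrees.
class BinOp implements Node {
    final char op;
    final Node left;
    final Node right;

    BinOp(char op, Node left, Node right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }

    public int eval() {
        switch (op) {
            case '+': return left.eval() + right.eval();
            case '-': return left.eval() - right.eval();
            default: throw new IllegalStateException("unknown operator: " + op);
        }
    }
}
```

With types like these, a production method would return, for example, `new BinOp('+', left, right)` where the evaluator computes the sum directly. Evaluation then becomes a separate walk over the finished tree.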

Writing an RD parser

    // Program → Expr EOF
    int program() { return expr(); }

Writing an RD parser

    // Expr → Term RestExpr
    int expr() { return restExpr(term()); }

Writing an RD parser

    // RestExpr → + Expr | - Expr | null
    // The '/' and '*' branches go beyond the grammar shown, which lists only + and -.
    int restExpr(int trm) {
        char token = getNextToken();
        if (token == '+') return trm + expr();
        else if (token == '-') return trm - expr();
        else if (token == '/') return trm / expr();
        else if (token == '*') return trm * expr();
        else {
            putTokenBack(token);  // not an operator: leave it for the caller
            return trm;
        }
    }

Writing an RD parser

    // Term → n | ( Expr )
    int term() {
        char token = getNextToken();
        if (isNumber(token)) return numericValue(token);
        else if (token == '(') {
            int subexpr = expr();
            // Check with an exception rather than assert: Java skips assert
            // statements unless run with -ea, which would leave ')' unconsumed.
            if (getNextToken() != ')') throw new SyntaxError();
            return subexpr;
        }
        else throw new SyntaxError();
    }
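Put together, the production methods can be run as one small program. The following is a self-contained sketch under some assumptions: the class name `RdEvaluator`, the string-based token source, the `EOF` sentinel, and the explicit end-of-input check in `program()` are illustrative choices, and it covers only the + and - operators listed in the grammar.

```java
// A self-contained sketch of the evaluator; names not on the slides are assumptions.
class RdEvaluator {
    // Hypothetical unchecked exception standing in for the slides' SyntaxError.
    static class SyntaxError extends RuntimeException {
        SyntaxError(String message) {
            super(message);
        }
    }

    private static final char EOF = '\0'; // assumed end-of-input sentinel
    private final String input;
    private int pos = 0;

    RdEvaluator(String input) {
        this.input = input;
    }

    static int evaluate(String input) {
        return new RdEvaluator(input).program();
    }

    // Tokens are single characters; blanks are skipped.
    private char getNextToken() {
        while (pos < input.length() && input.charAt(pos) == ' ') {
            pos++;
        }
        return pos < input.length() ? input.charAt(pos++) : EOF;
    }

    // Undo the most recent getNextToken() call.
    private void putTokenBack(char token) {
        if (token != EOF) {
            pos--;
        }
    }

    // Program -> Expr EOF
    int program() {
        int value = expr();
        if (getNextToken() != EOF) {
            throw new SyntaxError("unexpected input after expression");
        }
        return value;
    }

    // Expr -> Term RestExpr
    int expr() {
        return restExpr(term());
    }

    // RestExpr -> + Expr | - Expr | null
    int restExpr(int trm) {
        char token = getNextToken();
        if (token == '+') return trm + expr();
        if (token == '-') return trm - expr();
        putTokenBack(token); // the null alternative: consume nothing
        return trm;
    }

    // Term -> n | ( Expr )
    int term() {
        char token = getNextToken();
        if (Character.isDigit(token)) return Character.getNumericValue(token);
        if (token == '(') {
            int subexpr = expr();
            if (getNextToken() != ')') throw new SyntaxError("expected ')'");
            return subexpr;
        }
        throw new SyntaxError("unexpected token: " + token);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("(5 + 3) - (2 + (4 - 7))")); // prints 9
    }
}
```

Because RestExpr recurses into Expr, this grammar groups operators to the right: `9 - 3 - 2` is evaluated as `9 - (3 - 2)`, giving 8 rather than the conventional 4.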

Writing an RD parser

Why can’t we just write Expr → Expr + Term?
  • An RD parser must avoid left recursion: the method for Expr would call itself before consuming any token, so it never terminates
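The problem is easy to reproduce. In the sketch below, `leftRecursiveExpr()` is a direct translation of Expr → Expr + Term; in Java the runaway recursion surfaces as a `StackOverflowError`. The loop-based `expr()` next to it is a common alternative that is not taken from the slides: it reads a term and then repeatedly consumes an operator and another term. The class name and the use of `IllegalStateException` for syntax errors are assumptions.

```java
// Sketch contrasting a left-recursive method with a loop-based rewrite.
class LeftRecursionDemo {
    private static final char EOF = '\0'; // assumed end-of-input sentinel
    private final String input;
    private int pos = 0;

    LeftRecursionDemo(String input) {
        this.input = input;
    }

    private char getNextToken() {
        while (pos < input.length() && input.charAt(pos) == ' ') {
            pos++;
        }
        return pos < input.length() ? input.charAt(pos++) : EOF;
    }

    private void putTokenBack(char token) {
        if (token != EOF) {
            pos--;
        }
    }

    // Direct translation of Expr -> Expr + Term. The first action is a call to
    // itself with no token consumed, so the recursion never reaches term().
    int leftRecursiveExpr() {
        int left = leftRecursiveExpr();
        getNextToken(); // would consume '+', but is never reached
        return left + term();
    }

    // Loop-based alternative: Expr -> Term { (+ | -) Term }.
    int expr() {
        int value = term();
        while (true) {
            char token = getNextToken();
            if (token == '+') {
                value += term();
            } else if (token == '-') {
                value -= term();
            } else {
                putTokenBack(token); // not an operator: leave it for the caller
                return value;
            }
        }
    }

    // Term -> n | ( Expr )
    int term() {
        char token = getNextToken();
        if (Character.isDigit(token)) return Character.getNumericValue(token);
        if (token == '(') {
            int value = expr();
            if (getNextToken() != ')') throw new IllegalStateException("expected ')'");
            return value;
        }
        throw new IllegalStateException("unexpected token: " + token);
    }
}
```

The loop accumulates from left to right, so `9 - 3 - 2` evaluates to 4 here, the grouping that the left-recursive production was meant to express.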