Programming Language Syntax 4 http://flic.kr/p/zCyMp

Recall input/output of parser …

Recall…
• CFG generates CF language
• Parser recognizes language
For any CFG, we can generate a parser that runs in O(n^3), where n is the length of the input string
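
One well-known way to reach the O(n^3) bound is the CYK algorithm. The slides do not name an algorithm, so CYK is an assumption here, as are the Chomsky-normal-form encoding of a small identifier-list grammar and every name in the sketch (`cyk_recognize`, `TERMINAL_RULES`, `BINARY_RULES`, the helper non-terminals `ID`, `COMMA`, `REST`). The three nested loops over span length, start position, and split point are where the cubic cost comes from.

```python
# Hypothetical grammar in Chomsky normal form (an assumption for illustration):
#   id_list      -> ID id_list_tail
#   id_list_tail -> COMMA REST | ";"
#   REST         -> ID id_list_tail
#   ID -> "id"      COMMA -> ","
TERMINAL_RULES = [("ID", "id"), ("COMMA", ","), ("id_list_tail", ";")]
BINARY_RULES = [
    ("id_list", ("ID", "id_list_tail")),
    ("id_list_tail", ("COMMA", "REST")),
    ("REST", ("ID", "id_list_tail")),
]


def cyk_recognize(tokens, terminal_rules, binary_rules, start):
    """Return True if the CNF grammar derives the token sequence."""
    n = len(tokens)
    if n == 0:
        return False
    # table[i][length] holds the non-terminals that derive tokens[i:i + length].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, token in enumerate(tokens):
        table[i][1] = {lhs for lhs, terminal in terminal_rules if terminal == token}
    for length in range(2, n + 1):              # span length
        for i in range(n - length + 1):         # span start
            for split in range(1, length):      # where the span is cut in two
                left = table[i][split]
                right = table[i + split][length - split]
                for lhs, (first, second) in binary_rules:
                    if first in left and second in right:
                        table[i][length].add(lhs)
    return start in table[0][n]


tokens = ["id", ",", "id", ",", "id", ";"]
print(cyk_recognize(tokens, TERMINAL_RULES, BINARY_RULES, "id_list"))  # True
```

The table-filling approach works for any grammar in this form, which is why it is general but slower than the linear-time parsers for restricted grammar classes.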

Classes of grammars that can be parsed in O(n)
• LL
  – Left-to-right, Left-most derivation
  – Aka top-down, predictive
  – ANTLR accepts a variant
• LR
  – Left-to-right, Right-most derivation
  – Aka bottom-up
  – YACC accepts a variant

LL parse of the token stream id(A) , id(B) , id(C) ;
• Start with the start rule (id_list)
• Predict replacement for id_list. Only one choice
• Read token and match terminal. What if terminal doesn’t match?
• Predict replacement for id_list_tail. Can’t decide? Peek!
• Read token, match terminal
• Read token, match terminal
• Predict replacement for id_list_tail. Can’t decide? Peek!
• Read token, match terminal
• Read token, match terminal
• Predict replacement for id_list_tail. Can’t decide? Peek!
• Read token, match terminal
• Done! No more non-terminals. No more input.
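
The predict-and-match steps above can be sketched as a small recursive-descent parser. The grammar itself was shown only in the slide images, so the three productions in the comment are an assumption reconstructed from the rule names id_list and id_list_tail. The token encoding and the names `parse_id_list`, `peek`, and `match` are likewise hypothetical.

```python
# Assumed grammar (reconstructed, not taken from the text):
#   id_list      -> id id_list_tail
#   id_list_tail -> , id id_list_tail
#   id_list_tail -> ;

def parse_id_list(tokens):
    """Return the identifier names in tokens, or raise SyntaxError."""
    pos = 0
    names = []

    def peek():
        # Look at the next token without consuming it.
        return tokens[pos] if pos < len(tokens) else ("eof", None)

    def match(kind):
        # Read a token and check it against the expected terminal.
        nonlocal pos
        found, value = peek()
        if found != kind:
            raise SyntaxError(f"expected {kind!r}, found {found!r} at token {pos}")
        pos += 1
        return value

    def id_list():
        # Only one production, so no peek is needed to predict it.
        names.append(match("id"))
        id_list_tail()

    def id_list_tail():
        # Two productions: peek at one token to choose between them.
        kind, _ = peek()
        if kind == ",":
            match(",")
            names.append(match("id"))
            id_list_tail()
        elif kind == ";":
            match(";")
        else:
            raise SyntaxError(f"expected ',' or ';', found {kind!r} at token {pos}")

    id_list()
    if pos != len(tokens):
        raise SyntaxError("unexpected input after ';'")
    return names


TOKENS = [("id", "A"), (",", None), ("id", "B"), (",", None), ("id", "C"), (";", None)]
print(parse_id_list(TOKENS))  # ['A', 'B', 'C']
```

In this sketch, a terminal that does not match is reported as a `SyntaxError`; a production parser would typically attempt some form of error recovery instead.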

Classes of grammars that can be parsed in O(n)
• LL
  – Left-to-right, Left-most derivation
  – Aka top-down, predictive
  – ANTLR uses LL(*)
• LR
  – Left-to-right, Right-most derivation
  – Aka bottom-up
  – YACC uses a variant (LALR(1))

LR parse of the token stream id(A) , id(B) , id(C) ;
• Shift: Read token, try to match right-hand side. No matches here! (Same result for each of the first five tokens.)
• Shift: Read token, try to match right-hand side. Finally, a match!
• Reduce: Replace with id_list_tail. Can we match again? Yes!
• Reduce: Replace with id_list_tail. Can we match again? Yes!
• Reduce: Replace with id_list_tail. Can we match again? Yes!
• Reduce: Replace with id_list. Done! We reached the start rule!
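
The shift and reduce steps above can be imitated with a deliberately naive loop: shift a token, then keep replacing any right-hand side found on top of the stack. This is only a sketch. Real LR parsers such as those generated by YACC drive the decision from precomputed tables, and the "longest right-hand side first" rule used here is an assumption that happens to suffice for this small grammar. The grammar is again the assumed reconstruction, and `shift_reduce` and `GRAMMAR_A` are hypothetical names.

```python
# Assumed grammar, as (left-hand side, right-hand side) pairs.
GRAMMAR_A = [
    ("id_list", ("id", "id_list_tail")),
    ("id_list_tail", (",", "id", "id_list_tail")),
    ("id_list_tail", (";",)),
]


def shift_reduce(tokens, grammar, start):
    """Return (accepted, action log, deepest stack size) for a naive shift-reduce run."""
    # Longer right-hand sides are tried first so ", id id_list_tail"
    # wins over "id id_list_tail" when both match the top of the stack.
    rules = sorted(grammar, key=lambda rule: len(rule[1]), reverse=True)
    stack, log, deepest = [], [], 0
    for token in tokens:
        stack.append(token)
        log.append(f"shift {token}")
        deepest = max(deepest, len(stack))
        reduced = True
        while reduced:
            reduced = False
            for lhs, rhs in rules:
                if tuple(stack[-len(rhs):]) == rhs:
                    del stack[-len(rhs):]
                    stack.append(lhs)
                    log.append(f"reduce to {lhs}")
                    reduced = True
                    break
    return stack == [start], log, deepest


accepted, log, deepest = shift_reduce(["id", ",", "id", ",", "id", ";"], GRAMMAR_A, "id_list")
print(accepted, deepest)  # True 6
print(log[6:])            # the four reductions, all after the last shift
```

All six tokens sit on the stack before the first reduction happens, which is the behaviour the walkthrough shows.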

Was this grammar better for LL or LR? LL because LR needed to shift the entire input before it could reduce (bad space performance)

Aside: What do these mean?
• LL(1) – Up to 1 token of look-ahead
• LL(2) – Up to 2 tokens of look-ahead
• LL(*) – Unlimited tokens of look-ahead (ANTLR 3)

Can you process grammar B with an LL(0) parser? Let’s try!
[Figure: Grammar A, Grammar B]

LL parse of id(A) , id(B) , id(C) ; with grammar B
• id_list: Can we predict the next replacement? Yes! id_list_prefix ;
• id_list_prefix ; Can we predict the next replacement? Nope. Let’s peek.
• Hm. Still can’t. The problem is the left recursion in id_list_prefix: both of its alternatives begin with id once expanded, so a fixed number of peeked tokens cannot settle the choice!
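
The same problem shows up if the left-recursive rule is transcribed directly into a recursive-descent function. The sketch below assumes grammar B contains the productions id_list_prefix -> id_list_prefix , id and id_list_prefix -> id (the grammar was shown only as an image). The function names are hypothetical.

```python
def parse_id_list_prefix(tokens, pos=0):
    """Naive top-down attempt at: id_list_prefix -> id_list_prefix , id | id."""
    # Trying the first alternative means parsing an id_list_prefix at the same
    # position before any token has been consumed, so the recursion never ends.
    pos = parse_id_list_prefix(tokens, pos)
    # Never reached: this is where "," and "id" would be matched.
    if tokens[pos:pos + 2] != [",", "id"]:
        raise SyntaxError("expected ', id'")
    return pos + 2


def overflows(tokens):
    """Return True if the naive parser exhausts Python's call stack."""
    try:
        parse_id_list_prefix(tokens)
    except RecursionError:
        return True
    return False


print(overflows(["id", ",", "id", ",", "id", ";"]))  # True
```

No input token is ever read before the overflow, so the outcome does not depend on the input at all.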

Now let’s see how LR does…

LR parse of id(A) , id(B) , id(C) ; with grammar B
• First shift: id(A). Can we reduce? Yes! Right off the bat!
• Reduce! id(A) becomes id_list_prefix
• Next shift: the comma. Can we reduce? Nope. Not yet.
• Next shift: id(B). Can we reduce? Yes!
• Reduce! id_list_prefix , id(B) becomes id_list_prefix. And so on…
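
To make the space difference concrete, the sketch below runs a naive shift-reduce loop over both grammars and records the deepest stack it needed. Both grammars are assumptions reconstructed from the rule names, since the originals were images. The greedy "longest right-hand side first" reduction stands in for real LR tables, and `max_stack_depth` and `id_tokens` are hypothetical helpers.

```python
# Assumed right-recursive grammar A and left-recursive grammar B.
GRAMMAR_A = [
    ("id_list", ("id", "id_list_tail")),
    ("id_list_tail", (",", "id", "id_list_tail")),
    ("id_list_tail", (";",)),
]
GRAMMAR_B = [
    ("id_list", ("id_list_prefix", ";")),
    ("id_list_prefix", ("id_list_prefix", ",", "id")),
    ("id_list_prefix", ("id",)),
]


def max_stack_depth(tokens, grammar, start):
    """Parse with greedy shift-reduce and return the deepest stack size seen."""
    rules = sorted(grammar, key=lambda rule: len(rule[1]), reverse=True)
    stack, deepest = [], 0
    for token in tokens:
        stack.append(token)                      # shift
        deepest = max(deepest, len(stack))
        while True:
            for lhs, rhs in rules:
                if tuple(stack[-len(rhs):]) == rhs:
                    stack[-len(rhs):] = [lhs]    # reduce
                    break
            else:
                break                            # nothing on top matches
    if stack != [start]:
        raise SyntaxError("input rejected")
    return deepest


def id_tokens(count):
    """Token kinds for a list of `count` identifiers, e.g. id , id , id ;"""
    return ["id", ","] * (count - 1) + ["id", ";"]


print(max_stack_depth(id_tokens(50), GRAMMAR_A, "id_list"))  # 100
print(max_stack_depth(id_tokens(50), GRAMMAR_B, "id_list"))  # 3
```

With right recursion the stack grows with the input, while with left recursion it stays at three symbols however long the list is.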

Differences between LL and LR grammars
• LL likes this one: Recursion on the right
• LR likes this one: Recursion on the left
Since ANTLR is LL, we should favor right recursion (although ANTLR 4 allows direct left recursion)

What’s next?
• Tuesday is a study day
  – Attendance optional
  – Prof Fleming out of town
• Exam 1 in one week (Thu)!
• Homework 2 due in one week (Thu)!