Bottom-Up Parsing Algorithms
- LR(k) parsing
  - L: scan input Left to right
  - R: produce Rightmost derivation (in reverse)
  - k tokens of lookahead
- LR(0)
  - zero tokens of lookahead
- SLR
  - Simple LR: like LR(0), but uses FOLLOW sets to build more "precise" parsing tables
  - LR(0) is a toy, so we focus on SLR
- Reading: Section 4.7

Problem: when to shift, when to reduce?
- Recall our favorite grammar:
    E → T + E | T
    T → int * T | int | (E)
- The step T * int + int is not part of any rightmost derivation
- Hence, reducing the first int to T was a mistake
- How to know when to reduce and when to shift?

What we need for LR parsing
- LR(0) states
  - describe states in which the parser can be
  - Note: LR(0) states are used by both LR(0) and SLR parsers
- Parsing tables
  - transitions between LR(0) states
  - actions to take when transiting: shift, reduce, accept, error
- How to construct LR(0) states?
- How to construct parsing tables?
- How to drive the parser?

LR(0) state = set of LR(0) items
- An LR(0) item [X → α.β] says that
  - the parser is looking for an X
  - it has an α on top of the stack
  - it expects to find an input string derived from β
- Notes:
  - [X → α.aβ] means that if a is on the input, it can be shifted (resulting in [X → αa.β]). That is:
    - a is a correct token to see on the input, and
    - shifting a would not "over-shift" (still a viable prefix).
  - [X → α.] means that we could reduce α to X
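One possible way to represent an LR(0) item in code is sketched below. The `Item` class and its method names are illustrative assumptions, not part of the slides; the sketch only shows how the dot position separates what is on the stack (α) from what is still expected (β).

```python
from typing import NamedTuple, Optional, Tuple


class Item(NamedTuple):
    """LR(0) item [lhs -> rhs[:dot] . rhs[dot:]] (hypothetical helper)."""
    lhs: str
    rhs: Tuple[str, ...]
    dot: int

    def next_symbol(self) -> Optional[str]:
        # Symbol right after the dot, or None for a complete item [X -> α.].
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def advance(self) -> "Item":
        # Moving the dot over one symbol: [X -> α.aβ] becomes [X -> αa.β].
        return Item(self.lhs, self.rhs, self.dot + 1)

    def __str__(self) -> str:
        body = list(self.rhs)
        body.insert(self.dot, ".")
        return f"[{self.lhs} -> {' '.join(body)}]"


item = Item("T", ("int", "*", "T"), 1)   # [T -> int . * T]
shifted = item.advance()                  # [T -> int * . T]
```

Here `item.next_symbol()` is `*`, so `*` is a token that may be shifted; an item whose `next_symbol()` is `None` is a candidate for a reduce.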

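LR(0) states
[Figure: the LR(0) DFA for the grammar. Its states and transitions are listed below.]
State 1 (start): S' → .E, E → .T, E → .T + E, T → .(E), T → .int * T, T → .int
State 2:  S' → E.
State 3:  E → T., E → T. + E
State 4:  T → (.E), E → .T, E → .T + E, T → .(E), T → .int * T, T → .int
State 5:  T → int. * T, T → int.
State 6:  E → T + .E, E → .T, E → .T + E, T → .(E), T → .int * T, T → .int
State 7:  T → (E.)
State 8:  T → int * .T, T → .(E), T → .int * T, T → .int
State 9:  E → T + E.
State 10: T → (E).
State 11: T → int * T.
Transitions:
  1 --E--> 2, 1 --T--> 3, 1 --(--> 4, 1 --int--> 5
  3 --+--> 6
  4 --E--> 7, 4 --T--> 3, 4 --(--> 4, 4 --int--> 5
  5 --*--> 8
  6 --E--> 9, 6 --T--> 3, 6 --(--> 4, 6 --int--> 5
  7 --)--> 10
  8 --T--> 11, 8 --(--> 4, 8 --int--> 5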

Naïve SLR Parsing Algorithm
1. Let M be the LR(0) state machine for G
   - each state contains a set I of LR(0) items
2. Let |x1…xn$ be the initial configuration
3. Repeat until configuration is S'|$
   - Let α|w be the current configuration
   - Run M on the current stack α
   - If M rejects α, report parsing error
   - If M accepts α, let a be the next input (Items is the item set of the state M halts in)
     - Shift if [X → β.aγ] ∈ Items
     - Reduce if [X → β.] ∈ Items and a ∈ Follow(X): the configuration ...β | a... becomes ... | X a...
     - Report parsing error if neither applies
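The naïve algorithm can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the DFA transitions, complete items and Follow sets for the example grammar are written out by hand (state numbers follow the slides' figure), the names `TRANS`, `COMPLETE`, `FOLLOW`, `run_dfa` and `naive_parse` are made up for illustration, and acceptance is taken to happen in configuration E|$ as in the slides' example run.

```python
# Transitions of the LR(0) DFA for the example grammar (hand-written).
TRANS = {
    (1, "E"): 2, (1, "T"): 3, (1, "("): 4, (1, "int"): 5,
    (3, "+"): 6,
    (4, "E"): 7, (4, "T"): 3, (4, "("): 4, (4, "int"): 5,
    (5, "*"): 8,
    (6, "E"): 9, (6, "T"): 3, (6, "("): 4, (6, "int"): 5,
    (7, ")"): 10,
    (8, "T"): 11, (8, "("): 4, (8, "int"): 5,
}
# Complete items [X -> β.] per state, stored as (X, β).
COMPLETE = {
    2: ("S'", ["E"]), 3: ("E", ["T"]), 5: ("T", ["int"]),
    9: ("E", ["T", "+", "E"]), 10: ("T", ["(", "E", ")"]),
    11: ("T", ["int", "*", "T"]),
}
FOLLOW = {"S'": {"$"}, "E": {")", "$"}, "T": {"+", ")", "$"}}


def run_dfa(stack):
    """Run the DFA from state 1 over the whole stack; None means reject."""
    state = 1
    for sym in stack:
        state = TRANS.get((state, sym))
        if state is None:
            return None
    return state


def naive_parse(tokens):
    """Return the list of actions taken, or raise ValueError on an error."""
    stack, rest, actions = [], list(tokens) + ["$"], []
    while True:
        state = run_dfa(stack)            # rerun from scratch at every step
        if state is None:
            raise ValueError(f"invalid stack {stack}")
        a = rest[0]
        if state == 2 and a == "$":       # item [S' -> E.] with $ as next input
            actions.append("accept")
            return actions
        if (state, a) in TRANS:           # some item [X -> β.aγ] in the state
            stack.append(rest.pop(0))
            actions.append("shift")
        elif state in COMPLETE and a in FOLLOW[COMPLETE[state][0]]:
            lhs, rhs = COMPLETE[state]
            del stack[len(stack) - len(rhs):]
            rest.insert(0, lhs)           # ...β | a...  becomes  ... | X a...
            actions.append(f"reduce {lhs} -> {' '.join(rhs)}")
        else:
            raise ValueError(f"parse error at {a!r}")


trace = naive_parse(["int", "*", "int"])
```

Note that a reduce puts the nonterminal back on the input, so the next step shifts it; this mirrors the configurations used in the slides.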

Notes
- If there is a conflict in the last step, the grammar is not SLR(k)
- k is the amount of lookahead
  - In practice k = 1

SLR Example
Configuration     DFA Halt State    Action
| int * int $     1

Configuration | int * int $
[Figure: the LR(0) DFA shown above, run on this configuration's stack]

Configuration int | * int $
[Figure: the LR(0) DFA shown above, run on this configuration's stack]

Configuration int * | int $
[Figure: the LR(0) DFA shown above, run on this configuration's stack]

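SLR Example
Configuration     DFA Halt State    Action
| int * int $     1                 shift
int | * int $     5                 * not in Follow(T): shift
int * | int $     8                 shift
int * int | $     5                 $ in Follow(T): reduce T → int
int * | T $       8                 shift
int * T | $       11                $ in Follow(T): reduce T → int * T
| T $             1                 shift
T | $             3                 $ in Follow(E): reduce E → T
| E $             1                 shift
E | $             2                 accept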

Configuration int * int | $
Configuration int * | T $
Configuration int * T | $
Configuration | T $
Configuration T | $
Configuration | E $
Configuration E | $
[Figure for each configuration: the LR(0) DFA shown above, run on that configuration's stack]

Notes
- Can also use one more state
  - it accepts in state "S' → E $."
  - i.e., it accepts in configuration E$|, not in E|$
- Rerunning the automaton at each step is wasteful
  - Most of the work is repeated

An Improvement
- Remember the state of the automaton on each prefix of the stack
- Change stack to contain pairs ⟨ DFA State, Symbol ⟩

An Improvement (Cont.)
- For a stack ⟨state_1, sym_1⟩ ... ⟨state_n, sym_n⟩, state_n is the final state of the DFA on sym_1 … sym_n
- Detail: bottom of stack is ⟨start, any⟩ where
  - any is any dummy input
  - start is the start state of the DFA

Goto Table
- Define Goto[i, A] = j if state_i --A--> state_j
- Goto is just the transition function of the DFA
  - One of two parsing tables
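As a rough illustration of the improvement, the Goto table can be stored as a dictionary and the stack as a list of ⟨state, symbol⟩ pairs, so that pushing a symbol needs only one table lookup instead of a rerun of the DFA. The names `GOTO` and `push` are assumptions made for this sketch; the entries are the DFA transitions for the example grammar.

```python
# Goto[i, A] = j, written out by hand for the example grammar's LR(0) DFA.
GOTO = {
    (1, "E"): 2, (1, "T"): 3, (1, "("): 4, (1, "int"): 5,
    (3, "+"): 6,
    (4, "E"): 7, (4, "T"): 3, (4, "("): 4, (4, "int"): 5,
    (5, "*"): 8,
    (6, "E"): 9, (6, "T"): 3, (6, "("): 4, (6, "int"): 5,
    (7, ")"): 10,
    (8, "T"): 11, (8, "("): 4, (8, "int"): 5,
}


def push(stack, symbol):
    """Push <state, symbol>, starting from the state stored on top of the stack."""
    top_state = stack[-1][0]
    stack.append((GOTO[(top_state, symbol)], symbol))


stack = [(1, None)]              # <start, any>: dummy symbol at the bottom
for sym in ["int", "*", "int"]:
    push(stack, sym)
# The top pair now records the DFA state reached on the prefix int * int.
```

Because every pair remembers the state reached on its prefix, popping pairs during a reduce immediately exposes the state to continue from.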

Refined Parser Moves
- Shift x
  - Push ⟨x, a⟩ on the stack
  - a is current input
  - x is a DFA state
- Reduce X → α
  - As before
- Accept
- Error

Action Table
For each state s_i and terminal a
- If s_i has item X → α.aβ and Goto[i, a] = j then Action[i, a] = shift j
- If s_i has item X → α. and a ∈ Follow(X) and X ≠ S' then Action[i, a] = reduce X → α
- If s_i has item S' → S. then Action[i, $] = accept
- Otherwise, Action[i, a] = error
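The four rules translate fairly directly into code. The sketch below is one possible rendering, assuming the item sets, Goto table and Follow sets of the example grammar are given as hand-written Python data; items are plain strings with a spaced dot, and all names (`STATES`, `GOTO`, `FOLLOW`, `build_action`) are hypothetical. Missing dictionary entries play the role of error.

```python
TERMINALS = {"int", "*", "+", "(", ")", "$"}
FOLLOW = {"E": {")", "$"}, "T": {"+", ")", "$"}}
E_CLOSURE = ["E -> . T", "E -> . T + E"]
T_CLOSURE = ["T -> . ( E )", "T -> . int * T", "T -> . int"]
STATES = {
    1: ["S' -> . E"] + E_CLOSURE + T_CLOSURE,
    2: ["S' -> E ."],
    3: ["E -> T .", "E -> T . + E"],
    4: ["T -> ( . E )"] + E_CLOSURE + T_CLOSURE,
    5: ["T -> int . * T", "T -> int ."],
    6: ["E -> T + . E"] + E_CLOSURE + T_CLOSURE,
    7: ["T -> ( E . )"],
    8: ["T -> int * . T"] + T_CLOSURE,
    9: ["E -> T + E ."],
    10: ["T -> ( E ) ."],
    11: ["T -> int * T ."],
}
GOTO = {
    (1, "E"): 2, (1, "T"): 3, (1, "("): 4, (1, "int"): 5,
    (3, "+"): 6,
    (4, "E"): 7, (4, "T"): 3, (4, "("): 4, (4, "int"): 5,
    (5, "*"): 8,
    (6, "E"): 9, (6, "T"): 3, (6, "("): 4, (6, "int"): 5,
    (7, ")"): 10,
    (8, "T"): 11, (8, "("): 4, (8, "int"): 5,
}


def build_action(states, goto, follow):
    action = {}

    def put(i, a, entry):
        # Two different entries in one cell mean the grammar is not SLR.
        if action.setdefault((i, a), entry) != entry:
            raise ValueError(f"conflict in state {i} on {a!r}")

    for i, items in states.items():
        for item in items:
            lhs, body = item.split(" -> ")
            before, after = body.split(".")
            alpha, rest = tuple(before.split()), after.split()
            if rest:                                  # item X -> α.aβ
                a = rest[0]
                if a in TERMINALS and (i, a) in goto:
                    put(i, a, ("shift", goto[(i, a)]))
            elif lhs == "S'":                         # item S' -> S.
                put(i, "$", ("accept",))
            else:                                     # item X -> α.
                for a in follow[lhs]:
                    put(i, a, ("reduce", lhs, alpha))
    return action


ACTION = build_action(STATES, GOTO, FOLLOW)
```

For example, state 5 contains both T → int. * T and T → int., yet there is no conflict: `*` leads to a shift, while the reduce is entered only for the tokens in Follow(T).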

SLR Parsing Algorithm
Let I = w$ be initial input
Let J = 1
Let DFA state 1 have item S' → .S
Let stack = ⟨ 1, dummy ⟩
repeat
    case Action[top_state(stack), I[J]] of
        shift k:       push ⟨ k, I[J] ⟩, J++
        reduce X → A:  pop |A| pairs,
                       replace I[J-|A|] to I[J-1] with X,
                       J = J - |A|
        accept:        halt normally
        error:         halt and report error
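A runnable sketch of this driver is given below, assuming a hand-written table for the example grammar in which the nonterminal columns E and T hold shift entries (so a reduce writes X back into the input and the next step shifts it, as in the pseudocode). The names `PRODS`, `TABLE` and `parse` are illustrative, and the index `j` is 0-based here whereas J above is 1-based.

```python
# Productions, numbered as in the slides' example table.
PRODS = {
    1: ("E", ["T", "+", "E"]), 2: ("E", ["T"]),
    3: ("T", ["int", "*", "T"]), 4: ("T", ["int"]), 5: ("T", ["(", "E", ")"]),
}
# sN = shift and go to state N, rN = reduce by production N, acc = accept.
TABLE = {
    1: {"int": "s5", "(": "s4", "E": "s2", "T": "s3"},
    2: {"$": "acc"},
    3: {"+": "s6", ")": "r2", "$": "r2"},
    4: {"int": "s5", "(": "s4", "E": "s7", "T": "s3"},
    5: {"*": "s8", "+": "r4", ")": "r4", "$": "r4"},
    6: {"int": "s5", "(": "s4", "E": "s9", "T": "s3"},
    7: {")": "s10"},
    8: {"int": "s5", "(": "s4", "T": "s11"},
    9: {")": "r1", "$": "r1"},
    10: {"+": "r5", ")": "r5", "$": "r5"},
    11: {"+": "r3", ")": "r3", "$": "r3"},
}


def parse(tokens):
    """Return the sequence of table actions, or raise SyntaxError."""
    inp = list(tokens) + ["$"]
    j = 0                                  # 0-based position in the input
    stack = [(1, None)]                    # <1, dummy>
    trace = []
    while True:
        state = stack[-1][0]               # only the top state is consulted
        act = TABLE[state].get(inp[j])
        if act is None:
            raise SyntaxError(f"unexpected {inp[j]!r} in state {state}")
        trace.append(act)
        if act == "acc":
            return trace
        if act[0] == "s":
            stack.append((int(act[1:]), inp[j]))
            j += 1
        else:
            lhs, rhs = PRODS[int(act[1:])]
            del stack[len(stack) - len(rhs):]   # pop |A| pairs
            inp[j - len(rhs):j] = [lhs]         # replace consumed symbols with X
            j -= len(rhs)


trace = parse(["int", "*", "int"])
```

Each step costs a single table lookup; the DFA is never rerun over the stack.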

Notes on SLR Parsing Algorithm
- Note that the algorithm uses only the DFA states and the input
  - The stack symbols are never used!
- However, we still need the symbols for semantic actions

Constructing SLR states
- LR(0) state machine
  - encodes all strings that are valid on the stack
  - each valid string is a configuration, and hence corresponds to a state of the LR(0) state machine
  - each state tells us what to do (shift or reduce?)

Example SLR Parse Table
State   int   *     +     (     )     $     E     T
1       s5                s4                s2    s3
2                                     acc
3                   s6          r2    r2
4       s5                s4                s7    s3
5             s8    r4          r4    r4
6       s5                s4                s9    s3
7                               s10
8       s5                s4                      s11
9                               r1    r1
10                  r5          r5    r5
11                  r3          r3    r3
Productions:
1: E → T + E
2: E → T
3: T → int * T
4: T → int
5: T → (E)

Example SLR Parse
(top of stack at the left)
Stack                          I              J    Act
<1,?>                          int * int $    1    s5
<5,int><1,?>                                  2    s8
<8,*><5,int><1,?>                             3    s5
<5,int><8,*><5,int><1,?>                      4    r4
<8,*><5,int><1,?>              int * T $      3    s11
<11,T><8,*><5,int><1,?>                       4    r3
<1,?>                          T $            1    s3
<3,T><1,?>                                    2    r2
<1,?>                          E $            1    s2
<2,E><1,?>                                    2    acc

Another Example
int * (int + int) * int $