LR Parsing – The Items Lecture 10 Mon, Feb 14, 2005

LR Parsers
- A bottom-up parser traces a rightmost derivation in reverse, from the bottom up.
- Such parsers typically use the LR algorithm and are called LR parsers.
  - L means process tokens from Left to right.
  - R means follow a Rightmost derivation.

LR Parsers
- Furthermore, in LR parsing, the production is applied only after the pattern has been matched.
- In LL (predictive) parsing, the production was selected first, and then the tokens were matched to it.

Rightmost Derivations
- Let the grammar be
    E → E + T | T
    T → T * F | F
    F → (E) | id | num

Rightmost Derivations
- A rightmost derivation of (id + num)*id is
    E ⇒ T
      ⇒ T*F
      ⇒ T*id
      ⇒ F*id
      ⇒ (E)*id
      ⇒ (E + T)*id
      ⇒ (E + F)*id
      ⇒ (E + num)*id
      ⇒ (T + num)*id
      ⇒ (F + num)*id
      ⇒ (id + num)*id.
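The derivation above can be checked mechanically. The following is a minimal sketch, assuming sentential forms are written as space-separated symbols; the names `PRODUCTIONS` and `is_rightmost_step` are illustrative and not part of the lecture.

```python
# Expression grammar from the slides; each body is a list of symbols.
PRODUCTIONS = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"], ["num"]],
}


def is_rightmost_step(before, after):
    """True if `after` is `before` with its rightmost nonterminal expanded."""
    positions = [i for i, sym in enumerate(before) if sym in PRODUCTIONS]
    if not positions:
        return False  # no nonterminal left to expand
    i = positions[-1]  # index of the rightmost nonterminal
    return any(
        before[:i] + body + before[i + 1:] == after
        for body in PRODUCTIONS[before[i]]
    )


steps = [
    "E", "T", "T * F", "T * id", "F * id", "( E ) * id",
    "( E + T ) * id", "( E + F ) * id", "( E + num ) * id",
    "( T + num ) * id", "( F + num ) * id", "( id + num ) * id",
]
forms = [s.split() for s in steps]
print(all(is_rightmost_step(a, b) for a, b in zip(forms, forms[1:])))  # True
```

Read from the last line back to the first, the same list is the sequence of reductions a bottom-up parser performs.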

LR Parsers
- An LR parser uses a parse table, an input buffer, and a stack of “states.”
- It performs three operations.
  - Shift a token from the input buffer to the stack.
  - Reduce the content of the stack by applying a production.
  - Go to a new state.

LR(0) Items
- To build an LR parse table, we must first find the LR(0) items.
- An LR(0) item is a production with a special marker (•) marking a position within the string on the right side of the production.
- A parser built from LR(0) items that uses FOLLOW sets to decide when to reduce is called an SLR parser (“simple” LR).

Example: LR(0) Items
- If the production is E → E + T, then the possible LR(0) items are
  - [E → • E + T]
  - [E → E • + T]
  - [E → E + • T]
  - [E → E + T •]
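A production with n symbols on its right side has n + 1 items, one per marker position. A minimal sketch, assuming an item is stored as a tuple (head, body, dot) where dot is the marker position; this representation and the names `lr0_items` and `show` are illustrative choices, not taken from the lecture.

```python
def lr0_items(head, body):
    """Return every LR(0) item of the production head -> body.

    dot == 0 puts the marker before the first symbol;
    dot == len(body) puts it after the last one.
    """
    return [(head, tuple(body), dot) for dot in range(len(body) + 1)]


def show(item):
    """Format an item in the bracket notation used on the slides."""
    head, body, dot = item
    symbols = list(body[:dot]) + ["•"] + list(body[dot:])
    return "[" + head + " → " + " ".join(symbols) + "]"


for item in lr0_items("E", ["E", "+", "T"]):
    print(show(item))
# [E → • E + T]
# [E → E • + T]
# [E → E + • T]
# [E → E + T •]
```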

LR(0) Items
- The interpretation of [A → α • β] is “We have processed α and we might process β next.”
- Whether we do actually process β will be borne out by the subsequent tokens.

LR Parsing
- We will build a PDA whose states are sets of LR(0) items.
- First we augment the grammar with a new start symbol S' and the production
    S' → S.
- This guarantees that the start symbol will not recurse: S' never appears on the right side of a production.

States of the PDA
- The initial state is called I0 (item set 0).
- State I0 is the closure of the set {[S' → • S]}.
- To form the closure of a set of items
  - For each item [A → α • B β] in the set and for each production B → γ in the grammar, add the item [B → • γ] to the set.
  - Let us call [B → • γ] an initial B-item.
  - Continue in this manner until there is no further change.
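The closure rule above can be sketched as a worklist loop. This is a minimal sketch, assuming the expression grammar from these slides augmented with E' → E, and assuming items are stored as tuples (head, body, dot); the names `GRAMMAR` and `closure` are illustrative.

```python
# Augmented expression grammar: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",), ("num",)],
}


def closure(items, grammar=GRAMMAR):
    """Add initial B-items for every nonterminal B that follows a marker."""
    result = set(items)
    worklist = list(items)
    while worklist:  # stops when no new item was added
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in grammar:
            nonterminal = body[dot]
            for production in grammar[nonterminal]:
                initial_item = (nonterminal, production, 0)
                if initial_item not in result:
                    result.add(initial_item)
                    worklist.append(initial_item)
    return frozenset(result)


I0 = closure({("E'", ("E",), 0)})
print(len(I0))  # 8
```

Returning a frozenset makes a state hashable, so states can later be compared and used as dictionary keys.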

Example: LR Parsing
- Continuing with our standard example, the augmented grammar is
    E' → E
    E → E + T | T
    T → T * F | F
    F → (E) | id | num

Example: LR Parsing
- The state I0 consists of the items in the closure of the item [E' → • E]:
    [E' → • E]
    [E → • E + T]
    [E → • T]
    [T → • T * F]
    [T → • F]
    [F → • (E)]
    [F → • id]
    [F → • num]

Transitions
- From each state there is a transition for each grammar symbol that immediately follows the marker in some item of that state.
- If an item in the state is [A → α • X β], then
  - The transition from that state occurs when the symbol X is processed.
  - The transition is to the state that is the closure of the set of all items [A → α X • β] obtained by moving the marker past X.
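The transition rule can be sketched as a function that moves the marker past X in every applicable item and then takes the closure. This is a minimal sketch under the same assumptions as before (items as (head, body, dot) tuples, the augmented expression grammar); the name `goto` is the conventional textbook name, used here as an illustration.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",), ("num",)],
}


def closure(items):
    """Repeatedly add initial B-items until nothing changes."""
    result = set(items)
    changed = True
    while changed:
        changed = False
        for _, body, dot in list(result):
            if dot < len(body) and body[dot] in GRAMMAR:
                for production in GRAMMAR[body[dot]]:
                    item = (body[dot], production, 0)
                    if item not in result:
                        result.add(item)
                        changed = True
    return frozenset(result)


def goto(state, symbol):
    """State reached from `state` when `symbol` is processed."""
    moved = {
        (head, body, dot + 1)
        for head, body, dot in state
        if dot < len(body) and body[dot] == symbol
    }
    return closure(moved)


I0 = closure({("E'", ("E",), 0)})
I1 = goto(I0, "E")
print(sorted(I1))
# [('E', ('E', '+', 'T'), 1), ("E'", ('E',), 1)]
```

An empty result, as for `goto(I0, ")")`, means the state has no transition on that symbol.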

Example: LR Parsing
- Thus, from the state I0, there will be transitions for the symbols E, T, F, (, id, and num.
- For example, on processing E, the items [E' → • E] and [E → • E + T] become [E' → E •] and [E → E • + T].

Example: LR Parsing
- Let state I1 be the closure of these items.
    I1: [E' → E •]
        [E → E • + T]
- Thus the PDA has the transition
    I0 --E--> I1

Example: LR Parsing
- Similarly we determine the other transitions from I0.
- Process T:
    I2: [E → T •]
        [T → T • * F]
- Process F:
    I3: [T → F •]

Example: LR Parsing
- Process (:
    I4: [F → ( • E)]
        [E → • E + T]
        [E → • T]
        [T → • T * F]
        [T → • F]
        [F → • (E)]
        [F → • id]
        [F → • num]

Example: LR Parsing
- Process id:
    I5: [F → id •]
- Process num:
    I6: [F → num •]

Example: LR Parsing
- Now find the transitions from states I1 through I6 to other states, and so on, until no new states appear.
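The "until no new states appear" loop can be sketched as follows. This is a minimal sketch under the same assumptions as the rest of the example (items as (head, body, dot) tuples); `canonical_collection` is an illustrative name, and the state numbers it assigns depend on the order in which symbols are visited, so they need not match the numbering I0, I1, ... on the slides.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",), ("num",)],
}


def closure(items):
    """Repeatedly add initial B-items until nothing changes."""
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                item = (body[dot], production, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Move the marker past `symbol`, then take the closure."""
    return closure({(h, b, d + 1) for h, b, d in state
                    if d < len(b) and b[d] == symbol})


def canonical_collection():
    """Return (states, transitions) of the LR(0) automaton."""
    start = closure({("E'", ("E",), 0)})
    states = [start]        # states[i] is the item set numbered i
    number = {start: 0}     # item set -> state number
    transitions = {}        # (state number, symbol) -> state number
    i = 0
    while i < len(states):  # the list grows while new states appear
        state = states[i]
        symbols = sorted({b[d] for _, b, d in state if d < len(b)})
        for symbol in symbols:
            target = goto(state, symbol)
            if target not in number:
                number[target] = len(states)
                states.append(target)
            transitions[(i, symbol)] = number[target]
        i += 1
    return states, transitions


states, transitions = canonical_collection()
print(len(states))  # 13
```

For this grammar the loop stops with 13 states, which agrees with the states I0 through I12 listed in these slides.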

Example: LR Parsing
- I7: [E → E + • T]
      [T → • T * F]
      [T → • F]
      [F → • (E)]
      [F → • id]
      [F → • num]

Example: LR Parsing
- I8: [T → T * • F]
      [F → • (E)]
      [F → • id]
      [F → • num]
- I9: [F → (E • )]
      [E → E • + T]

Example: LR Parsing
- I10: [E → E + T •]
       [T → T • * F]
- I11: [T → T * F •]
- I12: [F → (E) •]