LR-Grammars LR(0), LR(1), and LR(K)

Deterministic Context-Free Languages
  • DCFL: a family of languages that are accepted by a deterministic pushdown automaton (DPDA)
  • Many programming languages can be described by means of DCFLs

Prefix and Proper Prefix
  • Prefix (of a string)
      • Any number of leading symbols of that string
      • Example: abc
          • Prefixes: ε, a, ab, abc
  • Proper prefix (of a string)
      • A prefix of a string, but not the string itself
      • Example: abc
          • Proper prefixes: ε, a, ab

Prefix Property
  • A context-free language (CFL) L is said to have the prefix property if, whenever w is in L, no proper prefix of w is in L
  • Not considered a severe restriction
      • Why? Because we can easily convert a DCFL to a DCFL with the prefix property by introducing an endmarker
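The prefix property and the endmarker trick can be checked on a small finite sample of a language. The following is a minimal sketch; the function names and the sample set `L` are illustrative assumptions, not part of the slides.

```python
def proper_prefixes(w: str) -> list[str]:
    """All prefixes of w except w itself (the empty string included)."""
    return [w[:i] for i in range(len(w))]

def has_prefix_property(language: set[str]) -> bool:
    """True if no word in the (finite) language has a proper prefix in it."""
    return all(p not in language
               for w in language
               for p in proper_prefixes(w))

def add_endmarker(language: set[str], marker: str = "$") -> set[str]:
    """Append an endmarker that does not occur in any word of the language."""
    return {w + marker for w in language}

L = {"a", "ab", "abb"}                         # hypothetical sample language
print(has_prefix_property(L))                  # False: "a" is a proper prefix of "ab"
print(has_prefix_property(add_endmarker(L)))   # True
```

With the endmarker appended, a proper prefix of a word never ends in `$`, so it cannot itself be a word of the new language.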

Suffix and Proper Suffix
  • Suffix (of a string)
      • Any number of trailing symbols
  • Proper suffix
      • A suffix of a string, but not the string itself

Example Grammar
  • This is the grammar that will be used in many of the examples:
      • S' → Sc
      • S → SA | A
      • A → aSb | ab

LR-Grammar
  • Left-to-right scan of the input producing a rightmost derivation
  • Simply:
      • L stands for Left-to-right
      • R stands for rightmost derivation

LR-Items
  • An item (for a given CFG)
      • A production with a dot anywhere in the right side (including the beginning and end)
  • In the event of an ε-production B → ε
      • B → · is an item

Example: Items
  • Given our example grammar: S' → Sc, S → SA | A, A → aSb | ab
  • The items for the grammar are:
      • S' → ·Sc, S' → S·c, S' → Sc·
      • S → ·SA, S → S·A, S → SA·, S → ·A, S → A·
      • A → ·aSb, A → a·Sb, A → aS·b, A → aSb·, A → ·ab, A → a·b, A → ab·
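The item list can be generated mechanically by placing the dot at every position of every production. A minimal sketch, assuming a representation of productions as `(head, body)` tuples (the names `GRAMMAR`, `items` and `show` are illustrative):

```python
# Example grammar: S' → Sc, S → SA | A, A → aSb | ab
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]

def items(grammar):
    """Yield (head, body, dot) for every dot position 0..len(body)."""
    for head, body in grammar:
        for dot in range(len(body) + 1):
            yield head, body, dot

def show(item):
    """Render an item with the dot as '·'."""
    head, body, dot = item
    return f"{head} → {''.join(body[:dot])}·{''.join(body[dot:])}"

all_items = list(items(GRAMMAR))
print(len(all_items))   # 15
for item in all_items:
    print(show(item))
```

An ε-production is represented by an empty body, so it yields exactly one item, `B → ·`.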

Some Notation
  • ⇒* = zero or more steps in a derivation
  • ⇒*rm (rm written as a subscript) = rightmost derivation
  • ⇒rm = single step in a rightmost derivation

Right-Sentential Form
  • A sentential form that can be derived by a rightmost derivation
  • A string of terminals and variables α is called a sentential form if S ⇒* α

More Terms
  • Handle
      • A substring which matches the right-hand side of a production and represents one step in the rightmost derivation
  • Or more formally, a handle of a right-sentential form γ for CFG G
      • Is a substring β such that:
          • S ⇒*rm δAw ⇒rm δβw
          • δβw = γ
  • If the grammar is unambiguous and has no useless symbols:
      • The rightmost derivation of a right-sentential form, and therefore its handle, is unique

Example
  • Given our example grammar: S' → Sc, S → SA | A, A → aSb | ab
  • An example rightmost derivation:
      • S' ⇒rm Sc ⇒rm SAc ⇒rm SaSbc
  • Therefore we can say that:
      • SaSbc is a right-sentential form
      • The handle is aSb

More Terms
  • Viable prefix (of a right-sentential form γ)
      • Is any prefix of γ ending no farther right than the right end of a handle of γ
  • Complete item
      • An item where the dot is the rightmost symbol

Example
  • Given our example grammar: S' → Sc, S → SA | A, A → aSb | ab
  • An item A → α·β is valid for a viable prefix γ if there is a rightmost derivation S ⇒*rm δAw ⇒rm δαβw with δα = γ
  • The right-sentential form abc: S' ⇒*rm Ac ⇒rm abc
  • Valid items:
      • A → ab· for prefix ab
      • A → a·b for prefix a
      • A → ·ab for prefix ε
  • A → ab· is a complete item, which suggests that Ac is the right-sentential form preceding abc

LR(0)
  • Left-to-right scan of the input producing a rightmost derivation with a look-ahead (on the input) of 0 symbols
  • It is a restricted type of CFG
  • First in the family of LR-grammars
  • LR(0) grammars define exactly the DCFLs having the prefix property

Computing Sets of Valid Items
  • The definition of LR(0) and the method of accepting L(G) for an LR(0) grammar G by a DPDA depend on:
      • Knowing the set of valid items for each viable prefix γ
  • For every CFG G, the set of viable prefixes is a regular set
      • This regular set is accepted by an NFA whose states are the items for G

Continued
  • Given an NFA (whose states are the items for G) that accepts the regular set
      • We can apply the subset construction to this NFA and yield a DFA
      • The state of this DFA after reading a viable prefix γ is the set of valid items for γ

NFA M
  • NFA M recognizes the viable prefixes for the CFG G = (V, T, P, S)
  • M = (Q, V ∪ T, δ, q0, Q)
      • Q = set of items for G plus state q0
  • Three rules
      • δ(q0, ε) = {S → ·α | S → α is a production}
      • δ(A → α·Bβ, ε) = {B → ·γ | B → γ is a production}
          • Allows expansion of a variable B appearing immediately to the right of the dot
      • δ(A → α·Xβ, X) = {A → αX·β}
          • Permits moving the dot over any grammar symbol X if X is the next input symbol

Theorem 10.9
  • The NFA M has the property that δ(q0, γ) contains A → α·β iff A → α·β is valid for γ
  • This theorem gives a method for computing the sets of valid items for any viable prefix
      • Note: M is an NFA. It can be converted to a DFA. Then, by inspecting each state, it can be determined whether G is an LR(0) grammar
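The set δ(q0, γ) can be computed by simulating the NFA on γ, which amounts to building DFA states on the fly. The sketch below is one possible way to do this for the example grammar; `closure` follows the ε-moves of the first two rules, `goto` applies the third rule, and all names are illustrative assumptions.

```python
# Example grammar: S' → Sc, S → SA | A, A → aSb | ab
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]
VARIABLES = {head for head, _ in GRAMMAR}
START = "S'"

def closure(items):
    """Follow ε-moves: add B → ·γ whenever the dot stands before variable B."""
    result = set(items)
    work = list(result)
    while work:
        _, body, dot = work.pop()
        if dot < len(body) and body[dot] in VARIABLES:
            for head, new_body in GRAMMAR:
                item = (head, new_body, 0)
                if head == body[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol`, then follow ε-moves."""
    return closure({(h, b, d + 1) for h, b, d in items
                    if d < len(b) and b[d] == symbol})

def valid_items(prefix):
    """Items valid for `prefix`; the empty set if it is not a viable prefix."""
    state = closure({(h, b, 0) for h, b in GRAMMAR if h == START})
    for symbol in prefix:
        state = goto(state, symbol)
    return state

def show(item):
    head, body, dot = item
    return f"{head} → {''.join(body[:dot])}·{''.join(body[dot:])}"

print(sorted(show(i) for i in valid_items("ab")))   # ['A → ab·']
print(sorted(show(i) for i in valid_items("a")))
```

Here a prefix is passed as a string because every grammar symbol that can occur in a viable prefix of this grammar is a single character; a list of symbols would work the same way.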

Definition of LR(0) Grammar
  • G is an LR(0) grammar if
      • The start symbol does not appear on the right side of any production
      • For every viable prefix γ of G, whenever A → α· is a complete item valid for γ, it is unique
          • i.e., there are no other complete items (and there are no items with a terminal to the right of the dot) that are valid for γ
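The definition can be tested by building every reachable DFA state and inspecting it. This is a sketch under the assumption that the grammar has no useless symbols, so that each non-empty reachable state is the set of valid items for some viable prefix; `is_lr0` and the second grammar `BAD` are hypothetical names introduced for illustration.

```python
def is_lr0(grammar, start):
    """Check the two LR(0) conditions on a list of (head, body) productions."""
    variables = {head for head, _ in grammar}
    symbols = variables | {x for _, body in grammar for x in body}

    def closure(items):
        result, work = set(items), list(items)
        while work:
            _, body, dot = work.pop()
            if dot < len(body) and body[dot] in variables:
                for head, new_body in grammar:
                    item = (head, new_body, 0)
                    if head == body[dot] and item not in result:
                        result.add(item)
                        work.append(item)
        return frozenset(result)

    def goto(items, symbol):
        return closure({(h, b, d + 1) for h, b, d in items
                        if d < len(b) and b[d] == symbol})

    # Condition 1: the start symbol is on no right side.
    if any(start in body for _, body in grammar):
        return False

    # Condition 2: a complete item must be alone among complete items and
    # must not share its state with an item whose dot precedes a terminal.
    initial = closure({(h, b, 0) for h, b in grammar if h == start})
    seen, work = {initial}, [initial]
    while work:
        state = work.pop()
        complete = [it for it in state if it[2] == len(it[1])]
        before_terminal = [it for it in state
                           if it[2] < len(it[1]) and it[1][it[2]] not in variables]
        if complete and (len(complete) > 1 or before_terminal):
            return False
        for symbol in symbols:
            nxt = goto(state, symbol)
            if nxt and nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    return True

EXAMPLE = [("S'", ("S", "c")), ("S", ("S", "A")), ("S", ("A",)),
           ("A", ("a", "S", "b")), ("A", ("a", "b"))]
BAD = [("S'", ("S", "c")), ("S", ("a",)), ("S", ("a", "b"))]

print(is_lr0(EXAMPLE, "S'"))   # True
print(is_lr0(BAD, "S'"))       # False: S → a· and S → a·b are both valid for "a"
```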

Facts we now know:
  • Every LR(0) grammar generates a DCFL
  • Every DCFL with the prefix property has an LR(0) grammar
  • Every language with an LR(0) grammar has the prefix property
  • L is a DCFL iff L$ has an LR(0) grammar, where $ is an endmarker not in the alphabet of L

DPDAs from LR(0) Grammars
  • We trace out the rightmost derivation in reverse
  • The stack holds a viable prefix (of a right-sentential form) and the current state (of the DFA)
      • Viable prefix: X1X2…Xk
      • States: s0, s1, …, sk
      • Stack: s0 X1 s1 … Xk sk

Reduction
  • If sk contains A → α·
      • Then A → α· is valid for X1X2…Xk
      • α is a suffix of X1X2…Xk
      • Let α = Xi+1…Xk
      • There is a w such that X1…Xk w is a right-sentential form

Reduction Continued
  • There is a derivation: S ⇒*rm X1…Xi A w ⇒rm X1…Xk w
  • To obtain the right-sentential form preceding X1…Xk w in a rightmost derivation, we reduce α to A
      • Therefore, we pop Xi+1…Xk from the stack and push A onto the stack

Shift
  • If sk contains only incomplete items
      • Then the right-sentential form preceding X1…Xk w cannot be formed using a reduction
      • Instead we simply "shift" the next input symbol onto the stack
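The reduce and shift moves can be put together into a small simulation of the parser for the example grammar. This is a sketch, not the DPDA construction itself: the stack holds (symbol, DFA state) pairs, the DFA states are computed on demand, and the names `closure`, `goto` and `parse` are illustrative assumptions.

```python
# Example grammar: S' → Sc, S → SA | A, A → aSb | ab
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]
VARIABLES = {head for head, _ in GRAMMAR}
START = "S'"

def closure(items):
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot < len(body) and body[dot] in VARIABLES:
            for head, new_body in GRAMMAR:
                item = (head, new_body, 0)
                if head == body[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def goto(items, symbol):
    return closure({(h, b, d + 1) for h, b, d in items
                    if d < len(b) and b[d] == symbol})

def parse(word):
    """Return the reductions performed (a rightmost derivation in reverse),
    or None if `word` is rejected."""
    stack = [(None, closure({(h, b, 0) for h, b in GRAMMAR if h == START}))]
    pos, reductions = 0, []
    while True:
        state = stack[-1][1]
        complete = [it for it in state if it[2] == len(it[1])]
        if complete:                                # reduce: sk has A → α·
            head, body, _ = complete[0]             # unique, since G is LR(0)
            del stack[len(stack) - len(body):]      # pop Xi+1…Xk
            reductions.append(f"{head} → {''.join(body)}")
            if head == START:                       # reduced to the start symbol
                return reductions if pos == len(word) else None
            stack.append((head, goto(stack[-1][1], head)))   # push A
        else:                                       # shift: only incomplete items
            if pos == len(word):
                return None
            nxt = goto(state, word[pos])
            if not nxt:
                return None
            stack.append((word[pos], nxt))
            pos += 1

print(parse("abc"))      # ['A → ab', 'S → A', "S' → Sc"]
print(parse("aabbabc"))
print(parse("ab"))       # None
```

Reading the returned list backwards gives the rightmost derivation S' ⇒rm Sc ⇒rm Ac ⇒rm abc.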

Theorem 10.10
  • If L is L(G) for an LR(0) grammar G, then L is N(M) for a DPDA M
      • N(M) = the language accepted by empty stack (null stack)

Proof
  • Construct from G the DFA D
      • Transition function: recognizes G's viable prefixes
  • Stack symbols of M are
      • Grammar symbols of G
      • States of D
  • M has start state q and other states used to perform reductions

We know that:
  • If G is LR(0) then
      • Reductions are the only way to get the preceding right-sentential form when the state of the DFA (on the top of the stack) contains a complete item
      • When M starts on input w it will construct a rightmost derivation for w in reverse order

What we need to prove:
  • When a shift is called for and the top DFA state on the stack has only incomplete items, then there are no handles
  • (Note: if there were a handle, then some DFA state on the stack would have a complete item)

Suppose a state contains the complete item A → α·
  • Each state is put onto the top of the stack
  • The handle α would then immediately be reduced to A
  • Therefore, a complete item cannot possibly become buried on the stack

Proof Continued
  • Acceptance occurs when the top of the stack contains the start symbol
  • The start symbol, by definition of LR(0) grammars, cannot appear on the right side of a production
  • L(G) always has the prefix property if G is LR(0)

Conclusion of Proof
  • Thus, if w is in L(G), M finds the rightmost derivation of w, reduces w to S, and accepts
  • If M accepts w, then the sequence of right-sentential forms provides a derivation of w from S
  • N(M) = L(G)

Corollary of Theorem 10.10
  • Every LR(0) grammar is unambiguous
  • Why?
      • The rightmost derivation of w is unique (given the construction we provided)

LR(1) Grammars
  • LR grammar with 1 symbol of look-ahead
  • All and only deterministic CFLs have LR(1) grammars
  • Are greatly important to compiler design
      • Why?
          • Because they are broad enough to include the syntax of almost all programming languages
          • Restrictive enough to have efficient parsers (that are essentially DPDAs)

LR(1) Item
  • Consists of an LR(0) item followed by a look-ahead set consisting of terminals and/or the special symbol $
      • $ = the right end of the string
  • General form: A → α·β, {a1, a2, …, an}
  • The set of LR(1) items forms the states of an NFA that recognizes viable prefixes; the sets of valid items are obtained by converting this NFA to a DFA

A grammar is LR(1) if
  • The start symbol does not appear on the right side of any production
  • Whenever the set of items I valid for some viable prefix includes some complete item A → α·, {a1, …, an}, then
      • No ai appears immediately to the right of the dot in any item of I
      • If B → β·, {b1, …, bk} is another complete item in I, then ai ≠ bj for any 1 ≤ i ≤ n and 1 ≤ j ≤ k
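The second condition can be checked on one set of items at a time. The sketch below assumes an LR(1) item is written as a tuple `(head, body, dot, lookahead_set)`; the function name and both sample item sets are hypothetical and only illustrate the check.

```python
def lr1_consistent(item_set):
    """True if the item set satisfies the second LR(1) condition."""
    complete = [it for it in item_set if it[2] == len(it[1])]
    # Symbols appearing immediately to the right of a dot in some item of I.
    after_dot = {it[1][it[2]] for it in item_set if it[2] < len(it[1])}
    for i, (_, _, _, lookahead) in enumerate(complete):
        if lookahead & after_dot:            # some ai follows a dot in I
            return False
        if any(lookahead & other[3] for other in complete[i + 1:]):
            return False                     # ai = bj for two complete items
    return True

# S → a·, {$} together with S → a·b, {$}: fine with one symbol of look-ahead.
ok = [("S", ("a",), 1, {"$"}), ("S", ("a", "b"), 1, {"$"})]
# A → a·, {b} together with B → a·b, {$}: b is both a look-ahead and after a dot.
bad = [("A", ("a",), 1, {"b"}), ("B", ("a", "b"), 1, {"$"})]

print(lr1_consistent(ok))    # True
print(lr1_consistent(bad))   # False
```

The first sample would violate the LR(0) condition (a complete item next to an item with a terminal after the dot), which shows how the look-ahead set resolves the choice.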

Accepting an LR(1) Language:
  • Similar to the DPDA used with LR(0) grammars
  • However, it is allowed to use the next input symbol during its decision making
  • This is accomplished by appending a $ to the end of the input; the DPDA keeps the next input symbol as part of the state

LR(1) Rules for Reduce/Shift
  • If the top set of items has a complete item A → α·, {a1, a2, …, an}, where A ≠ S, reduce by A → α if the current input symbol is in {a1, a2, …, an}
  • If the top set of items has an item S → α·, {$}, then reduce by S → α and accept if the current symbol is $ (i.e., the end of the input is reached)
  • If the top set of items has an item A → α·aβ, T, and a is the current input symbol, then shift

Regarding the Rules
  • The LR(1) conditions guarantee that at most one of the rules will be applied for any input symbol or $
  • Often for practicality the information is summarized into a table
      • Rows: sets of items
      • Columns: terminals and $
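The three rules and the table layout can be sketched as follows. LR(1) items are again assumed to be tuples `(head, body, dot, lookahead_set)`; the function names and the two sample item sets `I1` and `I2` are hypothetical and do not come from a worked grammar in the slides.

```python
def action(item_set, symbol, start="S"):
    """Pick the move for the top set of items and the current input symbol."""
    for head, body, dot, lookahead in item_set:
        if dot == len(body) and symbol in lookahead:
            if head == start:
                return "accept"                          # rule 2: S → α·, {$}
            return f"reduce {head} → {''.join(body)}"    # rule 1
    if any(dot < len(body) and body[dot] == symbol
           for _, body, dot, _ in item_set):
        return "shift"                                   # rule 3
    return "error"

def build_table(item_sets, terminals):
    """Rows: sets of items. Columns: terminals and $."""
    columns = list(terminals) + ["$"]
    return {name: {a: action(items, a) for a in columns}
            for name, items in item_sets.items()}

item_sets = {
    "I1": [("A", ("a",), 1, {"$"}), ("A", ("a", "b"), 1, {"$"})],
    "I2": [("S", ("A",), 1, {"$"})],
}
table = build_table(item_sets, ["a", "b"])
for name, row in table.items():
    print(name, row)
```

Because an LR(1) item set admits at most one applicable rule per symbol, returning the first match is enough here; on an inconsistent item set the function would silently pick one of the conflicting moves.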