Chomsky Normal Form (CNF) of CFGs: Purpose, Definition, Method of Construction
- Slides: 44
CNF: Purpose
• A construct used to establish properties of context-free languages (CFLs).
• Every CFL without ε can be generated by a CFG in Chomsky normal form.
• To show that a language without ε is a CFL, it is sufficient to show that it has a CFG in Chomsky normal form.
• Typical approach to closure properties.
CNF: Definition
A context-free grammar (CFG) in which all productions are of the form A -> BC or A -> a, where A, B, and C are variables and a is a terminal.
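The definition can be turned into a direct check. The sketch below is only an illustration: the dictionary encoding of a grammar (variable mapped to a list of right-hand-side tuples, with every non-key symbol treated as a terminal) and the name `is_cnf` are assumptions, not part of the slides.

```python
def is_cnf(grammar):
    """Return True if every production is of the form A -> BC or A -> a.

    Assumed encoding: `grammar` maps each variable to a list of right-hand
    sides, each a tuple of symbols. The dict keys are the variables and every
    other symbol is a terminal.
    """
    variables = set(grammar)
    for bodies in grammar.values():
        for body in bodies:
            two_variables = len(body) == 2 and all(s in variables for s in body)
            one_terminal = len(body) == 1 and body[0] not in variables
            if not (two_variables or one_terminal):
                return False
    return True


cnf_grammar = {"S": [("A", "B")], "A": [("a",)], "B": [("b",)]}
not_cnf = {"S": [("A", "b")], "A": [("a",)]}  # S -> Ab mixes a variable and a terminal
print(is_cnf(cnf_grammar), is_cnf(not_cnf))  # True False
```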
CNF construction: 3 elimination tasks
• Eliminate “useless” symbols: variables or terminals that do not appear in any derivation of a terminal string from the start symbol.
• Eliminate ε-productions: A -> ε.
• Eliminate unit productions: A -> B for variables A and B.
Generating and Reachable Symbols
• X is generating if X =>* w for some terminal string w.
• If X is a terminal, then it generates itself in zero steps.
• X is reachable if S =>* αXβ for some strings α and β, where S is the start symbol.
• Any symbol that is not both generating and reachable is useless.
Induction to find generating variables
• Basis: If there is a production A -> w, where w is a terminal string, then A is generating.
• Induction: If there is a production A -> α, where α consists only of terminals and variables known to derive a terminal string, then A derives a terminal string and hence is generating.
Algorithm to eliminate nongenerating variables
1. Discover all variables that derive terminal strings.
2. For all other variables, remove all productions in which they appear on either the LHS or the RHS of ->.
Exercise 7.1.1 (text p. 275)
S -> AB | CA
A -> a
B -> BC | AB
C -> aB | b
Eliminate non-generating variables. Do on board.
Exercise 7.1.1 (text p. 275): solution
• S -> AB | CA: S is generating because C and A are generating (via S -> CA).
• A -> a: A is generating.
• B -> BC | AB: B is non-generating. Every production for B has B on its right side, so B can never be used to generate a terminal string.
• C -> aB | b: C is generating (via C -> b).
Remove all productions that involve B on either side of ->.
New CFG with only useful variables: S -> CA, A -> a, C -> b
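The basis and induction for generating symbols amount to a fixed-point loop. The following is a minimal sketch under an assumed encoding (a dict from variable to a list of right-hand-side tuples; the function names are hypothetical), applied to the grammar of Exercise 7.1.1.

```python
def generating_symbols(grammar):
    """Return all symbols that derive some terminal string."""
    variables = set(grammar)
    terminals = {s for bodies in grammar.values() for body in bodies for s in body}
    terminals -= variables
    generating = set(terminals)  # basis: every terminal generates itself
    changed = True
    while changed:  # induction: repeat until no new variable is found
        changed = False
        for head, bodies in grammar.items():
            if head in generating:
                continue
            if any(all(s in generating for s in body) for body in bodies):
                generating.add(head)
                changed = True
    return generating


def remove_nongenerating(grammar):
    """Drop every production that mentions a non-generating variable."""
    gen = generating_symbols(grammar)
    return {
        head: [body for body in bodies if all(s in gen for s in body)]
        for head, bodies in grammar.items()
        if head in gen
    }


exercise = {
    "S": [("A", "B"), ("C", "A")],
    "A": [("a",)],
    "B": [("B", "C"), ("A", "B")],
    "C": [("a", "B"), ("b",)],
}
print(remove_nongenerating(exercise))
# {'S': [('C', 'A')], 'A': [('a',)], 'C': [('b',)]}
```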
Eliminating non-generating variables may lead to unreachable variables
Example: S -> AB | C, A -> aA | a, B -> bB, C -> c
• A and C are generating. Why?
• S is generating. Why?
• B is not generating. Why?
• What remains after eliminating B?
Eliminating non-generating variables may lead to unreachable variables
Example: S -> AB | C, A -> aA | a, B -> bB, C -> c
• A and C are generating: A -> a and C -> c.
• S is generating: S -> C.
• B is not generating: its only production, B -> bB, always reintroduces B, so B cannot be used to generate a terminal string.
• What remains after eliminating productions with B? S -> C, C -> c, and A -> aA | a. A is now unreachable from S.
Finding reachable symbols
• Basis: The start symbol is reachable.
• Induction: If we can reach A, and there is a production A -> α, then we can reach all the symbols in α.
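The reachability induction is a graph search from the start symbol. Here is a small sketch, assuming the same hypothetical dict encoding (variable mapped to a list of right-hand-side tuples), run on the grammar that remains after B is eliminated in the preceding example.

```python
def reachable_symbols(grammar, start):
    """Return every symbol reachable from `start` through productions."""
    reachable = {start}  # basis: the start symbol is reachable
    frontier = [start]
    while frontier:
        head = frontier.pop()
        # Terminals have no productions, so .get() returns an empty list.
        for body in grammar.get(head, []):
            for symbol in body:
                if symbol not in reachable:  # induction step
                    reachable.add(symbol)
                    frontier.append(symbol)
    return reachable


after_removing_b = {
    "S": [("C",)],
    "A": [("a", "A"), ("a",)],
    "C": [("c",)],
}
print(sorted(reachable_symbols(after_removing_b, "S")))  # ['C', 'S', 'c']
```

A and a do not appear in the result, which matches the observation that A becomes unreachable.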
Epsilon Productions
Theorem: If L is a CFL that does not contain the empty string, then it has a CFG with no ε-productions that can be put in CNF.
A -> ε is clearly an ε-production. To eliminate ε-productions, we must first discover the nullable variables, i.e., variables B such that B =>* ε.
Inductive definition of nullable symbols
• Basis: If there is a production A -> ε, then A is nullable.
• Induction: If there is a production A -> α, and all symbols in α are nullable, then A is nullable.
Example: Nullable Symbols
S -> AB, A -> aA | ε, B -> bB | A
• A is nullable because of A -> ε.
• B is nullable because of B -> A.
• S is nullable because of S -> AB.
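The inductive definition of nullable symbols can be sketched as another fixed-point loop. The encoding is an assumption: a dict from variable to a list of right-hand-side tuples, where the empty tuple stands for an ε-production.

```python
def nullable_variables(grammar):
    """Return the set of variables A with A =>* ε."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # An empty body (A -> ε) satisfies all() trivially: the basis case.
            if any(all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


example = {
    "S": [("A", "B")],
    "A": [("a", "A"), ()],
    "B": [("b", "B"), ("A",)],
}
print(sorted(nullable_variables(example)))  # ['A', 'B', 'S']
```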
Algorithm to eliminate ε-productions
1. Identify all nullable symbols.
2. Consider each production A -> X1…Xn that contains nullable symbols.
3. If m < n of the symbols X1…Xn are nullable, construct a family of 2^m productions covering all combinations of the nullable symbols present or absent.
4. If m = n, exclude the case with all symbols absent, leaving 2^m - 1 productions.
Eliminating ε-productions
The new CFG with no ε-productions consists of:
• all families of productions derived from productions with nullable symbols, plus
• all productions from the original CFG that did not contain nullable symbols.
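The family construction can be sketched with `itertools.product`, letting each nullable symbol be either present or absent. This is an illustrative sketch only; the dict encoding (empty tuple for ε) and function names are assumptions. It is run on the earlier grammar S -> AB, A -> aA | ε, B -> bB | A.

```python
from itertools import product


def nullable_variables(grammar):
    """Return the set of variables A with A =>* ε."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in nullable and any(
                all(s in nullable for s in body) for body in bodies
            ):
                nullable.add(head)
                changed = True
    return nullable


def eliminate_epsilon(grammar):
    """Replace each production by its family of present/absent variants."""
    nullable = nullable_variables(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for body in bodies:
            # A nullable symbol may be kept or dropped; other symbols are kept.
            options = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for choice in product(*options):
                candidate = tuple(sym for part in choice for sym in part)
                # Skip the all-absent case and avoid duplicates.
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result


example = {
    "S": [("A", "B")],
    "A": [("a", "A"), ()],
    "B": [("b", "B"), ("A",)],
}
for head, bodies in eliminate_epsilon(example).items():
    print(head, "->", " | ".join("".join(body) for body in bodies))
# S -> AB | A | B
# A -> aA | a
# B -> bB | b | A
```

The result still contains unit productions such as S -> A, which is why unit-production elimination comes after this step.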
Example: Eliminating ε-Productions
S -> ABC, A -> aA | ε, B -> bB | ε, C -> ε
• Which variables are nullable and why?
• What family of productions comes from S -> ABC?
• What family comes from A -> aA?
• What family comes from B -> bB?
Do on board.
Example: Eliminating ε-Productions
S -> ABC, A -> aA | ε, B -> bB | ε, C -> ε
• A, B, C, and S are all nullable.
• Productions S -> ABC | AB | AC | BC | A | B | C come from S -> ABC.
• Productions A -> aA | a come from A -> aA.
• Productions B -> bB | b come from B -> bB.
Example: Eliminating ε-Productions
S -> ABC, A -> aA | ε, B -> bB | ε, C -> ε
• Any productions left over from the original CFG? Yes: A -> ε, B -> ε, C -> ε. Remove these.
S -> ABC | AB | AC | BC | A | B | C
A -> aA | a
B -> bB | b
• What is the effect of eliminating C -> ε?
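Eliminating ε-Productions continued
• C is not generating: its only production, C -> ε, has been removed.
• Eliminate every production containing C from the new CFG S -> ABC | AB | AC | BC | A | B | C, A -> aA | a, B -> bB | b.
What remains:
S -> AB | A | B
A -> aA | a
B -> bB | b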
Define Unit Productions
• A unit production is a production whose right side consists of exactly one variable.
• A -> a is not a unit production because a is a terminal.
• Eliminating unit productions by expansion is the most common approach.
Eliminate by expansion
In the CFG defined by
E -> T | E+T
T -> F | T*F
F -> I | (E)
I -> a | Ia
What are the unit productions?
Eliminate by expansion
In the CFG defined by
E -> T | E+T
T -> F | T*F
F -> I | (E)
I -> a | Ia
In a sequence of unit productions, elimination by expansion starts at the bottom. Do on board.
Eliminate by expansion
In the CFG defined by
E -> T | E+T
T -> F | T*F
F -> I | (E)
I -> a | Ia
• Keep I -> a | Ia
• Expand F -> I | (E): F -> a | Ia | (E)
• Expand T -> F | T*F: T -> a | Ia | (E) | T*F
• Expand E -> T | E+T: E -> a | Ia | (E) | T*F | E+T
Cleaning up a CFG
Theorem: If L is a CFL, then there is a CFG for L – {ε} that has:
• No ε-productions.
• No unit productions.
• No useless symbols.
Proof by construction
Start with a CFG for L. Perform the following steps in order:
1. Eliminate ε-productions. (Must be first because it can create unit productions and useless variables.)
2. Eliminate unit productions.
3. Eliminate variables that derive no terminal strings.
4. Eliminate variables not reachable from the start symbol.
Chomsky Normal Form
In addition to being cleaned up, a CFG is said to be in Chomsky Normal Form if every production is of one of two forms:
• A -> BC (right side is two variables).
• A -> a (right side is a single terminal).
Theorem: If L is a CFL, then L – {ε} has a CFG in CNF.
Proof by construction
• Step 1: “Clean” the CFG, so every production has a right side that is either a single terminal or a string of terminals and variables of length at least 2.
• Step 2: For each right side that is not a single terminal, make the right side all variables. If terminal a prevents the RHS from being all variables, create a new variable Aa and production Aa -> a, then replace a by Aa in the right sides of length 2 or more.
Example: Step 2
• Consider the production A -> BcDe.
• We need variables Ac and Ae, with productions Ac -> c and Ae -> e.
• Replace A -> BcDe by A -> BAcDAe.
• If c and/or e occur in other productions, replace them by Ac and/or Ae.
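Step 2 can be sketched as a single pass over the productions. The sketch assumes a cleaned grammar in a hypothetical dict encoding (variable mapped to a list of right-hand-side tuples), and it assumes the generated names such as `Ac` do not clash with existing variables. The productions for B and D are made up so that the example is a complete grammar.

```python
def lift_terminals(grammar):
    """Make every right side of length >= 2 consist of variables only."""
    variables = set(grammar)
    result = {}
    new_rules = {}
    for head, bodies in grammar.items():
        result[head] = []
        for body in bodies:
            if len(body) == 1:  # a single terminal in a cleaned grammar: keep
                result[head].append(body)
                continue
            new_body = []
            for symbol in body:
                if symbol in variables:
                    new_body.append(symbol)
                else:
                    name = "A" + symbol  # e.g. Ac for terminal c (assumed unused)
                    new_rules[name] = [(symbol,)]
                    new_body.append(name)
            result[head].append(tuple(new_body))
    result.update(new_rules)
    return result


# A -> BcDe from the slide; B -> b and D -> d are illustrative additions.
step2_example = {"A": [("B", "c", "D", "e")], "B": [("b",)], "D": [("d",)]}
print(lift_terminals(step2_example))
# {'A': [('B', 'Ac', 'D', 'Ae')], 'B': [('b',)], 'D': [('d',)],
#  'Ac': [('c',)], 'Ae': [('e',)]}
```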
Clean to CNF
• Step 2: For each right side that is not a single terminal, make the right side all variables.
• Step 3: Break right sides longer than 2 into a chain of productions with right sides of two variables, using a “cascade of productions” (text p. 273).
• Do not combine steps 2 and 3. Show all strings of variables before applying cascading productions.
• Example of cascading productions: A -> B1B2B3B4 is replaced by A -> B1C1, C1 -> B2C2, and C2 -> B3B4.
Cascade of productions is required
• There are many ways to get right sides with 2 variables; the “cascade of productions” gives a unique result.
• Note that in the previous example, where A -> B1B2B3B4 is replaced by A -> B1C1, C1 -> B2C2, and C2 -> B3B4, the first variables on the RHS of the new productions appear in the same order as in the original production.
• Example: A -> B1B2B3B4B5 is replaced by? Do on board.
Example: A -> B1B2B3B4B5
A -> B1C1
C1 -> B2C2
C2 -> B3C3
C3 -> B4B5
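The cascade can be sketched as a short loop that peels off one variable at a time. The function name and the `fresh_prefix` parameter are hypothetical, and the fresh names C1, C2, … are assumed not to be in use already (a full conversion would need distinct fresh names for each long production).

```python
def cascade(head, body, fresh_prefix="C"):
    """Break head -> B1 B2 ... Bk (k >= 2) into productions with two variables."""
    rules = []
    current = head
    # Every symbol except the last two starts a new link in the chain.
    for i, symbol in enumerate(body[:-2], start=1):
        fresh = f"{fresh_prefix}{i}"
        rules.append((current, (symbol, fresh)))
        current = fresh
    rules.append((current, tuple(body[-2:])))  # the last link keeps two originals
    return rules


for lhs, rhs in cascade("A", ("B1", "B2", "B3", "B4", "B5")):
    print(lhs, "->", " ".join(rhs))
# A -> B1 C1
# C1 -> B2 C2
# C2 -> B3 C3
# C3 -> B4 B5
```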
Assignment 13
Exercise 7.1.2 (text pp. 275 and 277)
S -> ASB | ε
A -> aAS | a
B -> SbS | A | bb
Clean and convert to CNF.
Example: Ex 7.8 (text p. 266)
Clean the following CFG:
S -> AB
A -> aAA | ε
B -> bBB | ε
Perform the following steps in order:
1. Eliminate ε-productions.
2. Eliminate unit productions.
3. Eliminate variables that derive no terminal strings.
4. Eliminate variables not reachable from the start symbol.
Do on board.
Convert cleaned version of CFG to CNF
• For each right side that is not a single terminal, make the right side all variables.
• Break right sides longer than 2 into a chain of productions with right sides of two variables, using a “cascade of productions”.
Do on board.
Sometimes elimination of unit productions by expansion does not work
• It will not work on cycles of unit productions such as A -> B, B -> C, and C -> A.
• Alternative: find all pairs (A, B) such that A =>* B by a sequence of unit productions.
• If B -> α is a non-unit production, then add the production A -> α and drop all the unit productions in the sequence A =>* B (i.e., A derives α directly instead of through B via unit productions).
Pair search defined by induction
Find all pairs (A, B) such that A =>* B by a sequence of unit productions only.
• Basis: A =>* A, therefore (A, A) is selected.
• Induction: If we have found (A, B), and B -> C is a unit production, then add (A, C) to the pair list.
Example of pair search for CFG
E -> T | E+T
T -> F | T*F
F -> I | (E)
I -> a | Ia
• (E, E), (T, T), (F, F), (I, I): basis
• (E, T): from E -> T
• (T, F): from T -> F
• (F, I): from F -> I
• (E, F): from (E, T) and T -> F
• (T, I): from (T, F) and F -> I
• (E, I): from (E, F) and F -> I
Associate each pair with a non-unit production.
Combine pair search with non-unit productions
• (E, E): E -> E+T
• (E, T): E -> T*F
• (E, F): E -> (E)
• (E, I): E -> a | Ia
• (T, T): T -> T*F
• (T, F): T -> (E)
• (T, I): T -> a | Ia
• (F, F): F -> (E)
• (F, I): F -> a | Ia
• (I, I): I -> a | Ia
Original CFG with unit productions:
E -> T | E+T
T -> F | T*F
F -> I | (E)
I -> a | Ia
New CFG with no unit productions (same as by expansion):
E -> E+T | T*F | (E) | a | Ia
T -> T*F | (E) | a | Ia
F -> (E) | a | Ia
I -> a | Ia
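The pair search and the combination step can be sketched as two small functions. The dict encoding (variable mapped to a list of right-hand-side tuples) and the function names are assumptions for illustration. The second grammar is a made-up cycle A -> B, B -> C, C -> A | c, included to show that the pair method also handles cycles of unit productions.

```python
def unit_pairs(grammar):
    """Return all pairs (A, B) with A =>* B using unit productions only."""
    variables = set(grammar)
    pairs = {(a, a) for a in variables}  # basis
    changed = True
    while changed:
        changed = False
        for a, b in list(pairs):
            for body in grammar[b]:
                is_unit = len(body) == 1 and body[0] in variables
                if is_unit and (a, body[0]) not in pairs:  # induction
                    pairs.add((a, body[0]))
                    changed = True
    return pairs


def eliminate_unit_productions(grammar):
    """For each pair (A, B), give A every non-unit production of B."""
    variables = set(grammar)
    result = {a: [] for a in grammar}
    for a, b in sorted(unit_pairs(grammar)):
        for body in grammar[b]:
            is_unit = len(body) == 1 and body[0] in variables
            if not is_unit and body not in result[a]:
                result[a].append(body)
    return result


expr = {
    "E": [("T",), ("E", "+", "T")],
    "T": [("F",), ("T", "*", "F")],
    "F": [("I",), ("(", "E", ")")],
    "I": [("a",), ("I", "a")],
}
cycle = {"A": [("B",)], "B": [("C",)], "C": [("A",), ("c",)]}

for head, bodies in eliminate_unit_productions(expr).items():
    print(head, "->", " | ".join("".join(body) for body in bodies))
print(eliminate_unit_productions(cycle))
```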
Quiz 5: Wednesday 11/10/21