Normal Forms: Chomsky Normal Form, Greibach Normal Form
cs 466(Prasad) L 8 Norm 1
• Language-preserving transformations
• Improve parsing efficiency (shorter derivations)
• Prove properties about languages and derivations
Elimination of λ-rules
Reduces the length of the derivation
• Aim: Restrict the grammar such that
• Approach:
– Introduce S’
– Add rules to capture the effect of the λ-rules to be deleted.
– (Ensures non-contracting rules)
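The "add rules to capture the effect of the deleted λ-rules" step can be sketched as follows. This is a minimal illustration, not the slides' exact construction. It assumes a grammar stored as a dict from each variable to a list of right-hand sides (tuples of symbols, with `()` standing for λ) and a start symbol that does not occur on any right-hand side. The function names and the example grammar are hypothetical.

```python
from itertools import product

def nullable(rules):
    """Variables that derive the empty string (bottom-up fixpoint)."""
    null = set()
    changed = True
    while changed:
        changed = False
        for a, rhss in rules.items():
            if a not in null and any(all(s in null for s in rhs) for rhs in rhss):
                null.add(a)
                changed = True
    return null

def remove_lambda_rules(rules, start):
    """Drop lambda-rules, adding variants that omit nullable occurrences.

    Assumption: `start` is non-recursive (it plays the role of S').
    """
    null = nullable(rules)
    new_rules = {}
    for a, rhss in rules.items():
        variants = set()
        for rhs in rhss:
            # A nullable symbol may be kept or dropped; any other symbol is kept.
            choices = [((s,), ()) if s in null else ((s,),) for s in rhs]
            for pick in product(*choices):
                w = tuple(sym for part in pick for sym in part)
                if w:  # never emit a new lambda-rule here
                    variants.add(w)
        new_rules[a] = sorted(variants)
    if start in null:
        new_rules[start].append(())  # the only lambda-rule that may remain
    return new_rules

# Hypothetical example: S -> AB, A -> aA | lambda, B -> bB | lambda
example = {
    "S": [("A", "B")],
    "A": [("a", "A"), ()],
    "B": [("b", "B"), ()],
}
result = remove_lambda_rules(example, "S")
```

The result can still contain chain rules such as S → A, which is why chain-rule elimination follows this step.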
Example
Determination of nullable non-terminals
Bottom-up flow of information
Algorithm: Nullable Nonterminals
NULL := {A | A → λ ∈ P};
repeat
    PREV := NULL;
    foreach A ∈ V do
        if there is an A-rule A → w and w ∈ PREV* then
            NULL := NULL ∪ {A}
until NULL = PREV;
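The algorithm above translates almost line for line into Python. The sketch below assumes a grammar stored as a dict from each variable to a list of right-hand sides, where each right-hand side is a tuple of symbols and `()` stands for λ. The representation and the example grammar are assumptions made for illustration.

```python
def nullable_variables(rules):
    """Return the set of variables that derive the empty string."""
    # NULL := {A | A -> lambda in P}
    null = {a for a, rhss in rules.items() if () in rhss}
    while True:
        prev = set(null)
        for a, rhss in rules.items():
            # A joins NULL if some right-hand side w is in PREV*,
            # i.e. every symbol of w is already known to be nullable.
            if any(all(sym in prev for sym in rhs) for rhs in rhss):
                null.add(a)
        if null == prev:
            return null

# Hypothetical example: S -> ACA, A -> aAa | B | C, B -> bB | b, C -> cC | lambda
example = {
    "S": [("A", "C", "A")],
    "A": [("a", "A", "a"), ("B",), ("C",)],
    "B": [("b", "B"), ("b",)],
    "C": [("c", "C"), ()],
}
null_set = nullable_variables(example)
```

Each pass either adds at least one variable or stops, so the loop runs at most |V| + 1 times.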
Proof of correctness
• Soundness
– If A ∈ NULL(final) then A ⇒* λ.
– Induction on the number of iterations of the loop.
• Completeness
– If A ⇒* λ then A ∈ NULL(final).
– Induction on the length of the minimal derivation of the null string from a non-terminal.
• Termination
– Bounded by the number of non-terminals.
Elimination of chain rules
Removing renaming rules: redundant procedure calls.
Top-down flow of information
Construction of Chain(A)
Chain(A) := {A};
PREV := ∅;
repeat
    NEW := Chain(A) - PREV;
    PREV := Chain(A);
    foreach B ∈ NEW do
        if there is a rule B → C then
            Chain(A) := Chain(A) ∪ {C}
until Chain(A) = PREV;
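A small Python sketch of this construction, followed by one common way to use Chain(A) to drop the chain rules: give A every non-chain rule of every variable in Chain(A). It assumes a grammar stored as a dict from variables to lists of right-hand-side tuples, with the dict keys taken as the variable set. The function names and the expression grammar are illustrative assumptions.

```python
def chain(rules, a):
    """Variables derivable from `a` using only chain rules B -> C."""
    result = {a}
    prev = set()
    while result != prev:
        new = result - prev          # NEW := Chain(A) - PREV
        prev = set(result)           # PREV := Chain(A)
        for b in new:
            for rhs in rules.get(b, []):
                # A chain rule has a single variable on its right-hand side.
                if len(rhs) == 1 and rhs[0] in rules:
                    result.add(rhs[0])
    return result

def remove_chain_rules(rules):
    """Give each A the non-chain rules of every variable in Chain(A)."""
    new_rules = {}
    for a in rules:
        rhss = []
        for b in sorted(chain(rules, a)):
            for rhs in rules[b]:
                is_chain = len(rhs) == 1 and rhs[0] in rules
                if not is_chain and rhs not in rhss:
                    rhss.append(rhs)
        new_rules[a] = rhss
    return new_rules

# Hypothetical example: E -> E + T | T, T -> T * F | F, F -> ( E ) | a
example = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("a",)],
}
no_chains = remove_chain_rules(example)
```

Only the variables added in the previous pass are examined in each iteration, which is what the NEW set in the pseudocode is for.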
Examples
Elimination of useless symbols
• A variable is useful if it occurs in a derivation that begins with the start symbol and generates a terminal string.
– Reachable from S
– Derives a terminal string
• Construction of the set of variables that derive a terminal string.
– Bottom-up flow of information
– Similar to the computation of nullable variables.
• Construction of the set of variables that are reachable.
– Top-down flow of information
– Similar to the computation of chained variables.
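The two constructions can be sketched in Python as below. The order of the two passes matters: variables that do not derive a terminal string are removed first, and reachability is computed afterwards on the pruned grammar. The grammar representation (dict from variables to lists of right-hand-side tuples), the function names and the example are assumptions made for illustration.

```python
def generating(rules, terminals):
    """Variables that derive some terminal string (bottom-up)."""
    term = set()
    changed = True
    while changed:
        changed = False
        for a, rhss in rules.items():
            if a in term:
                continue
            if any(all(s in terminals or s in term for s in rhs) for rhs in rhss):
                term.add(a)
                changed = True
    return term

def reachable(rules, start):
    """Variables reachable from `start` (top-down)."""
    reach = {start}
    frontier = [start]
    while frontier:
        a = frontier.pop()
        for rhs in rules.get(a, []):
            for s in rhs:
                if s in rules and s not in reach:
                    reach.add(s)
                    frontier.append(s)
    return reach

def remove_useless(rules, terminals, start):
    term = generating(rules, terminals)
    # Pass 1: drop variables (and rules mentioning them) that derive no terminal string.
    pruned = {
        a: [rhs for rhs in rhss if all(s in terminals or s in term for s in rhs)]
        for a, rhss in rules.items() if a in term
    }
    if start not in pruned:
        return {}  # the language is empty: no productions survive
    # Pass 2: drop variables that are unreachable in the pruned grammar.
    reach = reachable(pruned, start)
    return {a: rhss for a, rhss in pruned.items() if a in reach}

# Hypothetical example: S -> AB | a, A -> b, B -> bB
example = {"S": [("A", "B"), ("a",)], "A": [("b",)], "B": [("b", "B")]}
cleaned = remove_useless(example, {"a", "b"}, "S")
```

In this example B derives no terminal string, and once S → AB is gone, A becomes unreachable. Running the passes in the opposite order would leave A in the grammar.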
Examples
• B does not derive a terminal string; C unreachable.
• A unreachable.
• Empty set of productions (“non-termination”)
Chomsky Normal Form
• A CFG is in Chomsky Normal Form if each rule is of the form: A → BC, A → a, or S → λ, where B, C ∈ V − {S} and a ∈ Σ.
• Theorem: There is an algorithm to construct a grammar G’ in CNF that is equivalent to a CFG G.
Construction
• Obtain an equivalent grammar that does not contain λ-rules, chain rules, or useless variables.
• Apply the following conversion to rules of the form:
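One standard way to carry out such a conversion is sketched below. It may differ in detail from the rules on the original slide. Every right-hand side of length two or more first has its terminals replaced by fresh variables, and is then split into a chain of two-variable rules. The sketch assumes the grammar is already free of λ-rules (except possibly S → λ), chain rules and useless variables. The fresh names `X_a` and `T1, T2, ...` are hypothetical and assumed not to clash with existing variables.

```python
def to_cnf(rules, terminals):
    """Convert rules A -> w with |w| >= 2 into the forms A -> BC and A -> a."""
    new_rules = {a: [] for a in rules}
    term_var = {}   # terminal -> fresh variable standing for it
    counter = 0

    def var_for(t):
        # Introduce X_t -> t the first time terminal t needs a variable.
        if t not in term_var:
            term_var[t] = f"X_{t}"
            new_rules[term_var[t]] = [(t,)]
        return term_var[t]

    for a, rhss in rules.items():
        for rhs in rhss:
            if len(rhs) <= 1:
                new_rules[a].append(rhs)   # A -> a or S -> lambda: already fine
                continue
            body = [var_for(s) if s in terminals else s for s in rhs]
            head = a
            # A -> B1 B2 ... Bn becomes A -> B1 T1, T1 -> B2 T2, ...
            while len(body) > 2:
                counter += 1
                fresh = f"T{counter}"
                new_rules.setdefault(head, []).append((body[0], fresh))
                head, body = fresh, body[1:]
            new_rules.setdefault(head, []).append(tuple(body))
    return new_rules

# Hypothetical example: S -> aSb | ab
example = {"S": [("a", "S", "b"), ("a", "b")]}
cnf = to_cnf(example, {"a", "b"})
```

For the example this yields S → X_a T1 | X_a X_b, T1 → S X_b, X_a → a, X_b → b. Note that S still occurs on a right-hand side here; a definition of CNF that forbids this needs the new start symbol S’ introduced earlier.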
Significance of CNF
• Length of the derivation of a string of length n (n ≥ 1) in CNF = 2n − 1: n − 1 applications of rules A → BC and n applications of rules A → a. (Cf. number of nodes of a strictly binary tree with n leaves)
• Maximum depth of a parse tree = n
• Minimum depth of a parse tree = ⌈log₂ n⌉ + 1
Removal of direct left recursion
• Causes infinite loops in top-down (depth-first) parsers.
• Approach: Generate the string from left to right.
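A common λ-free way to remove direct left recursion replaces A → Aα | β with A → β | βZ and Z → α | αZ, so that the β prefix is generated first and the α parts follow from left to right. The Python sketch below illustrates that scheme. It assumes there is no rule A → A and no λ-rule for A. The function name, the fresh variable name `Z` and the example grammar are hypothetical.

```python
def remove_direct_left_recursion(rules, a, fresh):
    """Rewrite the rules of `a` so that none of them starts with `a`.

    A -> A alpha | beta   becomes   A -> beta | beta Z,  Z -> alpha | alpha Z
    where Z is the fresh variable named by `fresh`.
    """
    alphas = [rhs[1:] for rhs in rules[a] if rhs and rhs[0] == a]
    betas = [rhs for rhs in rules[a] if not rhs or rhs[0] != a]
    if not alphas:
        return dict(rules)   # nothing to do: no directly left-recursive rule
    new_rules = dict(rules)
    new_rules[a] = betas + [b + (fresh,) for b in betas]
    new_rules[fresh] = alphas + [al + (fresh,) for al in alphas]
    return new_rules

# Hypothetical example: E -> E + T | T, T -> a
example = {"E": [("E", "+", "T"), ("T",)], "T": [("a",)]}
fixed = remove_direct_left_recursion(example, "E", "Z")
```

The rewritten grammar generates the same strings, but every E-rule now begins with T, so a top-down parser consumes input before it can re-enter E.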
Note that the absence of direct left recursion does not imply the absence of left recursion.
(Cf. Gaussian elimination)
Greibach Normal Form
(* Constructs terminal prefixes that facilitate discovery of dead-ends *)
• A CFG is in Greibach Normal Form if each rule is of the form: A → a A₁ A₂ … Aₙ (n ≥ 0) or S → λ, where a ∈ Σ and each Aᵢ ∈ V − {S}.
• Theorem: There is an algorithm to construct a grammar G’ in GNF that is equivalent to a CFG G.
Analogy: solving linear simultaneous equations
What are the values of x, y, and z?
(Solving for z and then back-substituting.)
Example: conversion to GNF
• Eliminating left recursion
• Introducing terminals as first element on RHS
• The equivalent GNF grammar can be large compared to the original grammar.
– The example CFG has 5 rules, but the corresponding GNF grammar has 24 rules.
• Length of the derivation in GNF = length of the string.
• GNF is useful in relating CFGs (“generators”) to pushdown automata (“recognizers”/“acceptors”).