Membership problem: CYK Algorithm
Project presentation, CS 5800, Spring 2013
Professor: Dr. Elise de Doncker
Presented by: Savitha Parur Venkitachalam
Membership problem
• Determine whether a given string is a member of the language defined by a context-free grammar.
• Given: a context-free grammar G and a string w
• G = (V, Σ, P, S) where
  • V: finite set of variables
  • Σ (the alphabet): finite set of terminal symbols
  • P: finite set of rules
  • S: start symbol (distinguished element of V)
• V and Σ are assumed to be disjoint
• Question: is w in the language of G?
CYK Algorithm
• Developed by J. Cocke, D. Younger, and T. Kasami to answer the membership problem
• The input grammar must be in Chomsky normal form, where every rule has one of the forms
  • A → BC
  • A → a
  • S → λ
  with B, C ∈ V − {S}
• Uses bottom-up parsing
• Uses dynamic programming (a table-filling algorithm)
• Complexity: $O(n^3)$, where n is the length of the input string
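The cubic bound can be checked by counting the work done by the table-filling procedure. As a sketch, assume one unit of work per (cell, split point) pair and treat the grammar size as a constant: a substring of length $\ell$ has $n-\ell+1$ possible start positions and $\ell-1$ split points, so

\[
\sum_{\ell=2}^{n} (n-\ell+1)(\ell-1) \;=\; \sum_{m=1}^{n-1} (n-m)\,m \;=\; \frac{n(n-1)(n+1)}{6} \;=\; \frac{n^{3}-n}{6} \;=\; O(n^{3}).
\]

For n = 4 this gives 3 + 4 + 3 = 10 combinations. If the grammar size is not treated as a constant, each combination also costs a lookup over the rules, which adds a factor that depends on the grammar.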
CYK basic ideas
• CYK works on two basic ideas
1. Consider the rules that derive substrings of each length from 1 to N, shortest first.
Let the string to search be abca:
• Length 1: a, b, c, a
• Length 2: ab, bc, ca
• Length 3: abc, bca
• Length 4: abca
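The order in which substrings are visited can be sketched in a few lines of Python. The function name `substrings_by_length` is made up for this illustration and is not part of the presentation.

```python
def substrings_by_length(w):
    """Group all contiguous substrings of w by length, shortest first."""
    n = len(w)
    return {
        length: [w[i:i + length] for i in range(n - length + 1)]
        for length in range(1, n + 1)
    }


# Print the substrings of the example string, one length per line.
for length, subs in substrings_by_length("abca").items():
    print(length, subs)
```

A string of length N has N − L + 1 substrings of length L, so the lists shrink by one at each step.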
CYK basic ideas
2. Longer substrings can be parsed by combining the parses of shorter ones.
• Example: abc can be split as a.bc or ab.c. If we know the rules that form a and bc (or ab and c), then we know the rules that form abc.
• In general, a substring can be split as
$S_{i,j} = (S_{i,i}, S_{i+1,j}), (S_{i,i+1}, S_{i+2,j}), \dots, (S_{i,j-1}, S_{j,j})$
where i is the start index and j is the end index.
• Example: in the string abcd, the substring bcd is
$S_{2,4} = (S_{2,2}, S_{3,4}), (S_{2,3}, S_{4,4})$ = (b.cd), (bc.d)
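The split points of a substring can be enumerated directly. This is a minimal sketch using the same 1-based, inclusive indices as the notation above; the helper name `splits` is hypothetical.

```python
def splits(w, i, j):
    """Return every way to cut w[i..j] (1-based, inclusive) into two non-empty parts.

    The cut at k pairs the substring i..k with the substring k+1..j.
    """
    return [(w[i - 1:k], w[k:j]) for k in range(i, j)]


print(splits("abcd", 2, 4))  # the two splits of bcd
print(splits("abc", 1, 3))   # the two splits of abc
```

A substring of length L has L − 1 split points, which is where the innermost loop of the algorithm comes from.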
CYK table filling
• $W_{i,j} = (W_{i,i}, W_{i+1,j}), (W_{i,i+1}, W_{i+2,j}), \dots, (W_{i,j-1}, W_{j,j})$
• Fill each cell of the table with the variables whose rules derive the corresponding substring
• If the top cell (the one for the whole string) contains the start symbol, then the string is a member of the language

Table layout for a string of length 4:
$W_{1,4}$
$W_{1,3}$   $W_{2,4}$
$W_{1,2}$   $W_{2,3}$   $W_{3,4}$
$W_{1,1}$   $W_{2,2}$   $W_{3,3}$   $W_{4,4}$
$w_1$   $w_2$   $w_3$   $w_4$
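The table-filling procedure can be sketched as follows. The representation is an assumption made for this example: `unit_rules` maps a terminal to the variables that derive it, and `pair_rules` maps a pair (B, C) to the variables A with a rule A → BC. The small grammar used at the end (S → AB | AT, T → SB, A → a, B → b) is illustrative only and does not come from the presentation. The sketch ignores the S → λ case and simply rejects the empty string.

```python
from itertools import product


def cyk_table(w, unit_rules, pair_rules):
    """Fill the CYK table for a non-empty string w.

    Returns a dict keyed by 1-based (i, j) whose value is the set of
    variables that derive the substring w[i..j].
    """
    n = len(w)
    table = {}
    # Bottom row: substrings of length 1, filled from rules A -> a.
    for i in range(1, n + 1):
        table[i, i] = set(unit_rules.get(w[i - 1], set()))
    # Remaining rows: lengths 2..n, each cell built from shorter cells.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):  # split into w[i..k] and w[k+1..j]
                for pair in product(table[i, k], table[k + 1, j]):
                    cell |= pair_rules.get(pair, set())
            table[i, j] = cell
    return table


def is_member(w, unit_rules, pair_rules, start="S"):
    """True if the start symbol appears in the cell for the whole string."""
    if not w:
        return False  # S -> λ is not handled in this sketch
    return start in cyk_table(w, unit_rules, pair_rules)[1, len(w)]


# Illustrative grammar (an assumption, not from the slides).
unit_rules = {"a": {"A"}, "b": {"B"}}
pair_rules = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}

print(is_member("aabb", unit_rules, pair_rules))
print(is_member("abab", unit_rules, pair_rules))
```

Keying the table by (i, j) keeps the code close to the $W_{i,j}$ notation; a triangular list of lists would work equally well.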
Table filling example
Search string: 'cbba'

$W_{1,4}$
$W_{1,3}$   $W_{2,4}$
$W_{1,2}$   $W_{2,3}$   $W_{3,4}$
{A, C}   {B}   {B}   {A}
c   b   b   a
• To fill the next row of the table, consider
$W_{i,j} = (W_{i,i}, W_{i+1,j}), (W_{i,i+1}, W_{i+2,j}), \dots, (W_{i,j-1}, W_{j,j})$
• $W_{1,2} = (W_{1,1}, W_{2,2})$ = {A, C}{B} = {AB, CB}
  Rules to form AB or CB = {S, C}
• $W_{2,3} = (W_{2,2}, W_{3,3})$ = {B}{B} = {BB}
  Rules to form BB = ∅
• $W_{3,4} = (W_{3,3}, W_{4,4})$ = {B}{A} = {BA}
  Rules to form BA = {C}
Table:
$W_{1,4}$
$W_{1,3}$   $W_{2,4}$
{S, C}   ∅   {C}
{A, C}   {B}   {B}   {A}
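The step of combining two cells can be sketched in Python. The grammar's rules are not listed in this text, so the `pair_rules` mapping below is a hypothetical reconstruction chosen only to agree with the results stated above (AB or CB gives {S, C}, BB gives ∅, BA gives {C}). In particular, attributing S to AB and C to CB is an assumption.

```python
def combine(left, right, pair_rules):
    """Variables A with a rule A -> BC for some B in left and some C in right."""
    result = set()
    for b in left:
        for c in right:
            result |= pair_rules.get((b, c), set())
    return result


# Hypothetical rules, reconstructed to match the stated results only.
pair_rules = {
    ("A", "B"): {"S"},  # assumed: S -> AB
    ("C", "B"): {"C"},  # assumed: C -> CB
    ("B", "A"): {"C"},  # assumed: C -> BA
}

# Bottom row of the table for 'cbba'.
row1 = [{"A", "C"}, {"B"}, {"B"}, {"A"}]

# Each length-2 cell has a single split, so it combines two neighbouring cells.
row2 = [combine(row1[i], row1[i + 1], pair_rules) for i in range(len(row1) - 1)]
print(row2)
```

An empty result corresponds to ∅ in the table: no rule produces that pair of variables.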
• $W_{1,3} = (W_{1,1}, W_{2,3}), (W_{1,2}, W_{3,3})$ = {A, C}∅ ∪ {S, C}{B} = {SB, CB}
  (concatenating with the empty cell $W_{2,3}$ = ∅ contributes nothing)
  Rules to form SB or CB = {C}
• $W_{2,4} = (W_{2,2}, W_{3,4}), (W_{2,3}, W_{4,4})$ = {B}{C} ∪ ∅{A} = {BC}
  Rules to form BC = {B}
Table:
$W_{1,4}$
{C}   {B}
{S, C}   ∅   {C}
{A, C}   {B}   {B}   {A}
• $W_{1,4} = (W_{1,1}, W_{2,4}), (W_{1,2}, W_{3,4}), (W_{1,3}, W_{4,4})$ = {A, C}{B} ∪ {S, C}{C} ∪ {C}{A} = {AB, CB, SC, CC, CA}
  Rules to form AB, CB, SC, CC, or CA = {S, C, A}
Final table:
{S, C, A}
{C}   {B}
{S, C}   ∅   {C}
{A, C}   {B}   {B}   {A}

The top cell represents the original string and contains the start symbol S.
Result: 'cbba' is a member of the language.
Design
• Read the input grammar from a file, or prompt the user to enter the rules
• Check whether the grammar is in CNF
• If the grammar is in CNF, start filling the table
• Output: 'String is a member of the input grammar' or 'String is not a member of the input grammar'
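The CNF check in this design can be sketched as below. The rule representation is an assumption for illustration: each rule is a (head, body) pair, where body is a tuple of symbols and the empty tuple stands for λ. The function name `is_cnf` is hypothetical.

```python
def is_cnf(rules, variables, terminals, start="S"):
    """Check that every rule has the form A -> BC, A -> a, or S -> λ,
    with B and C variables other than the start symbol."""
    for head, body in rules:
        if head not in variables:
            return False
        if len(body) == 0:
            # Only the start symbol may derive the empty string.
            if head != start:
                return False
        elif len(body) == 1:
            # A single symbol on the right must be a terminal.
            if body[0] not in terminals:
                return False
        elif len(body) == 2:
            # Two symbols must both be variables, and neither may be the start symbol.
            if any(sym not in variables or sym == start for sym in body):
                return False
        else:
            return False
    return True


variables = {"S", "A", "B"}
terminals = {"a", "b"}
good = [("S", ("A", "B")), ("A", ("a",)), ("B", ("b",)), ("S", ())]
bad = [("S", ("A", "B", "B")), ("A", ("a",)), ("B", ("b",))]
print(is_cnf(good, variables, terminals))
print(is_cnf(bad, variables, terminals))
```

Running this check before filling the table lets the program reject an unsuitable grammar early instead of producing a misleading membership answer.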
Questions