IMPLEMENTATION OF FINITE AUTOMATA IN CODE

There are several ways to translate either a DFA or an NFA into code. Consider again the example of a DFA that accepts identifiers consisting of a letter followed by a sequence of letters and/or digits, in its amended form that includes lookahead and the principle of longest substring.

IMPLEMENTATION OF FINITE AUTOMATA IN CODE (cont'd)

    start  -- letter  --> in_id
    in_id  -- letter  --> in_id
    in_id  -- digit   --> in_id
    in_id  -- {other} --> finish     (return ID)

Simulation of the DFA

    { starting in state 1 }
    if the next character is a letter then
        advance the input;
        { now in state 2 }
        while the next character is a letter or a digit do
            advance the input;    { stay in state 2 }
        end while;
        { go to state 3 without consuming input }
        accept
    else
        { error or other cases }
    end if;
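The pseudocode above can be sketched in C++ roughly as follows. This is a minimal illustration, not the slides' own code: the function name match_identifier and the choice to return a lexeme length are assumptions made for the example. The state is implicit in the position within the code, as in the pseudocode.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the length of the longest identifier that
// starts at input[pos], or 0 if input[pos] is not a letter.
std::size_t match_identifier(const std::string& input, std::size_t pos) {
    std::size_t i = pos;                                   // state 1 (start)
    if (i < input.size() &&
        std::isalpha(static_cast<unsigned char>(input[i]))) {
        ++i;                                               // advance: now in state 2
        while (i < input.size() &&
               std::isalnum(static_cast<unsigned char>(input[i]))) {
            ++i;                                           // stay in state 2
        }
        return i - pos;                                    // state 3: accept, lookahead not consumed
    }
    return 0;                                              // error or other cases
}
```

Because the loop only inspects the next character and never moves past a non-matching one, the lookahead character stays in the input, which gives the longest-substring behaviour the first slide describes.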

Constructing Transition Diagrams for Tokens

  • Transition diagrams (TDs) are used to represent the tokens.
  • As characters are read, the relevant TDs are used to attempt to match a lexeme to a pattern.

  • Each TD has:
      – States: represented by circles
      – Actions: represented by arrows between states
      – Start state: beginning of a pattern (arrowhead)
      – Final state(s): end of a pattern (concentric circles)
  • Each TD is deterministic: there is never a need to choose between two different actions.

Example TDs

  • Recognition of the relational operators >= and >:

        start --> 0 -- >     --> 6
                  6 -- =     --> 7        RTN(GE)
                  6 -- other --> 8 *      RTN(G)

    The * on state 8 means we have accepted ">" and have read one other character that must be unread (pushed back into the input stream).

Example: All RELOPs

    start --> 0 -- <     --> 1
              1 -- =     --> 2        return(relop, LE)
              1 -- >     --> 3        return(relop, NE)
              1 -- other --> 4 *      return(relop, LT)
              0 -- =     --> 5        return(relop, EQ)
              0 -- >     --> 6
              6 -- =     --> 7        return(relop, GE)
              6 -- other --> 8 *      return(relop, GT)
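One way to read this diagram as code is the C++ sketch below, with one case per state and the starred states giving back the lookahead character. The names Relop and scan_relop, and the convention of leaving pos unchanged on failure, are assumptions made for this illustration and do not come from the slides.

```cpp
#include <cstddef>
#include <string>

enum class Relop { LT, LE, EQ, NE, GT, GE, None };

// Walks the relop diagram from state 0, starting at input[pos].
// On success pos is left just past the lexeme; on failure pos is unchanged.
Relop scan_relop(const std::string& input, std::size_t& pos) {
    std::size_t forward = pos;
    // Reads the next character; '\0' at end of input acts as "other".
    auto next = [&]() -> char {
        char c = forward < input.size() ? input[forward] : '\0';
        ++forward;
        return c;
    };
    int state = 0;
    while (true) {
        switch (state) {
        case 0: {
            char c = next();
            if (c == '<') state = 1;
            else if (c == '=') state = 5;
            else if (c == '>') state = 6;
            else return Relop::None;                 // not a relop
            break;
        }
        case 1: {
            char c = next();
            if (c == '=') state = 2;
            else if (c == '>') state = 3;
            else state = 4;
            break;
        }
        case 6: {
            char c = next();
            state = (c == '=') ? 7 : 8;
            break;
        }
        case 2: pos = forward;     return Relop::LE;
        case 3: pos = forward;     return Relop::NE;
        case 4: pos = forward - 1; return Relop::LT;  // starred: unread the lookahead
        case 5: pos = forward;     return Relop::EQ;
        case 7: pos = forward;     return Relop::GE;
        case 8: pos = forward - 1; return Relop::GT;  // starred: unread the lookahead
        default: return Relop::None;                  // no such state in the diagram
        }
    }
}
```

For example, scanning "a >= b" from position 2 would return Relop::GE and leave the position at 4, while scanning "<a" from position 0 would return Relop::LT and leave the position at 1, so the "a" is still available to the next diagram.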

Example TDs: id

    start --> 9  -- letter          --> 10
              10 -- letter or digit --> 10
              10 -- other           --> 11 *     return(gettoken(), install_id())

Example TDs: Unsigned Numbers

    start --> 20 -- digit --> 21 -- . --> 22 -- digit --> 23 -- other --> 24 *     return(num, install_num())
              (states 21 and 23 loop on digit)

    start --> 25 -- digit --> 26 -- other --> 27 *
              (state 26 loops on digit)

Implementing Transition Diagrams

    class Scanner {
        char _la;    // The lookahead character

        Token nextToken() {
            startLexeme();    // reset window at start
            while (true) {
                switch (_state) {
                    case 0: {
                        _la = getChar();
                        if (_la == '<') _state = 1;
                        else if (_la == '=') _state = 5;
                        else if (_la == '>') _state = 6;
                        else failure(_state);
                    } break;
                    case 6: {
                        _la = getChar();
                        if (_la == '=') _state = 7;
                        else _state = 8;
                    } break;
                    case 7: {
                        return new Token(GEQUAL);
                    }
                    case 8: {
                        pushBack(_la);    // starred state: unread the lookahead
                        return new Token(GREATER);
                    }
                }
            }
        }
    }

Implementing Transition Diagrams

FUNCTIONS USED: nextchar(), forward, retract(), install_num(), install_id(), gettoken(), isdigit(), isletter(), recover()

    lexeme_beginning = forward;
    state = 0;

    token nexttoken()
    {
        while (1) {                 /* repeat until a "return" occurs */
            switch (state) {
            case 0:                 /* start state of the relop diagram */
                c = nextchar();     /* c is lookahead character */
                if (c == blank || c == tab || c == newline) {
                    state = 0;
                    lexeme_beginning++;    /* advance beginning of lexeme */
                }
                else if (c == '<') state = 1;
                else if (c == '=') state = 5;
                else if (c == '>') state = 6;
                else state = fail();
                break;
            ...                     /* cases 1-8 here */

The cases follow the relop transition diagram (states 0 to 8), in which states 4 and 8 are starred.

Implementing Transition Diagrams, II

    start --> 25 -- digit --> 26 -- other --> 27 *     (state 26 loops on digit)

            ...
            case 25: c = nextchar();                  /* advances forward */
                     if (isdigit(c)) state = 26;
                     else state = fail();
                     break;
            case 26: c = nextchar();
                     if (isdigit(c)) state = 26;
                     else state = 27;
                     break;
            case 27: retract(1);                      /* retracts forward */
                     lexical_value = install_num();   /* looks at the region lexeme_beginning ... forward */
                     return ( NUM );
            ...

Case numbers correspond to transition diagram states!

Implementing Transition Diagrams, III

    start --> 9 -- letter --> 10 -- other --> 11 *     (state 10 loops on letter or digit)

            ...
            case 9:  c = nextchar();
                     if (isletter(c)) state = 10;
                     else state = fail();
                     break;
            case 10: c = nextchar();
                     if (isletter(c)) state = 10;
                     else if (isdigit(c)) state = 10;
                     else state = 11;
                     break;
            case 11: retract(1);
                     lexical_value = install_id();
                     return ( gettoken(lexical_value) );   /* reads token name from ST */
            ...

When Failures Occur: Switch to the Next Transition Diagram

    int fail()
    {
        /* start holds the start state of the diagram currently being tried */
        forward = lexeme_beginning;
        switch (start) {
        case 0:  start = 9;  break;
        case 9:  start = 12; break;
        case 12: start = 20; break;
        case 20: start = 25; break;
        case 25: recover();  break;
        default: /* lex error */
        }
        return start;
    }
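To see why fail() must both rewind forward and move on to the next diagram, consider the input "12.x": diagram 20 reads "12." before failing in state 22, and only after the rewind can diagram 25 match "12". The C++ sketch below is a cut-down illustration under stated assumptions: the struct NumScanner, its member next_number, and the use of -1 for "no diagram left" are hypothetical, and only diagrams 20 and 25 are included.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical mini-scanner with only diagram 20 (digits '.' digits)
// and diagram 25 (digits).
struct NumScanner {
    std::string input;
    std::size_t lexeme_beginning = 0;
    std::size_t forward = 0;
    int start = 20;   // start state of the diagram currently being tried
    int state = 20;

    char nextchar() {  // '\0' at end of input acts as "other"
        char c = forward < input.size() ? input[forward] : '\0';
        ++forward;
        return c;
    }
    static bool digit(char c) {
        return std::isdigit(static_cast<unsigned char>(c)) != 0;
    }

    // Rewinds the input and selects the next diagram; -1 means none is left.
    int fail() {
        forward = lexeme_beginning;
        switch (start) {
        case 20: start = 25; break;
        default: start = -1; break;   // lexical error
        }
        return start;
    }

    // Returns the matched lexeme, or "" if neither diagram matches.
    std::string next_number() {
        lexeme_beginning = forward;
        start = state = 20;
        while (true) {
            char c;
            switch (state) {
            case 20: c = nextchar(); state = digit(c) ? 21 : fail(); break;
            case 21: c = nextchar();
                     state = digit(c) ? 21 : (c == '.' ? 22 : fail());
                     break;
            case 22: c = nextchar(); state = digit(c) ? 23 : fail(); break;
            case 23: c = nextchar(); state = digit(c) ? 23 : 24; break;
            case 25: c = nextchar(); state = digit(c) ? 26 : fail(); break;
            case 26: c = nextchar(); state = digit(c) ? 26 : 27; break;
            case 24:
            case 27:
                --forward;            // retract(1): unread the lookahead
                return input.substr(lexeme_beginning, forward - lexeme_beginning);
            default:
                forward = lexeme_beginning;
                return "";            // no diagram matched
            }
        }
    }
};
```

Note that the failure in state 22 happens in the middle of diagram 20, which is why the sketch tracks the diagram's start state separately from the current state.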

What Else Does the Lexical Analyzer Do?

All keywords / reserved words are matched as ids.

  • After the match, the symbol table or a special keyword table is consulted.
  • The keyword table contains string versions of all keywords and their associated token values:

        if       15
        then     16
        begin    17
        ...

  • When a match is found, the token is returned along with its symbolic value, e.g., "then", 16.
  • If a match is not found, it is assumed that an id has been discovered.
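The keyword lookup described above might be sketched in C++ as follows. The token values 15, 16 and 17 are the ones shown in the table; the names keyword_table and classify, the value chosen for ID, and the use of std::unordered_map are assumptions made for this example rather than anything the slides prescribe.

```cpp
#include <string>
#include <unordered_map>

constexpr int ID = 1;   // hypothetical token value for identifiers

// Keyword table: string versions of the keywords with their token values.
const std::unordered_map<std::string, int>& keyword_table() {
    static const std::unordered_map<std::string, int> table = {
        {"if", 15}, {"then", 16}, {"begin", 17},
    };
    return table;
}

// Called after the id diagram has matched `lexeme`:
// returns the keyword's token value, or ID if the lexeme is not a keyword.
int classify(const std::string& lexeme) {
    auto it = keyword_table().find(lexeme);
    return it != keyword_table().end() ? it->second : ID;
}
```

With this approach the transition diagrams need no extra states for keywords: classify("then") yields 16, while a lexeme such as "count" falls through to ID.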

ASSIGNMENT

Design a nondeterministic finite automaton in C++.
