Digital State Machines Finite Automata Regular Languages Chapter
- Slides: 126
Digital State Machines Finite Automata & Regular Languages
Chapter Outline
- Introduction
- Finite-State Automata
- Regular Languages and Finite-State Automata
- Summary
04 December 2020 Veton Këpuska
Introduction: Finite State Automata
- The finite-state automaton (FSA) is one of the most significant tools of computational linguistics. Its variations:
  - Finite-state transducers,
  - Hidden Markov Models, and
  - N-gram grammars
  are important components of speech recognition and synthesis, spell-checking, and information-extraction applications.
- FSA theory was developed in the early days of computer science as a model of abstract computing machines, pioneered by the work of Alan Turing.
  - FSAs are devices that accept (recognize) or reject an input stream of characters.
  - FSAs are very efficient in terms of speed and memory.
- The most frequent use of finite-state automata is searching for words or phrases. Additional application areas include:
  - Morphological parsing,
  - Part-of-speech annotation, and
  - Speech processing and recognition.
Example of Finite State Automata
- This FSA accepts (recognizes) or generates strings like:
  - ac
  - abbc
  - abbbc, abbbbbc, etc.
Introduction: D-FSA vs. ND-FSA
- Adding non-determinism to FSAs does not allow us to define any language that cannot be defined by a deterministic FSA.
- Why then bother with ND-FSAs?
  - It turns out that there can be substantial efficiency in describing an application using ND-FSAs.
  - ND-FSAs allow us to program solutions to problems using a higher-level language.
  - This program is then compiled, by an algorithm that we will learn in this chapter, into a deterministic FSA that can be executed on a conventional computer.
Finite State Automata An Informal Description of Finite State Automata
Finite Automata
- We study an extended example of a real-world problem whose solution uses finite automata.
- We investigate protocols that support “electronic money”: files that
  - a customer can use to pay for goods on the internet, and
  - a seller can receive with assurance that the “money” is real. The seller must know that the file has not been forged, and that it has not been copied and sent to the seller while the customer retains a copy of the same file to spend again.
- Nonforgeability of the file must be guaranteed by a third party (a bank) and by a cryptography policy.
  - Encryption of the money files ensures that forgery is not a problem.
  - The bank must also keep a database of all the valid money that it has issued, so that it can verify to a store that the file it has received represents real money and can be credited to the store’s account.
  - Encryption is not addressed here, as it is beyond the scope of the topic covered in this class.
Finite Automata
- Nevertheless, in order to use electronic money, protocols need to be devised that allow the money to be manipulated in the variety of ways that users want.
  - Monetary systems always invite fraud, so we must verify whatever policy is adopted regarding how money is used.
  - The solution needs to ensure that the only things that can happen are things we intend to happen: an unscrupulous user must not be able to steal from others or to “manufacture” money.
The Ground Rules
- The participants:
  - The customer
  - The store
  - The bank
- Only one money file is in existence (for simplicity).
- The customer may:
  - Pay, which initiates transfer of “his” money file to the store, or
  - Cancel the transfer, effectively asking the bank to place the money back in the customer’s account.
- The store may:
  - Ship goods to the customer,
  - Redeem the money, effectively asking the bank to transfer the money to the store’s account.
- The bank may:
  - Transfer the money by creating a new, suitably encrypted money file and sending it to the store.
The Protocol
- The customer:
  - Assume that the customer cannot be relied upon to act responsibly.
    - The customer may try to copy the money file,
    - use the same money file to pay several times, or both.
- The bank:
  - Assume that the bank must behave responsibly, or it cannot be a bank.
    - It must ensure that two stores cannot both redeem the same money file.
    - It will not allow a money file to be both canceled and redeemed.
- The store:
  - It will not ship goods until it is sure it has been given valid money.
The Protocol
- FSAs can represent protocols such as the one being discussed.
  - States represent each possible “state” (situation) that each participant could be in.
    - A state remembers which important events have happened, and
    - which ones have not yet happened.
  - Transitions occur between states when one of the five events described previously occurs.
FSAs for Money Transfer Example
Bank:
- The beginning state is state “1”.
  - The bank has issued a money file.
  - No requests have been made to either redeem it or cancel it.
- Cancel request
  - The bank restores the money and enters state 2.
  - The bank cannot leave state 2, since it cannot allow the same money to be canceled again or to be spent by the customer.
- Redeem request
  - The bank enters state 3, and
  - initiates the transfer and, upon completion, enters state 4.
  - In state 4 it will no longer accept cancel or redeem requests, nor will it perform any other transactions regarding this particular money file.
FSAs for Money Transfer Example
Store: procedures in the store are assumed to be imperfect.
- The beginning state is state “a”.
- Pay request
  - The customer orders the goods by performing the pay action.
  - The store enters state “b” and initiates both the shipping and the redemption process.
- Ship and redeem requests
  - These may happen in either order: the store enters state c or d, and
  - then initiates redeem/transfer or ship and enters state e/f or e.
Customer:
- Pay and cancel requests
  - The customer can do them any number of times and in any order.
Enabling Automata to Ignore Actions
- Missing transitions:
  - The store is not affected by a “cancel” action.
  - According to the formal definition of an FSA (next), whenever an input X is received by an automaton, the automaton must follow an arc labeled X from the state that it is in to a new state.
  - The store FSA must therefore be augmented with transitions that correspond to “cancel” actions.
- Effects of unexpected actions:
  - Suppose the customer executes the “pay” action a second time while the store is in state e.
  - Since the store automaton does not have an arc corresponding to the pay action in that state, this will cause the FSA to “die”.
- The two kinds of actions that must be ignored by the FSAs:
  1. Actions that are irrelevant to the participant involved.
     - For the store FSA: “cancel”.
     - For the bank FSA: “pay” and “ship”.
     - For the customer FSA: “ship”, “redeem”, and “transfer”.
  2. Actions that must not be allowed to kill an automaton.
     - For the store FSA: the customer’s second “pay” or “cancel” actions should not be allowed to kill its FSA.
     - For the bank FSA: the store’s multiple “redeem” actions should be ignored.
Completed FSA’s
Complete System as FSA
- The previous models accounted for the actions of each participant independently.
  - The customer’s FSA is simple: no matter what actions are taken, it resides in the same state.
  - The bank’s and the store’s FSAs are complex, and it is not immediately obvious in what combinations of states these two automata can be.
- Product automaton:
  - The normal way to explore the interaction of automata is to construct the product automaton.
  - The states of the new product FSA are pairs of states, one from each original FSA: for example, state (3, d) denotes the situation where the bank is in state 3 and the store is in state d.
  - Bank = 4 states, Store = 7 states, Product FSA = 4 × 7 = 28 states.
Product Automaton for the Store and Bank
Product Automaton
- Each of the two components of the product automaton independently makes transitions on the various inputs.
- If an input action is received and one of the two automata has no state to go to on that input, then the product automaton “dies”; it has no state to go to.
- Formal rule:
  - Assume the (bank, store) product automaton is in state (i, x). Let Z be one of the input actions.
  - Observe whether there is a transition from state i under input Z; suppose there is a transition to state j. Similarly, observe whether there is a transition from state x under the same input Z to state y.
  - Then there is a transition from (i, x) to (j, y) under input Z. If either of the states j or y does not exist, then there is no transition arc labeled Z from (i, x).
- Example:
  - Consider the input redeem. If the bank receives a redeem message when in state 1, it goes to state 3. If it is in state 3 or 4, it stays there. If it is in state 2, the bank automaton dies.
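The formal rule above can be sketched in a few lines of Python. This is a minimal illustration, not the construction used to draw the slides' figure: the function name `product`, the dictionary encoding, and the reduced alphabet are assumptions. The bank arcs follow the slides, with arcs the slides do not state left undefined; the store is a hypothetical, simplified fragment with shipping omitted.

```python
def product(delta_a, delta_b, start_a, start_b, alphabet):
    """Build the reachable part of the product of two partial automata.

    Each delta maps (state, input) -> next state. A missing key means the
    component automaton has no arc for that input and "dies".
    """
    start = (start_a, start_b)
    delta = {}
    reachable = {start}
    frontier = [start]
    while frontier:
        i, x = frontier.pop()
        for z in alphabet:
            j = delta_a.get((i, z))
            y = delta_b.get((x, z))
            if j is None or y is None:
                continue  # one component dies on z, so the product has no arc
            delta[((i, x), z)] = (j, y)
            if (j, y) not in reachable:
                reachable.add((j, y))
                frontier.append((j, y))
    return delta, reachable


ALPHABET = ["pay", "cancel", "redeem", "transfer"]

# Bank arcs as described in the slides, plus self-loops on the ignored "pay".
# Arcs the slides do not state (e.g. cancel in state 2) are left undefined.
BANK = {
    (1, "pay"): 1, (1, "cancel"): 2, (1, "redeem"): 3,
    (2, "pay"): 2,
    (3, "pay"): 3, (3, "redeem"): 3, (3, "transfer"): 4,
    (4, "pay"): 4, (4, "redeem"): 4,
}

# Hypothetical, simplified store (shipping omitted), for illustration only.
STORE = {
    ("a", "pay"): "b", ("a", "cancel"): "a",
    ("b", "pay"): "b", ("b", "cancel"): "b", ("b", "redeem"): "d",
    ("d", "pay"): "d", ("d", "cancel"): "d", ("d", "transfer"): "f",
    ("f", "pay"): "f", ("f", "cancel"): "f",
}

delta, reachable = product(BANK, STORE, 1, "a", ALPHABET)
print(sorted(reachable))
```

With these toy automata only 6 of the 4 × 4 = 16 possible pairs are reachable, which mirrors the observation that most product states are never entered.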
Using Product Automaton to Validate the Protocol
- Only 10 states are accessible from the start state.
- Example of states that are not accessible.
- The real purpose of analyzing a protocol such as this one using automata is to ask and answer questions that mean: “Can the following type of error occur?”
- Example: “Is it possible that the store can ship goods and never get paid?”
  - That is, can the store be in state c, e, or g while no transition on input T (transfer) was ever made?
  - Problem state: (2, c)
[Figure: product automaton with its 10 accessible states numbered 1 to 10 and the problem state marked “?”]
Deterministic Finite State Automaton Formalism of a Deterministic Finite State Automaton
Deterministic Finite State Automaton
- “Deterministic” refers to the fact that on each input there is one and only one state to which the automaton can transition from its current state.
- A non-deterministic automaton can transition from its present state to more than one state on the same input.
Definition of D-FSA
- A deterministic finite state automaton consists of:
  1. A finite set of states, Q.
  2. A finite set of input symbols, Σ.
  3. A transition function, δ, that takes as arguments a state and an input symbol and returns a state: δ: Q × Σ → Q.
  4. A start state, q0, one of the states in Q.
  5. A set of final, or accepting, states F, with F ⊆ Q.
- Five-tuple notation of a D-FSA named A: A = (Q, Σ, δ, q0, F)
Formal Definition of Automaton
- Q = {q0, q1, …, qN−1}: a finite set of N states
- Σ: a finite input alphabet of symbols
- q0: the start state
- F: the set of final states, F ⊆ Q
- δ(q, i): the transition function or transition matrix between states. Given a state q ∈ Q and an input symbol i ∈ Σ, δ(q, i) returns a new state q′ ∈ Q. δ is thus a relation from Q × Σ to Q.
String Processing with D-FSA
- Suppose a_1 a_2 … a_n is a sequence of input symbols.
- The initial state of the D-FSA is its start state q_0; then
  1. q_1 = δ(q_0, a_1)
  2. q_2 = δ(q_1, a_2)
  …
  i. q_i = δ(q_{i-1}, a_i)
  …
  n. q_n = δ(q_{n-1}, a_n)
- If q_n ∈ F, then the input sequence a_1 a_2 … a_n is “accepted”; otherwise it is “rejected”.
D-FSA Example
- Using an FSA to recognize sheeptalk: “baa…!”
FSA Use
- The FSA can be used for recognizing (we also say accepting) strings in the following way. First, think of the input as being written on a long tape broken up into cells, with one symbol written in each cell of the tape, as in the figure below.
Recognition Process
- The machine starts in the start state (q0) and iterates the following process:
  1. Check the next letter of the input. If it matches the symbol on an arc leaving the current state, then
     i. cross that arc,
     ii. move to the next state, and
     iii. advance one symbol in the input.
  2. If we are in the accepting state (q4) when we run out of input, the machine has successfully recognized an instance of sheeptalk.
- If the machine never gets to the final state,
  a. either because it runs out of input,
  b. or it gets some input that doesn’t match an arc (as in the figure on the previous slide),
  c. or it just happens to get stuck in some non-final state,
  we say the machine rejects or fails to accept an input.
FSA for “SheepTalk” Example
- Q = {q0, q1, q2, q3, q4},
- Σ = {a, b, !}, // Sheep language alphabet
- F = {q4}, and
- δ(q, i) // Defined in the next slide
State Transition Table

         Input
State    b    a    !
→ 0      1    Ø    Ø
  1      Ø    2    Ø
  2      Ø    3    Ø
  3      Ø    3    4
 *4      Ø    Ø    Ø

We’ve marked state 4 with a * to indicate that it’s a final/accepting state (you can have as many final states as you want), and the Ø indicates an illegal or missing transition. We can read the first row as “if we’re in state 0 and we see the input b we must go to state 1. If we’re in state 0 and we see the input a or !, we fail”.
Deterministic Algorithm for Recognizing a String

function D-RECOGNIZE(tape, machine) returns accept or reject
  index ← Beginning of tape
  current-state ← Initial state of machine
  loop
    if End of input has been reached then
      if current-state is an accept state then
        return accept
      else
        return reject
    elsif transition-table[current-state, tape[index]] is empty then
      return reject
    else
      current-state ← transition-table[current-state, tape[index]]
      index ← index + 1
  end
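As a rough Python counterpart to D-RECOGNIZE, the sketch below encodes the sheeptalk transition table as a dictionary in which missing keys stand for Ø entries. The names `SHEEP_TABLE`, `SHEEP_ACCEPT`, and `d_recognize` are illustrative assumptions, not part of the original pseudocode.

```python
# Transition table for the sheeptalk FSA; absent keys play the role of Ø.
SHEEP_TABLE = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,
    (3, "!"): 4,
}
SHEEP_ACCEPT = {4}


def d_recognize(tape, table, start, accept_states):
    """Return True (accept) or False (reject) for the input string `tape`."""
    index = 0
    current_state = start
    while True:
        if index == len(tape):          # end of input has been reached
            return current_state in accept_states
        key = (current_state, tape[index])
        if key not in table:            # empty table entry: reject
            return False
        current_state = table[key]
        index += 1


print(d_recognize("baaa!", SHEEP_TABLE, 0, SHEEP_ACCEPT))  # True
print(d_recognize("abc", SHEEP_TABLE, 0, SHEEP_ACCEPT))    # False
```

Using a dictionary rather than a dense matrix keeps the Ø entries implicit, so a symbol outside the alphabet (such as c) is rejected by the same lookup.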
Tracing Execution for Some Sheep Talk
Before examining the beginning of the tape, the machine is in state q0. Finding a b on the input tape, it changes to state q1, as indicated by the contents of transition-table[q0, b]. It then finds an a and switches to state q2; another a puts it in state q3; a third a leaves it in state q3, where it reads the “!” and switches to state q4. Since there is no more input, the End of input condition at the beginning of the loop is satisfied for the first time and the machine halts in q4. State q4 is an accepting state, and so the machine has accepted the string baaa! as a sentence in the sheep language.
Fail State
- The algorithm will fail whenever there is no legal transition for a given combination of state and input. The input abc will fail to be recognized since there is no legal transition out of state q0 on the input a (i.e., this entry of the transition table has a Ø).
- Even if the automaton had allowed an initial a, it would certainly have failed on c, since c isn’t even in the sheeptalk alphabet! We can think of these “empty” elements in the table as if they all pointed at one “empty” state, which we might call the fail state or sink state.
- In a sense, then, we could view any machine with empty transitions as if we had augmented it with a fail state and drawn in all the extra arcs, so that we always had somewhere to go from any state on any possible input. Just for completeness, the next figure shows the FSA from the previous figure with the fail state qF filled in.
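The idea of filling every empty table entry with a single sink can be sketched as follows. The helper names `add_fail_state` and `run` are assumptions made for illustration, and the sketch only handles input symbols that belong to the stated alphabet.

```python
def add_fail_state(table, states, alphabet, fail="qF"):
    """Return a total transition table: every missing entry goes to `fail`."""
    total = dict(table)
    for state in list(states) + [fail]:
        for symbol in alphabet:
            # Keep existing arcs; send everything else to the sink state.
            total.setdefault((state, symbol), fail)
    return total


def run(table, start, text):
    """Follow one arc per symbol; a total table always has somewhere to go."""
    state = start
    for symbol in text:
        state = table[(state, symbol)]
    return state


partial = {
    ("q0", "b"): "q1", ("q1", "a"): "q2", ("q2", "a"): "q3",
    ("q3", "a"): "q3", ("q3", "!"): "q4",
}
states = ["q0", "q1", "q2", "q3", "q4"]
alphabet = ["a", "b", "!"]
total = add_fail_state(partial, states, alphabet)

print(run(total, "q0", "baa!"))  # q4
print(run(total, "q0", "ab!"))   # qF
```

Because the sink loops to itself on every symbol, an input that has failed once can never reach the accepting state again.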
Adding a Fail State to FSA
Example
- Suppose we have a D-FSA that accepts all and only the strings of 0’s and 1’s that have the sequence 01 somewhere in the string. We can write this language L as follows:
  {w | w is of the form x01y for some strings x and y consisting of 0’s and 1’s}
- An equivalent description is:
  {x01y | x and y are any strings of 0’s and 1’s}
- Example strings in this language L include 01, 110110, and 100011.
- Example strings not in this language L are ∊, 0, and 111000.
Example
- What can be said about the D-FSA A that accepts this language L?
  - Its input alphabet is Σ = {0, 1}.
  - It has a (yet unknown) number of states, one of which, say q0, is the start state.
  - It has to remember some important facts about what inputs it has seen so far. This is necessary to decide whether 01 is a substring of the input.
- A needs to remember:
  1. Has it already seen 01? If yes, then it will be in an accepting state from now on.
  2. Has it not yet seen 01, but its most recent input was 0? Then if it now sees a 1, it will have seen 01 and can accept everything it sees from here on.
  3. Has it not yet seen 01, and its last input was either nonexistent (it just started) or a 1? In this case A cannot accept until it first sees a 0 and then sees a 1 immediately after.
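Example
- Each condition presented on the previous slide can be represented by a state.
  - Condition (3) is represented by the start (first) state q0. On input 1 the automaton stays in q0.
    [Diagram: state q0 with a self-loop labeled 1]
  - If we are in state q0 and the next input is “0”, we are then governed by condition (2), represented by state q2. On further 0’s the automaton stays in q2.
    [Diagram: q0 (self-loop on 1) with an arc labeled 0 to q2 (self-loop on 0)]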
Example
- If we are in the state for condition (2), that is q2, and we receive input “1”, the FSA should move to the accepting state, which we choose to name q1.
- Finally, in the accepting state q1 no combination of 0’s and 1’s should change the state.
- Thus Q = {q0, q1, q2} and F = {q1}.
  [Diagram: q0 (self-loop on 1) goes to q2 on 0; q2 (self-loop on 0) goes to q1 on 1; q1 has a self-loop on 0, 1]
- A = ({q0, q1, q2}, {0, 1}, δ, q0, {q1})
Simpler Notations for D-FSA
- A five-tuple with a detailed description of the δ transitions is both tedious and hard to read. There are two preferred notations:
  1. A transition diagram, which is a graph such as the ones we have seen previously.
  2. A transition table, which is a tabular listing of the δ function and, by implication, gives the set of states and the input alphabet.
Transition Diagrams
- A transition diagram for an FSA A = (Q, Σ, δ, q0, F) is a graph defined as follows:
  1. For each state in Q there is a node.
  2. For each state q in Q and each input symbol a in Σ, let δ(q, a) = p. The transition diagram has an arc from node q to node p, labeled a. If there are several input symbols that cause transitions from q to p, then the transition diagram can have one arc, labeled by the list of these symbols.
  3. There is an arrow into the start state q0, labeled Start.
  4. Nodes corresponding to accepting states (the set F) are marked with a double circle.
Example
A = (Q, Σ, δ, q0, F)
A = ({q0, q1, q2}, {0, 1}, δ, q0, {q1})
Transition Tables
- A transition table is a conventional, tabular representation of a function like δ that takes two arguments and returns a value.
  - Rows correspond to states.
  - Columns correspond to inputs.

         Input
State    0     1
→ q0     q2    q0
 *q1     q1    q1
  q2     q2    q1

Transition table for the D-FSA of the previous example
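One way to check the table against the language description is to encode it and compare the automaton with a direct substring test. The sketch below does this in Python; the names `DELTA`, `START`, `ACCEPTING`, and `accepts` are illustrative assumptions.

```python
# The D-FSA for "contains 01": q0 = start, q2 = last symbol was 0, q1 = seen 01.
DELTA = {
    ("q0", "0"): "q2", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q1",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}
START, ACCEPTING = "q0", {"q1"}


def accepts(w):
    """Run the D-FSA on w and report whether it ends in an accepting state."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING


for w in ["01", "110110", "100011", "", "0", "111000"]:
    # The second column is the direct definition of L: 01 occurs in w.
    print(repr(w), accepts(w), "01" in w)
```

The two printed columns agree on every example string from the slides, including the empty string.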
Extending the Transition Function to Strings
- A D-FSA defines a language:
  - The set of all strings that result in a sequence of state transitions from the start state to an accepting state, or alternatively,
  - in terms of the transition diagram, the set of labels along all the paths that lead from the start state to any accepting state.
- To formulate precisely the notion of the language of a D-FSA:
  - Define the extended transition function $\hat{\delta}$ of δ.
  - It describes what happens when we start in any state and follow any sequence of inputs.
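Definition of Extended Transition Function
BASIS:
- $\hat{\delta}(q, \varepsilon) = q$. If we are in state q and read no inputs, then we are still in state q.
INDUCTION:
- Suppose w is a string of the form xa; that is, a is the last symbol of w and x is the string consisting of all but the last symbol. For example, w = 1101 is broken into x = 110 and a = 1.
- Then $\hat{\delta}(q, w) = \delta(\hat{\delta}(q, x), a)$.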
Example
- Design a D-FSA to accept the language:
  L = {w | w has both an even number of 0’s and an even number of 1’s}
- Solution:
  - Use states to count how many 0’s and 1’s have been seen. Since evenness only requires counting modulo 2, we need two parity values for each symbol of the alphabet, for a total of 2 × 2 = 4 states.
  - Σ = {0, 1}
  - Q = {q0, q1, q2, q3}
  - q0: the numbers of 0’s and of 1’s seen so far are both even. This is the accepting state; F = {q0}.
  - q1: the number of 0’s seen so far is even and the number of 1’s is odd.
  - q2: the number of 0’s seen so far is odd and the number of 1’s is even.
  - q3: the numbers of 0’s and of 1’s seen so far are both odd.
Transition Diagram of D-FSA
Transition Table

          Input
State     0     1
*→ q0     q2    q1
   q1     q3    q0
   q2     q0    q3
   q3     q1    q2
Test
- The check involves computing $\hat{\delta}(q_0, w)$ for an input, say w = 110101, starting from ∊ and extending the prefix one symbol at a time.
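The check can be carried out mechanically. The sketch below implements the extended transition function by the basis/induction recursion and applies it to every prefix of w = 110101 on the even-0's, even-1's automaton. The transition entries are derived from the parity meaning of the four states; the names `PARITY_DELTA` and `delta_hat` are assumptions for illustration.

```python
# Reading 0 flips the parity of 0's (q0<->q2, q1<->q3);
# reading 1 flips the parity of 1's (q0<->q1, q2<->q3).
PARITY_DELTA = {
    ("q0", "0"): "q2", ("q0", "1"): "q1",
    ("q1", "0"): "q3", ("q1", "1"): "q0",
    ("q2", "0"): "q0", ("q2", "1"): "q3",
    ("q3", "0"): "q1", ("q3", "1"): "q2",
}


def delta_hat(delta, q, w):
    """Extended transition function, following the inductive definition."""
    if w == "":                      # BASIS: reading no input leaves us in q
        return q
    x, a = w[:-1], w[-1]             # INDUCTION: w = xa
    return delta[(delta_hat(delta, q, x), a)]


w = "110101"
for n in range(len(w) + 1):
    prefix = w[:n]
    print(repr(prefix), delta_hat(PARITY_DELTA, "q0", prefix))
```

The last line printed shows state q0, the accepting state, which matches the fact that 110101 contains two 0's and four 1's.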
Formal Languages
- Key Concept #1. Formal Language:
  - A model which can both generate and recognize all and only the strings of a formal language acts as a definition of the formal language.
  - A formal language is a set of strings, each string composed of symbols from a finite symbol set called an alphabet (the same alphabet used above for defining an automaton!).
- The alphabet for the “sheep” language is the set Σ = {a, b, !}.
- Given a model m (such as an FSA), we can use L(m) to mean “the formal language characterized by m”.
- L(m) = {baa!, baaa!, baaaa!, baaaaa!, …}
The Formal Language Defined by D-FSA
- The language defined by a D-FSA A = (Q, Σ, δ, q0, F), denoted L(A), is defined as:
  $L(A) = \{\, w \mid \hat{\delta}(q_0, w) \in F \,\}$
- That is, the language of A is the set of strings w that take the start state q0 to one of the accepting states of the D-FSA. If L is L(A) for some D-FSA A, then we say L is a regular language.
Homework:
- 2.2.1, 2.2.2, 2.2.3, 2.2.4, 2.2.5, 2.2.6, 2.2.7, 2.2.8, 2.2.9, 2.2.10
Nondeterministic Finite State Automaton
An Informal View of N-FSA
- An N-FSA has
  - a finite set of states,
  - a finite set of input symbols,
  - one start state, and
  - one or more accepting states.
- An N-FSA differs from a D-FSA in the type of δ:
  - It takes a state and an input symbol as arguments and returns a set of zero, one, or more states.
Deterministic vs Non-Deterministic FSAs: Example
[Figures: Deterministic FSA; Non-Deterministic FSA]
Deterministic vs Non-deterministic FSA
- A deterministic FSA is one whose behavior during recognition is fully determined by the state it is in and the symbol it is looking at.
- In the non-deterministic FSA on the previous slide, when the FSA is in state q2 and the input symbol is a, we do not know whether to remain in state q2 (self-loop transition) or to move to state q3. Clearly the decision depends on the next input symbols.
Non-Deterministic FSA (N-FSA)
- The power of an N-FSA is in its ability to be in more than one state at the same time.
  - This is useful in string matching, where the N-FSA “guesses” that it is seeing the beginning of a string.
  - An example of this is introduced later.
- We define the nondeterministic FSA and show that every language it accepts is also accepted by some D-FSA.
  - N-FSAs accept exactly the regular languages, just as D-FSAs do.
  - N-FSAs are often more succinct and easier to design.
  - N-FSAs can be converted to D-FSAs.
N-FSA Example
- The figure below depicts an N-FSA. It accepts all and only the strings of 0’s and 1’s that end in “01”.
- On “0” in state q0 the N-FSA can transition to both q0 and q1 (in anticipation of the next “1”).
- An example of the state sequence for the input 00101 is presented on the next slide.
State Transitions for 00101
Definition of Nondeterministic Finite State Automaton
N-FSA: A = (Q, Σ, δ, q0, F)
1. Q: a finite set of states
2. Σ: a finite set of input symbols
3. q0: the start state, a member of Q
4. F: a set of final/accepting states, a subset of Q
5. δ: the transition function, which takes a state in Q and an input symbol in Σ as arguments and returns a set of states, a subset of Q
N-FSA Definition

          Input
State     0           1
→ q0      {q0, q1}    {q0}
  q1      ∅           {q2}
 *q2      ∅           ∅
The Extended Transition Function for N-FSA
- As for D-FSAs, we need to extend the transition function δ of an N-FSA to a function $\hat{\delta}$ that takes a state q and a string of input symbols w, and returns the set of states that the N-FSA is in if it starts in state q and processes the string w.
- The slide “State Transitions for 00101” depicts the function $\hat{\delta}$ for that input.
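Definition of Extended Transition Function
BASIS:
- $\hat{\delta}(q, \varepsilon) = \{q\}$. If we are in state q and read no inputs, then we are still in state q.
INDUCTION:
- Suppose w is a string of the form xa, where a is the final symbol of w and x is the rest.
- Also suppose $\hat{\delta}(q, x) = \{p_1, p_2, \ldots, p_k\}$.
- Let $\bigcup_{i=1}^{k} \delta(p_i, a) = \{r_1, r_2, \ldots, r_m\}$.
- Then $\hat{\delta}(q, w) = \{r_1, r_2, \ldots, r_m\}$.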
Example
- Describe the processing of the N-FSA for w = 00101 by computing $\hat{\delta}(q_0, w)$, starting from ∊.
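The processing of 00101 can be reproduced with a short set-based simulation of the "ends in 01" N-FSA. This is a sketch under the assumption that δ is stored as a dictionary from (state, symbol) to a set of states, with missing keys standing for ∅; the names `NFA_DELTA`, `NFA_ACCEPTING`, and `nfa_delta_hat` are illustrative.

```python
NFA_DELTA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
NFA_ACCEPTING = {"q2"}


def nfa_delta_hat(delta, q, w):
    """Set of states the N-FSA can be in after reading w from state q."""
    states = {q}                                  # BASIS: only q itself
    for a in w:                                   # INDUCTION: one symbol at a time
        # Union of delta(p, a) over every state p we might currently be in.
        states = set().union(*(delta.get((p, a), set()) for p in states))
    return states


w = "00101"
for n in range(len(w) + 1):
    print(repr(w[:n]), sorted(nfa_delta_hat(NFA_DELTA, "q0", w[:n])))

# w is accepted if the final set contains at least one accepting state.
print(bool(nfa_delta_hat(NFA_DELTA, "q0", w) & NFA_ACCEPTING))  # True
```

Branches that "die" simply contribute nothing to the union, which is why a dead choice never prevents acceptance.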
The Language of an N-FSA
- An N-FSA accepts a string w if it is possible to make some sequence of choices of next state, while reading the symbols of w, that goes from the start state to any accepting state.
  - The fact that other choices on the same input w lead to a non-accepting state, or do not lead to any state at all (i.e., the sequence of states “dies”), does not prevent w from being accepted by the N-FSA as a whole.
- Formally: if A = (Q, Σ, δ, q0, F) is an N-FSA, then
  $L(A) = \{\, w \mid \hat{\delta}(q_0, w) \cap F \neq \emptyset \,\}$
  That is, L(A) is the set of strings w in Σ* such that $\hat{\delta}(q_0, w)$ contains at least one accepting state.
Example
- Formally prove that the N-FSA of the previous example accepts the language L = {w | w ends in 01}.
- The proof is a mutual induction on the following three statements, which characterize the three states:
  1. $\hat{\delta}(q_0, w)$ contains q0 for every w.
  2. $\hat{\delta}(q_0, w)$ contains q1 if and only if w ends in 0.
  3. $\hat{\delta}(q_0, w)$ contains q2 if and only if w ends in 01.
- To prove the above statements, we need to consider how the N-FSA A can reach each state, i.e.,
  - what the last input symbol was, and
  - what state A was in just before reading that symbol.
- The proof of the theorem is an induction on |w|, the length of w, starting from length 0.
Proof (Basis)
BASIS: If |w| = 0, then w = ∊.
- Statement (1) follows from the basis definition: $\hat{\delta}(q_0, \varepsilon) = \{q_0\}$.
- Statement (2) is true because:
  - ∊ is the empty string and does not end in 0, and
  - by the basis definition above, $\hat{\delta}(q_0, \varepsilon)$ does not contain q1.
- Statement (3) is true because:
  - ∊ is the empty string and does not end in 01, and
  - by the basis definition above, $\hat{\delta}(q_0, \varepsilon)$ does not contain q2.
Proof (Statement 1)
INDUCTION: Assume w = xa, where a is a symbol of the alphabet (i.e., 0 or 1). We assume that statements (1)-(3) are true for x, and we need to prove them for w = xa. Thus, assume |w| = n + 1 and |x| = n.
1. If $\hat{\delta}(q_0, x)$ contains q0 (the inductive hypothesis), then, since q0 has transitions to itself on both 0 and 1, $\hat{\delta}(q_0, w)$ also contains q0. Thus statement (1) holds.
Proof (Statement 2)
INDUCTION: Assume w = xa, where a is a symbol of the alphabet (i.e., 0 or 1). We assume that statements (1)-(3) are true for x, and we need to prove them for w = xa. Thus, assume |w| = n + 1 and |x| = n.
2. (If) Assume that a = 0 (w ends in 0). By statement (1) applied to x, $\hat{\delta}(q_0, x)$ contains q0. Since there is a transition from q0 to q1 on input 0, $\hat{\delta}(q_0, w)$ contains q1, which proves the (if) part of statement (2).
   (Only if) Assume that $\hat{\delta}(q_0, w)$ contains q1. Clearly we can transition to that state only from state q0 and only when the input is 0, so the last symbol a is 0 and w ends in 0.
Proof (Statement 3)
INDUCTION: Assume w = xa, where a is a symbol of the alphabet (i.e., 0 or 1). We assume that statements (1)-(3) are true for x, and we need to prove them for w = xa. Thus, assume |w| = n + 1 and |x| = n.
3. (If) Assume that w ends in 01, so a = 1 and x ends in 0. By statement (2) applied to x, $\hat{\delta}(q_0, x)$ contains q1. Since there is a transition from q1 to q2 on input 1, $\hat{\delta}(q_0, w)$ contains q2, which proves the (if) part of statement (3).
   (Only if) Assume that $\hat{\delta}(q_0, w)$ contains q2. Clearly we can transition to that state only from state q1 and only when the input is 1, so a = 1 and $\hat{\delta}(q_0, x)$ contains q1. By statement (2) applied to x, x ends in 0, and therefore w ends in 01.
An Application Example Text Search
Finding Strings in Text
- A common problem in the age of the WWW is the following:
  - Given a set of words, find all documents that contain one (or all) of those words.
  - A search engine (e.g., Google) is a popular example of this process.
- The search engine uses a particular technology called “inverted indexes”, where for each word appearing on the WWW a list of all the places where that word occurs is stored.
  - Note that there are over 100,000 different words.
- Inverted indexes do not use FSAs and are somewhat difficult to use (they require a large amount of disk space and a large amount of time to set up).
- There are a number of related applications that are unsuited for inverted indexes but are good applications for automaton-based techniques. Characteristics that make an application suitable for searches with FSAs are:
Finding Strings in Text
1. The repository on which the search is conducted is rapidly changing.
   a. Daily news articles.
   b. Current prices of items.
2. The documents to be searched cannot be cataloged.
   a. Amazon.com generates its pages dynamically. This information is stored in a database.
   b. Information from Amazon.com must be obtained by querying the system.
N-FSA for Text Search
Problem statement:
- Given a set of words (keywords), we would like to find occurrences of any of these words.
- Design an N-FSA which signals, by entering an accepting state, that it has seen one of the keywords.
- The text is fed in one character at a time.
N-FSA for Text Search
- The FSA can be used for recognizing (we also say accepting) strings in the following way. First, think of the input as being written on a long tape broken up into cells, with one symbol written in each cell of the tape, as in the figure below.
N-FSA for Text Search
1. There is a start state, q0, with a transition to itself on every input symbol.
2. For each keyword a_1 a_2 … a_k, there are k states, say q_1, q_2, …, q_k.
   - There is a transition:
     - from the start state q0 to state q_1 on input a_1,
     - from state q_1 to state q_2 on input a_2, etc.
   - State q_k is an accepting state and indicates that the keyword a_1 a_2 … a_k has been found.
Recognition Process
- The machine starts in the start state (q0) and iterates the following process:
  1. Check the next letter of the input. If it matches the symbol on an arc leaving the current state, then
     i. cross that arc,
     ii. move to the next state, and
     iii. advance one symbol in the input.
  2. If we are in the accepting state (qf) when we run out of input, the machine has successfully recognized an instance of an input.
- If the machine never gets to the final state,
  a. either because it runs out of input,
  b. or it gets some input that doesn’t match an arc,
  c. or it just happens to get stuck in some non-final state,
  we say the machine rejects or fails to accept an input.
Example
- Imagine that you have become a passionate fan of woodchucks. Desiring more information on this celebrated woodland creature, you turn to your favorite Web browser and type in woodchuck. Your browser returns a few sites. You have a flash of inspiration and type in woodchucks.
- Instead of having to do this search twice, you would rather have typed one search command specifying something like woodchuck with an optional final s.
- Or perhaps you might want to search for all the prices in some document; you might want to see all strings that look like $199 or $25 or $24.99.
Example
- Design an N-FSA to recognize occurrences of the words “web” and “ebay”. The design follows the “recipe” given in the previous slides.
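The recipe can be sketched in Python as follows: one chain of states per keyword hanging off a start state that loops on every symbol, simulated by tracking the set of current states. The function names `build_keyword_nfa` and `find_keywords`, the integer state numbering, and the choice to report the end position of each match are assumptions made for this illustration.

```python
def build_keyword_nfa(keywords):
    """Return (delta, accepting) for the keyword N-FSA; state 0 plays q0."""
    delta = {}        # (state, char) -> set of next states
    accepting = {}    # accepting state -> keyword it signals
    next_state = 1
    for word in keywords:
        prev = 0
        for ch in word:
            # One fresh state per character of the keyword.
            delta.setdefault((prev, ch), set()).add(next_state)
            prev = next_state
            next_state += 1
        accepting[prev] = word
    return delta, accepting


def find_keywords(text, keywords):
    """Return a list of (end_position, keyword) for each occurrence in text."""
    delta, accepting = build_keyword_nfa(keywords)
    hits = []
    current = {0}
    for pos, ch in enumerate(text):
        nxt = {0}     # q0 has a self-loop on every input symbol
        for state in current:
            nxt |= delta.get((state, ch), set())
        current = nxt
        for state in current:
            if state in accepting:
                hits.append((pos, accepting[state]))
    return hits


print(find_keywords("webay", ["web", "ebay"]))
```

On the text "webay" both keywords are found even though they overlap, because the simulation keeps the partial match for "ebay" alive while "web" is being completed.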
Homework
- Exercise 2.4.2
Implementation Approaches to N-FSA
1. Implement a program that simulates the N-FSA by computing the set of states it is in after reading each input symbol.
2. Convert the N-FSA to an equivalent D-FSA, then simulate the D-FSA directly.
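For the second approach, the slides do not spell out the conversion at this point. The sketch below uses the standard subset construction, in which each D-FSA state is a set of N-FSA states; treating this as the intended algorithm is an assumption, as are the names `nfa_to_dfa` and `run_dfa`. It is applied to the "ends in 01" N-FSA.

```python
def nfa_to_dfa(nfa_delta, start, accepting, alphabet):
    """Subset construction: each D-FSA state is a frozenset of N-FSA states."""
    dfa_start = frozenset({start})
    dfa_delta = {}
    seen = {dfa_start}
    frontier = [dfa_start]
    while frontier:
        subset = frontier.pop()
        for a in alphabet:
            # Union of the N-FSA moves from every state in the subset.
            target = frozenset(
                r for p in subset for r in nfa_delta.get((p, a), ())
            )
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    dfa_accepting = {s for s in seen if s & accepting}
    return dfa_delta, dfa_start, dfa_accepting


def run_dfa(dfa_delta, dfa_start, dfa_accepting, w):
    """Simulate the resulting D-FSA on w."""
    state = dfa_start
    for a in w:
        state = dfa_delta[(state, a)]
    return state in dfa_accepting


NFA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
dfa_delta, dfa_start, dfa_accepting = nfa_to_dfa(NFA, "q0", {"q2"}, "01")
print(len({s for (s, _a) in dfa_delta}))                     # 3
print(run_dfa(dfa_delta, dfa_start, dfa_accepting, "00101"))  # True
```

Only the subsets reachable from the start state are generated, so this example yields three D-FSA states rather than all eight subsets of {q0, q1, q2}.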
Direct Implementation Approaches to N-FSA
- There is a problem of (wrong) choice in a non-deterministic FSA. There are three standard solutions to the problem of non-determinism:
  - Backup: Whenever we come to a choice point, we could put a marker to mark where we were in the input and what state the automaton was in. Then if it turns out that we took the wrong choice, we could back up and try another path.
  - Look-ahead: We could look ahead in the input to help us decide which path to take.
  - Parallelism: Whenever we come to a choice point, we could look at every alternative path in parallel.
- We will focus here on the backup approach and defer discussion of the look-ahead and parallelism approaches to later chapters.
Back-up Approach for N-FSA Recognizer
- The backup approach suggests that we should make choices that might lead to dead ends, knowing that we can always return to unexplored alternative choices.
- There are two keys to this approach:
  1. We must know ALL alternatives for each choice point.
  2. We must store sufficient information about each alternative so that we can return to it when necessary.
Back-up Approach for N-FSA Recognizer
- When a backup algorithm reaches a point in its processing where no progress can be made, because it
  - runs out of input, or
  - has no legal transitions,
  it returns to a previous choice point, selects one of the unexplored alternatives, and continues from there.
- To apply this notion to the current definition of FSA, we need only store two things for each choice point:
  - the state (or node), and
  - the corresponding position on the tape.
FSA With ε-Transitions
- In order to implement the search algorithm we need to introduce another extension to FSA.
- This new feature is to allow a transition on ε, the empty string.
- This transition does not expand the class of languages that can be accepted by finite automata, but it does give us some added “programming convenience”.
Uses of ε-Transitions
- We shall begin with an informal treatment of ε-N-FSAs, using transition diagrams with ε allowed as a label.
- In the examples to follow, think of the automaton as accepting those sequences of labels along paths from the start state to an accepting state. However, each ε-transition is “invisible”; i.e., it contributes nothing to the string along the path.
Uses of ε-Transitions
- The N-FSA below has ε-transitions. This N-FSA accepts decimal numbers consisting of:
  1. An optional + or - sign,
  2. A string of digits,
  3. A decimal point “.”, and
  4. Another string of digits. Either this string of digits or the string (2) can be empty, but at least one of the two strings of digits must be non-empty.
Example of Keyword Spotter
- N-FSA for recognizing the keywords “web” and “ebay”, implemented with ε-transitions.
Another N-FSA for “sheep” language
- An ε-transition defines an arc that causes a transition without an input symbol. Thus, when in state q3, a transition to state q2 is allowed without looking at the input symbol or advancing the input pointer.
- This example is another kind of non-deterministic behavior: we might not know whether to follow the ε-transition or the ! arc.
The Formal Notation for ε-NFSA
- ε-NFSAs are represented exactly the same as N-FSAs, with one augmentation that includes the information about transitions on ε.
- An ε-NFSA A is represented by: A = (Q, Σ, δ, q0, F)
- The only difference is reflected in δ, which is now a function that takes as arguments:
  1. A state in Q, and
  2. A member of Σ ∪ {ε}: either an input symbol of the alphabet or the symbol ε.
- Note that ε, the symbol for the empty string, cannot also be a member of the alphabet Σ; this avoids possible confusion.
Example
- The decimal-number ε-NFSA is represented formally as: E = ({q0, q1, …, q5}, {., +, -, 0, 1, …, 9}, δ, q0, {q5})
- The transition function δ is defined by the transition table:

  State   ε       +, -    .       0, 1, …, 9
  q0      {q1}    {q1}    ∅       ∅
  q1      ∅       ∅       {q2}    {q1, q4}
  q2      ∅       ∅       ∅       {q3}
  q3      {q5}    ∅       ∅       {q3}
  q4      ∅       ∅       {q3}    ∅
  q5      ∅       ∅       ∅       ∅
Epsilon-Closures of ε-NFSA
- Formal definitions of an extended transition function for ε-NFSAs will be introduced next. They lead to the definition of acceptance of strings and languages by these automata.
- In order to do this we need a central definition, the ε-closure of a state. A state q is ε-closed by following all transitions out of q that are labeled ε. Those ε-transitions may lead to other states that also have ε-transitions, which must be followed in turn, and so on, eventually finding every state that can be reached from q along any path whose arcs are all labeled ε.
- Formally, we define the ε-closure ECLOSE(q) recursively, as follows:
Epsilon-Closures of ε-NFSA
BASIS:
- State q is in ECLOSE(q).
INDUCTION:
- If state p is in ECLOSE(q), and there is a transition from state p to state r labeled ε, then r is in ECLOSE(q).
- Equivalently, if δ is the transition function of the ε-NFSA involved and p is in ECLOSE(q), then ECLOSE(q) also contains all the states in δ(p, ε).
Example
- Each state in the above ε-NFSA is its own ε-closure, with two exceptions:
  - ECLOSE(q0) = {q0, q1}
  - ECLOSE(q3) = {q3, q5}
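The recursive definition of ECLOSE can be sketched in Python as a simple worklist loop. The dictionary encoding of the decimal-number transition table, the use of the empty string to stand for ε, and the names `EPS`, `DELTA` and `eclose` are assumptions made for this illustration.

```python
EPS = ""  # the empty string stands in for the ε label (an encoding choice made here)

# δ for the decimal-number ε-NFSA; a missing entry means the empty set.
DELTA = {
    ("q0", EPS): {"q1"}, ("q0", "+"): {"q1"}, ("q0", "-"): {"q1"},
    ("q1", "."): {"q2"},
    ("q3", EPS): {"q5"},
    ("q4", "."): {"q3"},
}
for d in "0123456789":
    DELTA[("q1", d)] = {"q1", "q4"}
    DELTA[("q2", d)] = {"q3"}
    DELTA[("q3", d)] = {"q3"}


def eclose(q):
    """Every state reachable from q along arcs that are all labeled ε."""
    closure = {q}                                  # BASIS: q is in ECLOSE(q)
    stack = [q]
    while stack:
        p = stack.pop()
        for r in DELTA.get((p, EPS), set()):       # INDUCTION: follow δ(p, ε)
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


print(sorted(eclose("q0")))
print(sorted(eclose("q3")))
```

The `closure` set doubles as a visited set, so the loop terminates even if an automaton contains a cycle of ε-transitions.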
Example
- ECLOSE(1) = {1, 4, 2, 3, 6}
Extended Transitions and Languages for ε-NFSA
- The ε-closure explains the transitions of an ε-NFSA for a given sequence of (non-ε) inputs. This helps define what it means for an ε-NFSA to accept an input.
- Suppose that E = (Q, Σ, δ, q0, F) is an ε-NFSA.
- $\hat{\delta}$ is the extended transition function: $\hat{\delta}(q, w)$ is the set of states that can be reached from q along a path whose labels, when concatenated, form the string w.
  - Note that ε links do not contribute to w.
Recursive Definition of $\hat{\delta}$
BASIS:
- $\hat{\delta}(q, \varepsilon) = \mathrm{ECLOSE}(q)$
- If the label of the path is ε, then we can follow only ε-labeled arcs extending from state q. Note that this is exactly what ECLOSE(q) does.
Recursive Definition of $\hat{\delta}$
INDUCTION:
- Suppose the string w is of the form xa, where a is the last symbol of w. Since a is a member of Σ, it cannot be ε.
- $\hat{\delta}(q, w)$ is computed as follows:
  1. Let $\{p_1, p_2, \ldots, p_k\}$ be $\hat{\delta}(q, x)$. The $p_i$'s are all and only the states that we can reach from q following a path labeled x. This path may end with one or more transitions labeled ε, and may have other ε-labeled transitions as well.
  2. Let $\bigcup_{i=1}^{k} \delta(p_i, a)$ be the set $\{r_1, r_2, \ldots, r_m\}$: all the states reached by following a transition labeled a from the states that can be reached from q along paths labeled x.
  3. Then $\hat{\delta}(q, w) = \bigcup_{j=1}^{m} \mathrm{ECLOSE}(r_j)$. This additional closure step includes all the paths from q labeled w, by considering the possibility that there are additional ε-labeled arcs that we can follow after making a transition on the final symbol a of w.
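Example
- Compute $\hat{\delta}(q_0, 5.6)$ (i.e., w = “5.6”) for the decimal-number ε-NFSA.
- $\hat{\delta}(q_0, \varepsilon) = \mathrm{ECLOSE}(q_0) = \{q_0, q_1\}$
- Compute $\hat{\delta}(q_0, 5)$:
  1. $\delta(q_0, 5) \cup \delta(q_1, 5) = \emptyset \cup \{q_1, q_4\} = \{q_1, q_4\}$
  2. $\hat{\delta}(q_0, 5) = \mathrm{ECLOSE}(q_1) \cup \mathrm{ECLOSE}(q_4) = \{q_1, q_4\}$
- Compute $\hat{\delta}(q_0, 5.)$:
  1. $\delta(q_1, .) \cup \delta(q_4, .) = \{q_2\} \cup \{q_3\} = \{q_2, q_3\}$
  2. $\hat{\delta}(q_0, 5.) = \mathrm{ECLOSE}(q_2) \cup \mathrm{ECLOSE}(q_3) = \{q_2\} \cup \{q_3, q_5\} = \{q_2, q_3, q_5\}$
- Compute $\hat{\delta}(q_0, 5.6)$:
  1. $\delta(q_2, 6) \cup \delta(q_3, 6) \cup \delta(q_5, 6) = \{q_3\} \cup \{q_3\} \cup \emptyset = \{q_3\}$
  2. $\hat{\delta}(q_0, 5.6) = \mathrm{ECLOSE}(q_3) = \{q_3, q_5\}$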
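The recursive definition of the extended transition function translates almost line for line into code. The sketch below evaluates it on the decimal-number ε-NFSA for w = "5.6"; the dictionary encoding, the empty string standing for ε, and the names `DELTA`, `eclose`, `delta_hat` and `accepts` are assumptions made for illustration.

```python
EPS = ""  # the empty string stands in for ε (an encoding choice made here)

# δ for the decimal-number ε-NFSA; a missing entry means the empty set.
DELTA = {
    ("q0", EPS): {"q1"}, ("q0", "+"): {"q1"}, ("q0", "-"): {"q1"},
    ("q1", "."): {"q2"},
    ("q3", EPS): {"q5"},
    ("q4", "."): {"q3"},
}
for d in "0123456789":
    DELTA[("q1", d)] = {"q1", "q4"}
    DELTA[("q2", d)] = {"q3"}
    DELTA[("q3", d)] = {"q3"}
ACCEPTING = {"q5"}


def eclose(q):
    """States reachable from q using ε-arcs only."""
    closure, stack = {q}, [q]
    while stack:
        for r in DELTA.get((stack.pop(), EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def delta_hat(q, w):
    """Extended transition function: states reachable from q on the string w."""
    if w == "":                        # BASIS: the path label is ε
        return eclose(q)
    x, a = w[:-1], w[-1]               # INDUCTION: w = xa
    reached = set()
    for p in delta_hat(q, x):          # step 1: the states p_i
        reached |= DELTA.get((p, a), set())   # step 2: union of δ(p_i, a)
    result = set()
    for r in reached:                  # step 3: close each r_j under ε
        result |= eclose(r)
    return result


def accepts(w):
    """w is accepted if some state reachable on w is an accepting state."""
    return bool(delta_hat("q0", w) & ACCEPTING)


print(sorted(delta_hat("q0", "5.6")))
```

The recursion recomputes the prefix at every level, which is fine for a short demonstration; an iterative left-to-right loop over the symbols would compute the same sets without recursion.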
Deterministic & Non-Deterministic FSAs
[Figures: Deterministic FSA; Non-Deterministic FSA]
Search State
- The combination of the node and the position on the tape specifies the search state of the recognition algorithm.
- To avoid confusion, the state of the automaton is called a node or machine-state.
- Two changes are necessary in the transition table:
  1. To represent nodes that have ε-transitions, we need to add an ε column.
  2. To accommodate multiple transitions to different nodes on the same input symbol, each cell entry consists of a list of destination nodes rather than a single node.
The Transition Table of ε-NFSA
(→ marks the start state, * marks the accepting state.)

  State   b       a           !       ε
  →q0     {q1}    Ø           Ø       Ø
  q1      Ø       {q2}        Ø       Ø
  q2      Ø       {q2, q3}    Ø       Ø
  q3      Ø       Ø           {q4}    {q2}
  *q4     Ø       Ø           Ø       Ø
ε-NFSA Recognition Algorithm

    function ND-RECOGNIZE(tape, machine) returns accept or reject
      agenda ← {(Initial state of machine, beginning of tape)}
      current-search-state ← NEXT(agenda)
      loop
        if ACCEPT-STATE?(current-search-state) returns true then
          return accept
        else
          agenda ← agenda ∪ GENERATE-NEW-STATES(current-search-state)
        if agenda is empty then
          return reject
        else
          current-search-state ← NEXT(agenda)
      end

ε-NFSA Recognition Algorithm

    function GENERATE-NEW-STATES(current-state) returns a set of search-states
      current-node ← the node the current search-state is in
      index ← the point on the tape the current search-state is looking at
      return a list of search states from transition table as follows:
        (transition-table[current-node, ε], index)
        ∪
        (transition-table[current-node, tape[index]], index + 1)

    function ACCEPT-STATE?(search-state) returns true or false
      current-node ← the node search-state is in
      index ← the point on the tape search-state is looking at
      if index is at the end of the tape and current-node is an accept state of machine then
        return true
      else
        return false
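A possible Python rendering of ND-RECOGNIZE is sketched below, using the sheep-language transition table with an ε column. Several details are assumptions made for this example rather than part of the pseudocode: the empty string stands for ε, the agenda is a Python list used as a stack (so NEXT is depth-first), and reading past the end of the tape simply yields no successors.

```python
EPS = ""  # the empty string stands in for the ε column (an encoding choice made here)

# Transition table: (node, symbol) -> list of destination nodes.
TABLE = {
    ("q0", "b"): ["q1"],
    ("q1", "a"): ["q2"],
    ("q2", "a"): ["q2", "q3"],
    ("q3", "!"): ["q4"],
    ("q3", EPS): ["q2"],
}
START, ACCEPTING = "q0", {"q4"}


def generate_new_states(search_state, tape):
    """Successor search states: ε-moves keep the index, symbol moves advance it."""
    node, index = search_state
    new_states = [(n, index) for n in TABLE.get((node, EPS), [])]
    if index < len(tape):
        new_states += [(n, index + 1) for n in TABLE.get((node, tape[index]), [])]
    return new_states


def accept_state(search_state, tape):
    """True if the tape is exhausted and the node is an accept state."""
    node, index = search_state
    return index == len(tape) and node in ACCEPTING


def nd_recognize(tape):
    agenda = [(START, 0)]                 # search state = (node, tape position)
    while agenda:
        current = agenda.pop()            # NEXT: most recently added state first
        if accept_state(current, tape):
            return True                   # accept
        agenda.extend(generate_new_states(current, tape))
    return False                          # agenda empty: reject


print(nd_recognize("baaa!"))
print(nd_recognize("ba!"))
```

Each entry on the agenda is exactly the pair the backup approach says must be stored: the node and the corresponding position on the tape.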
Possible execution of ND-RECOGNIZE
[Figure: trace of a possible execution of ND-RECOGNIZE]
Recognition as Search
- ND-RECOGNIZE accomplishes the task of recognizing strings in a regular language by providing a way to systematically explore all the possible paths through a machine.
- Solutions of this kind are known as state-space search algorithms.
- The key to the effectiveness of such programs is often the order in which the states in the space are considered. A poor ordering of states may lead to the examination of a large number of unfruitful states before a successful solution is discovered.
  - Unfortunately, it is typically not possible to tell a good choice from a bad one, and often the best we can do is to ensure that each possible solution is eventually considered.
  - Note that the ordering of states is left unspecified in ND-RECOGNIZE (the NEXT function).
  - Thus the implementation of the NEXT function is critical to the performance of the algorithm.
Depth-First Search
- Depth-First Search or Last-In-First-Out (LIFO) strategy.
- NEXT returns the state at the front of the agenda, that is, the most recently added state.
- Pitfall: under certain circumstances, a depth-first search can enter an infinite loop.
Breadth-First Search
- Breadth-First Search or First-In-First-Out (FIFO) strategy.
  - All possible choices are explored at once.
- Pitfalls: As with depth-first search, if the state space is infinite, the search may never terminate. More importantly, due to growth in the size of the agenda, if the state space is even moderately large, the search may require an impractically large amount of memory.
  - For larger problems, more complex search techniques such as dynamic programming or A* must be used.
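The two strategies differ only in how NEXT takes a search state off the agenda, which the sketch below makes explicit with `collections.deque`. The tiny ε-NFSA (with an ε-cycle between two nodes) is hypothetical, and the `visited` set is an addition not present in the pseudocode: it is one common way to avoid the infinite-loop pitfall, shown here as an assumption rather than as the slides' method.

```python
from collections import deque

EPS = ""  # the empty string stands in for ε (an encoding choice made here)

# Hypothetical ε-NFSA with an ε-cycle between "s" and "t"; it accepts exactly "a".
TABLE = {
    ("s", EPS): ["t"],
    ("t", EPS): ["s"],
    ("t", "a"): ["f"],
}
START, ACCEPTING = "s", {"f"}


def successors(state, tape):
    node, index = state
    out = [(n, index) for n in TABLE.get((node, EPS), [])]
    if index < len(tape):
        out += [(n, index + 1) for n in TABLE.get((node, tape[index]), [])]
    return out


def recognize(tape, strategy="depth-first"):
    agenda = deque([(START, 0)])
    visited = {(START, 0)}               # search states already placed on the agenda
    while agenda:
        if strategy == "depth-first":
            current = agenda.pop()       # LIFO: agenda used as a stack
        else:
            current = agenda.popleft()   # FIFO: agenda used as a queue
        node, index = current
        if index == len(tape) and node in ACCEPTING:
            return True
        for nxt in successors(current, tape):
            if nxt not in visited:       # never revisit a search state
                visited.add(nxt)
                agenda.append(nxt)
    return False


print(recognize("a", "depth-first"), recognize("a", "breadth-first"))
print(recognize("b", "depth-first"))
```

Without the `visited` check, the input "b" would bounce between ("s", 0) and ("t", 0) forever under either strategy, because the ε-cycle keeps refilling the agenda.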
Possible execution of ND-RECOGNIZE
[Figure: trace of a possible execution of ND-RECOGNIZE]
Regular Languages and FSA
- The class of languages definable by regular expressions (studied next) is exactly the same as the class of languages characterizable by finite-state automata.
  - Those languages are called regular languages.
Definition of the language L(E) of an ε-NFSA
- ε-NFSA: E = (Q, Σ, δ, q0, F)
- $L(E) = \{\, w \mid \hat{\delta}(q_0, w) \cap F \neq \emptyset \,\}$
- The language of E is the set of strings w that take the start state to at least one accepting state.
Formal Definition of Regular Languages
- Σ: the alphabet, i.e., the set of symbols in a language.
- ε: the empty string.
- Ø: the empty set.
- The class of regular languages (or regular sets) over Σ is then formally defined as follows:
  1. Ø is a regular language.
  2. ∀a ∈ Σ ∪ ε, {a} is a regular language.
  3. If L1 and L2 are regular languages, then so are:
     a) L1 · L2 = {xy | x ∈ L1, y ∈ L2}, the concatenation of L1 and L2,
     b) L1 ∪ L2, the union or disjunction of L1 and L2,
     c) L1*, the * closure of L1.
- All and only the sets of languages which meet the above properties are regular languages.
Regular Languages and FSAs
- All regular expression operators can be implemented by the three operations which define regular languages:
  - Concatenation,
  - Disjunction/Union (also written “|”),
  - * closure.
- Example:
  - The counters (*, +, {n, m}) are just special cases of repetition plus * closure.
  - All the anchors can be thought of as individual special symbols.
  - The square brackets [] are a kind of disjunction: [ab] means “a or b”, the disjunction of a and b.
Regular Languages and FSAs
- Regular languages are also closed under the following operations:
  - Intersection: if L1 and L2 are regular languages, then so is L1 ∩ L2, the language consisting of the set of strings that are in both L1 and L2.
  - Difference: if L1 and L2 are regular languages, then so is L1 - L2, the language consisting of the set of strings that are in L1 but not in L2.
  - Complementation: if L1 is a regular language, then so is Σ* - L1, the set of all possible strings that are not in L1.
  - Reversal: if L1 is a regular language, then so is L1^R, the language consisting of the set of reversals of the strings that are in L1.
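Three of these closure properties can be demonstrated constructively on deterministic automata. The sketch below uses two hypothetical complete D-FSAs over {a, b}; the tuple representation and the names `EVEN_A`, `ENDS_B`, `complement`, `intersection` and `difference` are assumptions for illustration. Complementation swaps accepting and non-accepting states, intersection runs both machines in lockstep (the product construction), and difference combines the two. Reversal is not shown.

```python
from itertools import product

ALPHABET = "ab"

# Hypothetical complete D-FSAs, each given as (transitions, start state, accepting states).
EVEN_A = ({(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}, 0, {0})  # even number of a's
ENDS_B = ({(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}, 0, {1})  # last symbol is b


def states_of(dfa):
    delta, start, _ = dfa
    return {start} | {q for q, _ in delta} | set(delta.values())


def accepts(dfa, word):
    delta, state, accepting = dfa
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


def complement(dfa):
    """Σ* - L: same machine with accepting and non-accepting states swapped."""
    delta, start, accepting = dfa
    return delta, start, states_of(dfa) - accepting


def intersection(dfa1, dfa2):
    """L1 ∩ L2: a machine whose states are pairs, one state from each input machine."""
    d1, s1, f1 = dfa1
    d2, s2, f2 = dfa2
    delta = {
        ((p, q), a): (d1[(p, a)], d2[(q, a)])
        for p, q in product(states_of(dfa1), states_of(dfa2))
        for a in ALPHABET
    }
    return delta, (s1, s2), set(product(f1, f2))


def difference(dfa1, dfa2):
    """L1 - L2 = L1 ∩ (Σ* - L2)."""
    return intersection(dfa1, complement(dfa2))


both = intersection(EVEN_A, ENDS_B)
print(accepts(both, "aab"), accepts(both, "ab"))
```

The complement construction relies on the machine being deterministic and complete (one arc per state and symbol); applied directly to an N-FSA, swapping the accepting states would not in general give the complement language.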
Regular Expressions and FSA
- Regular expressions are equivalent to finite-state automata (Proof: Hopcroft and Ullman 1979).
- The proof is inductive. Each primitive operation of a regular expression (concatenation, union, closure) is shown as part of the inductive step of the proof:
Concatenation
- The two FSAs are placed next to each other by connecting all the final states of FSA1 to the initial state of FSA2 by an ε-transition.
Closure
- Repetition: all final states of the FSA are connected back to the initial state by ε-transitions.
- Zero occurrences case: a direct link from the initial state to the final state.
Union
- Add a single new initial state q0, and add new ε-transitions from it to the former initial states of the two machines to be joined.
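The three constructions can be sketched as functions that glue ε-NFSAs together. Everything named here (`symbol`, `concat`, `union`, `star`, `accepts`, the tuple representation, the empty string for ε) is an assumption for illustration. One deliberate deviation from the slide: for the zero-occurrences case, `star` adds a fresh initial state that is itself accepting instead of linking the old initial state directly to a final state, a common variant that stays correct when the old initial state has incoming arcs.

```python
from itertools import count

EPS = ""           # the empty string stands in for ε (an encoding choice made here)
_fresh = count()   # supplies unused state names


def symbol(a):
    """ε-NFSA for the one-symbol string a. An automaton is (arcs, start, finals)."""
    s, f = next(_fresh), next(_fresh)
    return {(s, a): {f}}, s, {f}


def _merge(*arc_tables):
    merged = {}
    for arcs in arc_tables:
        for key, targets in arcs.items():
            merged.setdefault(key, set()).update(targets)
    return merged


def concat(m1, m2):
    (a1, s1, f1), (a2, s2, f2) = m1, m2
    links = {(f, EPS): {s2} for f in f1}          # finals of m1 --ε--> start of m2
    return _merge(a1, a2, links), s1, f2


def union(m1, m2):
    (a1, s1, f1), (a2, s2, f2) = m1, m2
    s = next(_fresh)                              # single new initial state
    return _merge(a1, a2, {(s, EPS): {s1, s2}}), s, f1 | f2


def star(m):
    arcs, s0, finals = m
    s = next(_fresh)                              # new accepting start: zero occurrences
    loops = {(f, EPS): {s0} for f in finals}      # finals --ε--> old start: repetition
    return _merge(arcs, loops, {(s, EPS): {s0}}), s, finals | {s}


def accepts(m, word):
    arcs, start, finals = m

    def close(states):
        stack, seen = list(states), set(states)
        while stack:
            for r in arcs.get((stack.pop(), EPS), ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    current = close({start})
    for ch in word:
        current = close({r for q in current for r in arcs.get((q, ch), ())})
    return bool(current & finals)


# baa+! built from the pieces; each sub-machine is created fresh so state names never clash.
sheep = concat(symbol("b"), concat(symbol("a"),
        concat(symbol("a"), concat(star(symbol("a")), symbol("!")))))
print(accepts(sheep, "baaa!"), accepts(sheep, "ba!"))
```

Because every construction returns another ε-NFSA in the same representation, the functions compose freely, which mirrors the inductive structure of the proof.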
Summary
- This chapter introduced the fundamental concept of the finite automaton and the equivalent notion of a regular language. Here is a summary of the main points covered:
- Any regular language can be realized as a finite-state automaton (FSA). Thus, an automaton implicitly defines a formal language as the set of strings the automaton accepts.
- An automaton can use any set of symbols for its vocabulary, including letters, words, or even graphic images.
Summary
- The behavior of a deterministic automaton (D-FSA) is fully determined by the state it is in.
- A non-deterministic automaton (N-FSA) sometimes has to choose between multiple paths, given the same current state and next input.
- Any N-FSA can be converted to a D-FSA.
- The order in which an N-FSA chooses the next state to explore on the agenda defines its search strategy.
  - The depth-first search or LIFO strategy corresponds to the agenda-as-stack.
  - The breadth-first search or FIFO strategy corresponds to the agenda-as-queue.
  - More complex strategies, such as dynamic programming or A* search, are needed for larger problems.
- Any regular expression can be automatically compiled into an N-FSA, and hence into a D-FSA.
End Veton Këpuska