Digital State Machines Finite Automata & Regular Languages


Chapter Outline
  • Introduction
  • Finite-State Automata
  • Regular Languages and Finite-State Automata
  • Summary

04 December 2020, Veton Këpuska

Introduction: Finite State Automata
  • The finite-state automaton (FSA) is one of the most significant tools of computational linguistics. Its variations include:
      - finite-state transducers,
      - Hidden Markov Models, and
      - N-gram grammars,
    which are important components of speech recognition and synthesis, spell-checking, and information-extraction applications.
  • FSA theory was developed in the early days of computer science as a model of abstract computing machines, pioneered by the work of Alan Turing.
      - FSAs are devices that accept (recognize) or reject an input stream of characters.
      - FSAs are very efficient in terms of speed and memory.
  • The most frequent use of finite-state automata is searching for words or phrases. Additional application areas include:
      - morphological parsing,
      - part-of-speech annotation, and
      - speech processing and recognition.

Example of Finite State Automata
  • This FSA accepts (recognizes) or generates strings like:
      - ac
      - abbc
      - abbbc, abbbbbc, etc.

Introduction: D-FSA vs. ND-FSA
  • Adding non-determinism to FSAs does not allow us to define any language that cannot be defined by deterministic FSAs.
  • Why, then, bother with ND-FSAs?
      - Describing an application with ND-FSAs can be substantially more efficient.
      - ND-FSAs allow us to program solutions to problems in a higher-level language.
      - This program is then compiled, by an algorithm that we will learn in this chapter, into a deterministic FSA that can be executed on a conventional computer.

Finite State Automata: An Informal Description of Finite State Automata

Finite Automata
  • We study an extended example of a real-world problem whose solution uses finite automata.
  • We investigate protocols that support “electronic money”: files that
      - a customer can use to pay for goods on the internet, and
      - a seller can receive with assurance that the “money” is real.
  • The seller must know that the file has not been forged, and that it has not been copied and sent to the seller while the customer retains a copy of the same file to spend again.
  • Nonforgeability of the file must be guaranteed by a third party (a bank) and by a cryptography policy.
      - Encryption of the money files ensures that forgery is not a problem.
      - The bank must also keep a database of all the valid money that it has issued, so that it can verify to a store that the file it has received represents real money and can be credited to the store’s account.
      - Encryption is not addressed here, as it is beyond the scope of the topic covered in this class.

Finite Automata
  • Nevertheless, in order to use electronic money, protocols need to be devised that allow the money to be manipulated in the variety of ways that users want.
      - Monetary systems always invite fraud, so the protocol must verify whatever policy is adopted regarding how money is used.
      - The solution must ensure that the only things that can happen are things we intend to happen: an unscrupulous user must not be able to steal from others or to “manufacture” money.

The Ground Rules
  • The participants:
      - the customer,
      - the store, and
      - the bank.
  • Only one money file is in existence (for simplicity).
  • The customer may:
      - pay, which initiates transfer of “his” money file to the store, or
      - cancel the transfer, effectively asking the bank to place the money back in the customer’s account.
  • The store may:
      - ship goods to the customer, or
      - redeem the money, effectively asking the bank to transfer the money to the store’s account.
  • The bank may:
      - transfer the money by creating a new, suitably encrypted money file and sending it to the store.

The Protocol
  • The customer:
      - We assume that the customer cannot be relied upon to act responsibly.
      - The customer may try to copy the money file, use the same money file to pay several times, or both.
  • The bank:
      - We assume that the bank behaves responsibly, or it could not be a bank.
      - It must ensure that two stores cannot both redeem the same money file.
      - It will not allow a money file to be both canceled and redeemed.
  • The store:
      - It will not ship goods until it is sure it has been given valid money.

The Protocol
  • FSAs can represent protocols such as the one being discussed.
      - States represent each possible “state” (situation) that each participant could be in.
          - A state remembers which important events have happened.
          - It also knows which ones have not yet happened.
      - Transitions between states occur when one of the five events described previously occurs.

FSAs for Money Transfer Example
Bank:
  • The beginning state is state “1”:
      - The bank has issued a money file.
      - No requests have been made to either redeem it or cancel it.
  • Cancel request:
      - The bank restores the money and enters state 2.
      - The bank cannot leave state 2, since it cannot allow the same money to be canceled again or to be spent by the customer.
  • Redeem request:
      - The bank enters state 3, and
      - initiates the transfer and, upon completion, enters state 4.
      - In state 4 it will no longer accept cancel or redeem requests, nor will it perform any other transactions regarding this particular money file.

FSAs for Money Transfer Example
Store: procedures in the store are assumed to be imperfect.
  • The beginning state is state “a”.
  • Pay request:
      - The customer orders the goods by performing the pay action.
      - The store enters state “b” and initiates both the shipping and the redemption process.
  • Ship and redeem requests:
      - Ship and redeem may occur in either order, taking the store to state c or d.
      - The store then initiates redeem/transfer or ship and enters state e or f.
Customer:
  • Pay and cancel requests:
      - The customer can perform them any number of times and in any order.

Enabling Automata to Ignore Actions
  • Missing transitions: the store is not affected by a “cancel” action.
      - According to the formal definition of an FSA (next), whenever an input X is received by an automaton, the automaton must follow an arc labeled X from the state that it is in to a new state.
      - The store FSA must therefore be augmented with transitions that correspond to “cancel” actions.
  • Effects of unexpected actions:
      - Suppose the customer executes the “pay” action a second time while the store is in state e.
      - Since the store automaton has no arc for the pay action in that state, this would cause the FSA to “die”.
  • The two kinds of actions that must be ignored by the FSAs:
      1. Actions that are irrelevant to the participant involved.
          - For the store FSA: “cancel”.
          - For the bank FSA: “pay” and “ship”.
          - For the customer FSA: “ship”, “redeem”, and “transfer”.
      2. Actions that must not be allowed to kill an automaton.
          - For the store FSA: the customer’s second “pay” or “cancel” actions should not be allowed to kill its FSA.
          - For the bank FSA: the store’s multiple “redeem” actions should be ignored.
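The idea of augmenting an automaton with self-loops can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bank arcs (cancel 1→2, redeem 1→3, transfer 3→4) come from the bank slide, while the dictionary encoding and the helper names `add_self_loops` and `run` are hypothetical. The sketch ignores “pay” and “ship” in every bank state, and ignores a repeated “redeem” only in states 3 and 4.

```python
# Bank arcs as described on the bank slide: (state, action) -> next state.
BANK = {
    (1, "cancel"): 2,
    (1, "redeem"): 3,
    (3, "transfer"): 4,
}


def add_self_loops(delta, loops):
    """Return a copy of delta in which every (state, action) pair listed in
    loops that has no arc yet gets a self-loop back to the same state."""
    augmented = dict(delta)
    for state, action in loops:
        augmented.setdefault((state, action), state)
    return augmented


def run(delta, start, actions):
    """Follow arcs from start; return the final state, or None if the
    automaton 'dies' because some action has no arc."""
    state = start
    for action in actions:
        if (state, action) not in delta:
            return None  # missing arc: the automaton dies
        state = delta[(state, action)]
    return state


# Irrelevant actions for the bank ("pay", "ship") are ignored in every state.
loops = [(s, a) for s in (1, 2, 3, 4) for a in ("pay", "ship")]
# Assumption for this sketch: repeated "redeem" is ignored in states 3 and 4.
loops += [(3, "redeem"), (4, "redeem")]
BANK_COMPLETED = add_self_loops(BANK, loops)

print(run(BANK, 1, ["pay", "redeem"]))            # the original bank dies on "pay"
print(run(BANK_COMPLETED, 1, ["pay", "redeem"]))  # the completed bank ignores "pay"
```

Using `setdefault` means an existing arc is never overwritten, so the augmentation can only add behavior, not change it.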

Completed FSAs (figure)

Complete System as FSA
  • The previous models accounted for the actions of each participant independently.
      - The customer’s FSA is simple: no matter what actions are taken, it stays in the same state.
      - The bank’s and store’s FSAs are more complex, and it is not immediately obvious which combinations of states these two automata can be in.
  • Product automaton:
      - The normal way to explore the interaction of automata is to construct the product automaton.
      - The states of the product FSA are pairs of states, one from each original FSA: for example, (3, d) denotes the situation where the bank is in state 3 and the store is in state d.
      - Bank = 4 states, store = 7 states, product FSA = 4 × 7 = 28 states.

Product Automaton for the Store and Bank (figure)

Product Automaton
  • Each of the two components of the product automaton independently makes transitions on the various inputs.
  • If an input action is received and one of the two automata has no state to go to on that input, then the product automaton “dies”; it has no state to go to.
  • Formal rule:
      - Assume the (bank, store) product automaton is in state $(i, x)$, and let $Z$ be one of the input actions.
      - Check whether there is a transition from state $i$ on input $Z$; suppose it leads to state $j$.
      - Similarly, check whether there is a transition from state $x$ on the same input $Z$; suppose it leads to state $y$.
      - Then there is a transition from $(i, x)$ to $(j, y)$ on input $Z$. If either $j$ or $y$ does not exist, then there is no arc labeled $Z$ out of $(i, x)$.
  • Example: consider the input redeem. If the bank receives a redeem message when in state 1, it goes to state 3. If it is in state 3 or 4, it stays there. If it is in state 2, the bank automaton dies.
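The formal rule translates almost directly into code. The sketch below is a minimal illustration, not the slides' bank/store construction: the two component automata `LEFT` and `RIGHT` are hypothetical toy examples, and the function names `product_step` and `reachable` are assumptions. It also shows how accessible product states can be found by breadth-first search from the start pair.

```python
from collections import deque


def product_step(delta1, delta2, pair, action):
    """Formal rule: (i, x) --Z--> (j, y) iff i --Z--> j and x --Z--> y."""
    i, x = pair
    j = delta1.get((i, action))
    y = delta2.get((x, action))
    if j is None or y is None:
        return None  # one component has nowhere to go: the product dies
    return (j, y)


def reachable(delta1, delta2, start, actions):
    """Breadth-first search over the product states accessible from start."""
    seen, queue = {start}, deque([start])
    while queue:
        pair = queue.popleft()
        for action in actions:
            nxt = product_step(delta1, delta2, pair, action)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical toy components (NOT the bank and store from the slides).
LEFT = {(1, "go"): 2, (2, "go"): 2, (1, "stop"): 1}
RIGHT = {("a", "go"): "b", ("b", "stop"): "a", ("a", "stop"): "a"}

# Of the 2 x 2 = 4 possible pairs, only some are accessible from (1, "a").
print(sorted(reachable(LEFT, RIGHT, (1, "a"), ["go", "stop"])))
```

The same search, applied to the completed bank and store automata, is what identifies which of the 28 product states can actually occur.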

Using the Product Automaton to Validate the Protocol
  • Only 10 of the 28 states are accessible from the start state; the remaining states are not accessible.
  • The real purpose of analyzing a protocol such as this one with automata is to ask and answer questions of the form “Can the following type of error occur?”
  • Example: “Is it possible that the store can ship goods and never get paid?”
      - That is, can the store be in state c, e, or g without a transition on input T ever having been made?
      - Problem state: (2, c).

Deterministic Finite State Automaton: Formalism of a Deterministic Finite State Automaton

Deterministic Finite State Automaton
  • “Deterministic” refers to the fact that, on each input, there is one and only one state to which the automaton can transition from its current state.
  • A non-deterministic automaton can transition from its present state to more than one state on the same input.

Definition of D-FSA
  • A deterministic finite state automaton consists of:
      1. A finite set of states, $Q$.
      2. A finite set of input symbols, $\Sigma$.
      3. A transition function, $\delta$, that takes as arguments a state and an input symbol and returns a state: $\delta : Q \times \Sigma \to Q$.
      4. A start state, $q_0$, one of the states in $Q$.
      5. A set of final, or accepting, states $F$, with $F \subseteq Q$.
  • Five-tuple notation for a D-FSA named $A$: $A = (Q, \Sigma, \delta, q_0, F)$.

Formal Definition of Automaton
  • $Q = \{q_0, q_1, \ldots, q_{N-1}\}$: a finite set of $N$ states.
  • $\Sigma$: a finite input alphabet of symbols.
  • $q_0$: the start state.
  • $F$: the set of final states, $F \subseteq Q$.
  • $\delta(q, i)$: the transition function or transition matrix between states. Given a state $q \in Q$ and an input symbol $i \in \Sigma$, $\delta(q, i)$ returns a new state $q' \in Q$. $\delta$ is thus a relation from $Q \times \Sigma$ to $Q$.

String Processing with D-FSA
  • Suppose $a_1 a_2 \ldots a_n$ is a sequence of input symbols.
  • The D-FSA begins in its start state $q_0$; then
      1. $q_1 = \delta(q_0, a_1)$
      2. $q_2 = \delta(q_1, a_2)$
      …
      i. $q_i = \delta(q_{i-1}, a_i)$
      …
      n. $q_n = \delta(q_{n-1}, a_n)$
  • If $q_n \in F$, then the input sequence $a_1 a_2 \ldots a_n$ is “accepted”; otherwise it is “rejected”.

D-FSA Example
  • Using an FSA to recognize sheeptalk: “baa…!” (figure)

FSA Use
  • The FSA can be used for recognizing (we also say accepting) strings in the following way. First, think of the input as being written on a long tape broken up into cells, with one symbol written in each cell of the tape, as in the accompanying figure.

Recognition Process
  • The machine starts in the start state ($q_0$) and iterates the following process:
      1. Check the next letter of the input. If it matches the symbol on an arc leaving the current state, then
          - cross that arc,
          - move to the next state, and
          - advance one symbol in the input.
      2. If we are in the accepting state ($q_4$) when we run out of input, the machine has successfully recognized an instance of sheeptalk.
  • If the machine never gets to the final state, either because
      a. it runs out of input,
      b. it gets some input that does not match an arc, or
      c. it just happens to get stuck in some non-final state,
    we say the machine rejects, or fails to accept, the input.

FSA for the Sheeptalk Example
  • $Q = \{q_0, q_1, q_2, q_3, q_4\}$,
  • $\Sigma = \{a, b, !\}$ (the sheep language alphabet),
  • $F = \{q_4\}$, and
  • $\delta(q, i)$, defined by the transition table on the next slide.

State Transition Table

             Input
  State     b    a    !
  → 0       1    Ø    Ø
    1       Ø    2    Ø
    2       Ø    3    Ø
    3       Ø    3    4
   *4       Ø    Ø    Ø

We’ve marked state 4 with a * to indicate that it’s a final/accepting state (you can have as many final states as you want), and the Ø indicates an illegal or missing transition. We can read the first row as “if we’re in state 0 and we see the input b we must go to state 1. If we’re in state 0 and we see the input a or !, we fail”.

Deterministic Algorithm for Recognizing a String

  function D-RECOGNIZE(tape, machine) returns accept or reject
    index ← Beginning of tape
    current-state ← Initial state of machine
    loop
      if End of input has been reached then
        if current-state is an accept state then
          return accept
        else
          return reject
      elsif transition-table[current-state, tape[index]] is empty then
        return reject
      else
        current-state ← transition-table[current-state, tape[index]]
        index ← index + 1
    end
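As a rough illustration, D-RECOGNIZE can be transcribed into Python. This is a sketch under stated assumptions: the dictionary encoding of the sheeptalk transition table (a missing key plays the role of Ø) and the name `d_recognize` are choices made here, not part of the slides.

```python
# Sheeptalk transition table: (state, symbol) -> next state.
# A missing key corresponds to an empty (Ø) table entry.
TRANSITION_TABLE = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,
    (3, "!"): 4,
}
ACCEPT_STATES = frozenset({4})


def d_recognize(tape, table=TRANSITION_TABLE, start=0, accept=ACCEPT_STATES):
    """Return True (accept) or False (reject), following D-RECOGNIZE."""
    index = 0                 # beginning of tape
    current_state = start     # initial state of machine
    while True:
        if index == len(tape):                 # end of input reached
            return current_state in accept
        key = (current_state, tape[index])
        if key not in table:                   # empty table entry
            return False
        current_state = table[key]
        index += 1


for s in ["baa!", "baaa!", "ba!", "abc", "baaa"]:
    print(s, d_recognize(s))
```

The loop mirrors the pseudocode branch for branch; a `for` loop over the tape would be more idiomatic Python but would hide the correspondence.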

Tracing Execution for Some Sheep Talk
Before examining the beginning of the tape, the machine is in state $q_0$. Finding a b on the input tape, it changes to state $q_1$, as indicated by the contents of transition-table[$q_0$, b]. It then finds an a and switches to state $q_2$; another a puts it in state $q_3$; a third a leaves it in state $q_3$, where it reads the “!” and switches to state $q_4$. Since there is no more input, the End of input condition at the beginning of the loop is satisfied for the first time and the machine halts in $q_4$. State $q_4$ is an accepting state, and so the machine has accepted the string baaa! as a sentence in the sheep language.

Fail State
  • The algorithm fails whenever there is no legal transition for a given combination of state and input. The input abc will fail to be recognized, since there is no legal transition out of state $q_0$ on the input a (i.e., this entry of the transition table is Ø).
  • Even if the automaton had allowed an initial a, it would certainly have failed on c, since c isn’t even in the sheeptalk alphabet! We can think of these “empty” elements in the table as if they all pointed at one “empty” state, which we might call the fail state or sink state.
  • In a sense, then, we could view any machine with empty transitions as if we had augmented it with a fail state and drawn in all the extra arcs, so that we always have somewhere to go from any state on any possible input. For completeness, the next figure shows the sheeptalk FSA with the fail state $q_F$ filled in.
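Filling in the fail state can be done mechanically. The following sketch assumes the sheeptalk table is encoded as a Python dictionary, as an illustration only; the helper name `add_fail_state` and the label `"qF"` are hypothetical. Every empty entry is redirected to the fail state, and the fail state loops to itself on every symbol.

```python
def add_fail_state(table, states, alphabet, fail="qF"):
    """Return a total transition table: every (state, symbol) pair that has
    no entry is sent to the fail state, which loops to itself forever."""
    completed = dict(table)
    for state in list(states) + [fail]:
        for symbol in alphabet:
            completed.setdefault((state, symbol), fail)
    return completed


# Sheeptalk transitions; missing keys are the empty (Ø) entries.
SHEEP = {(0, "b"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 3, (3, "!"): 4}
TOTAL = add_fail_state(SHEEP, states=range(5), alphabet="ab!")

print(TOTAL[(0, "a")])     # an empty entry now points at the fail state
print(TOTAL[("qF", "b")])  # the fail state is a sink
print(len(TOTAL))          # 6 states x 3 symbols
```

Note that the completed table is total only over the alphabet {a, b, !}; a symbol such as c, which lies outside the alphabet, still has no entry.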

Adding a Fail State to FSA (figure)

Example
  • Suppose we have a D-FSA that accepts all and only the strings of 0’s and 1’s that have the sequence 01 somewhere in the string. We can write this language $L$ as follows:
      $\{w \mid w \text{ is of the form } x01y \text{ for some strings } x \text{ and } y \text{ consisting of 0's and 1's}\}$
  • An equivalent description is:
      $\{x01y \mid x \text{ and } y \text{ are any strings of 0's and 1's}\}$
  • Example strings in this language $L$ include 01, 110110, and 100011.
  • Example strings not in this language $L$ are ∊, 0, and 111000.

Example
  • What can be said about the D-FSA $A$ that accepts this language $L$?
      - Its input alphabet is $\Sigma = \{0, 1\}$.
      - It has some set of states (as yet unknown), one of which, say $q_0$, is the start state.
      - It has to remember the important facts about the inputs it has seen so far. This is necessary to decide whether 01 is a substring of the input.
  • $A$ needs to remember:
      1. Has it already seen 01? If yes, then it will be in an accepting state from now on.
      2. Has it never seen 01, but its most recent input was 0, so that if it now sees a 1 it will have seen 01 and can accept everything it sees from here on?
      3. Has it never seen 01, but its last input was either nonexistent (it just started) or a 1? In this case $A$ cannot accept until it first sees a 0 and then sees a 1 immediately after.

Example
  • Each condition presented on the previous slide can be represented by a state.
      - Condition (3) is represented by the start (first) state $q_0$. On input 1 the automaton stays in $q_0$.
      - If we are in state $q_0$ and the next input is 0, we are then governed by condition (2), represented by state $q_2$. On input 0 the automaton stays in $q_2$.

Example
  • If we are in the state for condition (2), $q_2$, and we receive input 1, the FSA should move to the accepting state, which we choose to name $q_1$.
  • Finally, in the accepting state $q_1$, any combination of 0’s and 1’s leaves the state unchanged.
  • Thus $Q = \{q_0, q_1, q_2\}$ and $F = \{q_1\}$, and
      $A = (\{q_0, q_1, q_2\}, \{0, 1\}, \delta, q_0, \{q_1\})$
  • Transition diagram (figure): $q_0$ loops on 1 and goes to $q_2$ on 0; $q_2$ loops on 0 and goes to $q_1$ on 1; $q_1$ loops on 0 and 1.

Simpler Notations for D-FSA
  • A five-tuple with a detailed description of the transitions is both tedious and hard to read. There are two preferred notations:
      1. A transition diagram, which is a graph such as the ones we have seen previously.
      2. A transition table, which is a tabular listing of the $\delta$ function and which, by implication, provides the set of states and the input alphabet.

Transition Diagrams
  • A transition diagram for an FSA $A = (Q, \Sigma, \delta, q_0, F)$ is a graph defined as follows:
      1. For each state in $Q$ there is a node.
      2. For each state $q$ in $Q$ and each input symbol $a$ in $\Sigma$, let $\delta(q, a) = p$. The transition diagram has an arc from node $q$ to node $p$, labeled $a$. If there are several input symbols that cause transitions from $q$ to $p$, then the transition diagram can have one arc, labeled by the list of these symbols.
      3. There is an arrow into the start state $q_0$, labeled Start.
      4. Nodes corresponding to accepting states (the set $F$) are marked with a double circle.

Example (figure)
  $A = (Q, \Sigma, \delta, q_0, F) = (\{q_0, q_1, q_2\}, \{0, 1\}, \delta, q_0, \{q_1\})$

Transition Tables
  • A transition table is a conventional, tabular representation of a function like $\delta$ that takes two arguments and returns a value.
      - Rows correspond to states.
      - Columns correspond to inputs.

             Input
  State      0     1
  → q0       q2    q0
   *q1       q1    q1
    q2       q2    q1

Transition table for the D-FSA of the previous example.
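A transition table maps naturally onto a dictionary keyed by (state, input) pairs. The sketch below encodes the table for the “contains 01” D-FSA and compares it against a direct substring test; the encoding and the function name `accepts` are illustrative assumptions.

```python
from itertools import product

# Transition table of the "contains 01" D-FSA: (state, input) -> state.
DELTA = {
    ("q0", "0"): "q2", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q1",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}


def accepts(w, delta=DELTA, start="q0", final=frozenset({"q1"})):
    """Run the D-FSA on w and report whether it ends in an accepting state."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in final


# Sanity check (not a proof): agree with a substring test on all short strings.
agrees = all(
    accepts("".join(bits)) == ("01" in "".join(bits))
    for n in range(9)
    for bits in product("01", repeat=n)
)
print(agrees)
```

Because the table is total (every state has an entry for both 0 and 1), no fail state is needed for inputs over {0, 1}.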

Extending the Transition Function to Strings
  • A D-FSA defines a language:
      - the set of all strings that result in a sequence of state transitions from the start state to an accepting state, or alternatively,
      - in terms of the transition diagram, the set of labels along all the paths that lead from the start state to any accepting state.
  • To formulate precisely the notion of the language of a D-FSA:
      - Define the extended transition function $\hat{\delta}$.
      - It describes what happens when we start in any state and follow any sequence of inputs.

Definition of Extended Transition Function
BASIS:
  • $\hat{\delta}(q, \varepsilon) = q$. If we are in state $q$ and read no inputs, then we are still in state $q$.
INDUCTION:
  • Suppose $w$ is a string of the form $xa$, where $a$ is the last symbol of $w$ and $x$ is the rest. Then $\hat{\delta}(q, w) = \delta(\hat{\delta}(q, x), a)$.
      - Example: for $w = 1101$, $x = 110$ and $a = 1$.

Example
  • Design a D-FSA to accept the language $L = \{w \mid w \text{ has both an even number of 0's and an even number of 1's}\}$.
  • Solution:
      - Use the states to count how many 0’s and 1’s have been seen. Since only evenness matters, we count modulo 2: two possibilities for each symbol of the alphabet, for a total of 4 states.
      - $\Sigma = \{0, 1\}$
      - $Q = \{q_0, q_1, q_2, q_3\}$
      - $q_0$: the numbers of 0’s and 1’s seen so far are both even. This is the accepting state; $F = \{q_0\}$.
      - $q_1$: the number of 0’s seen so far is even and the number of 1’s is odd.
      - $q_2$: the number of 0’s seen so far is odd and the number of 1’s is even.
      - $q_3$: the numbers of 0’s and 1’s seen so far are both odd.

Transition Diagram of D-FSA (figure)

Transition Table

             Input
  State      0     1
 *→ q0       q2    q1
    q1       q3    q0
    q2       q0    q3
    q3       q1    q2

Test
  • The check involves computing $\hat{\delta}(q_0, w)$ for an input, say $w = 110101$, building up from the prefix ∊.
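The check can be carried out mechanically. The sketch below implements the extended transition function by direct recursion on the definition (BASIS and INDUCTION) for the even-0’s-and-1’s D-FSA; the dictionary `PARITY` follows from the state meanings given above (reading a 0 flips the parity of 0’s, reading a 1 flips the parity of 1’s), and the name `delta_hat` is an assumption of this sketch.

```python
# Transitions of the parity D-FSA: reading 0 flips the 0-parity,
# reading 1 flips the 1-parity.
PARITY = {
    ("q0", "0"): "q2", ("q0", "1"): "q1",
    ("q1", "0"): "q3", ("q1", "1"): "q0",
    ("q2", "0"): "q0", ("q2", "1"): "q3",
    ("q3", "0"): "q1", ("q3", "1"): "q2",
}


def delta_hat(q, w, delta=PARITY):
    """Extended transition function, defined by recursion on w."""
    if w == "":                 # BASIS: no input read, stay in q
        return q
    x, a = w[:-1], w[-1]        # INDUCTION: w = xa
    return delta[(delta_hat(q, x, delta), a)]


w = "110101"
# State reached after each prefix of w, starting from the empty prefix.
trace = [delta_hat("q0", w[:k]) for k in range(len(w) + 1)]
print(trace)
print(delta_hat("q0", w) == "q0")  # q0 is the only accepting state
```

The recursion recomputes shorter prefixes repeatedly, which is fine for a check on a six-symbol string; an iterative loop would be the practical choice for long inputs.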

Formal Languages
  • Key Concept #1. Formal Language:
      - A model which can both generate and recognize all and only the strings of a formal language acts as a definition of the formal language.
      - A formal language is a set of strings, each string composed of symbols from a finite symbol set called an alphabet (the same alphabet used above for defining an automaton!).
  • The alphabet for the “sheep” language is the set $\Sigma = \{a, b, !\}$.
  • Given a model $m$ (such as an FSA), we can use $L(m)$ to mean “the formal language characterized by $m$”.
  • $L(m) = \{baa!, baaa!, baaaa!, baaaaa!, \ldots\}$

The Formal Language Defined by D-FSA
  • The language defined by a D-FSA $A = (Q, \Sigma, \delta, q_0, F)$, denoted $L(A)$, is defined as:
      $L(A) = \{w \mid \hat{\delta}(q_0, w) \in F\}$
  • That is, the language of $A$ is the set of strings $w$ that take the start state $q_0$ to one of the accepting states of the D-FSA. If $L$ is $L(A)$ for some D-FSA $A$, then we say $L$ is a regular language.

Homework:
  • 2.2.1, 2.2.2, 2.2.3, 2.2.4, 2.2.5, 2.2.6, 2.2.7, 2.2.8, 2.2.9, 2.2.10

Nondeterministic Finite State Automaton

An Informal View of N-FSA
  • An N-FSA has
      - a finite set of states,
      - a finite set of input symbols,
      - one start state, and
      - one or more accepting states.
  • An N-FSA differs from a D-FSA in the type of its transition function $\delta$:
      - It takes a state and an input symbol as arguments and returns a set of zero, one, or more states.

Deterministic vs Non-Deterministic FSAs: Example (figure: a deterministic FSA and a non-deterministic FSA)

Deterministic vs Non-deterministic FSA
  • A deterministic FSA is one whose behavior during recognition is fully determined by the state it is in and the symbol it is looking at.
  • In the non-deterministic FSA on the previous slide, when the FSA is in state $q_2$ and the input symbol is a, we do not know whether to remain in state $q_2$ (the self-loop transition) or to move to state $q_3$. Clearly the decision depends on the next input symbols.

Non-Deterministic FSA (N-FSA)
  • The power of an N-FSA lies in its ability to be in more than one state at the same time.
      - This is useful in string matching, where the N-FSA can “guess” that it is seeing the beginning of a string.
      - An example of this is introduced later.
  • We define the nondeterministic FSA and show that it accepts a language that is also accepted by a D-FSA.
      - N-FSAs accept exactly the regular languages, just as D-FSAs do.
      - N-FSAs are more succinct and easier to design.
      - N-FSAs can be converted to D-FSAs.

N-FSA Example
  • The figure depicts an N-FSA. It accepts all and only the strings of 0’s and 1’s that end in 01.
  • On input 0 in state $q_0$, the N-FSA can transition to both $q_0$ and $q_1$ (in anticipation of a following 1).
  • An example of the state sequence for the input 00101 is presented on the next slide.

State Transitions for 00101 (figure)

Definition of Nondeterministic Finite State Automaton
N-FSA: $A = (Q, \Sigma, \delta, q_0, F)$
  1. $Q$: a finite set of states.
  2. $\Sigma$: a finite set of input symbols.
  3. $q_0$: the start state, a member of the set $Q$.
  4. $F$: a set of final/accepting states, a subset of $Q$.
  5. $\delta$: the transition function, which takes a state in $Q$ and an input symbol in $\Sigma$ as arguments and returns a set of states, a subset of $Q$.

N-FSA Definition

             Input
  State      0           1
  → q0       {q0, q1}    {q0}
    q1       ∅           {q2}
   *q2       ∅           ∅

The Extended Transition Function for N-FSA
  • As with D-FSAs, we need to extend the transition function $\delta$ of an N-FSA to a function $\hat{\delta}$ that takes a state $q$ and a string of input symbols $w$, and returns the set of states that the N-FSA is in if it starts in state $q$ and processes the string $w$.
  • The slide “State Transitions for 00101” depicts the function $\hat{\delta}$.

Definition of Extended Transition Function
BASIS:
  • $\hat{\delta}(q, \varepsilon) = \{q\}$. If we are in state $q$ and read no inputs, then we are still in state $q$.
INDUCTION:
  • Suppose $w$ is a string of the form $xa$, where $a$ is the final symbol of $w$ and $x$ is the rest. Also suppose $\hat{\delta}(q, x) = \{p_1, p_2, \ldots, p_k\}$.
  • Let $\bigcup_{i=1}^{k} \delta(p_i, a) = \{r_1, r_2, \ldots, r_m\}$.
  • Then $\hat{\delta}(q, w) = \{r_1, r_2, \ldots, r_m\}$.

Example
  • Describe the processing of the N-FSA for $w = 00101$, building up from the prefix ∊.
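The processing can be reproduced by tracking the set of states the N-FSA may be in after each input symbol. The sketch below follows the BASIS/INDUCTION definition iteratively; the dictionary encoding of the N-FSA table (missing keys stand for ∅) and the names `nfa_delta_hat` and `nfa_accepts` are assumptions made for illustration.

```python
# N-FSA for "ends in 01": (state, input) -> set of next states.
# A missing key stands for the empty set.
NFA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}


def nfa_delta_hat(q, w, delta=NFA):
    """Return the list of state sets reached after each prefix of w."""
    states = {q}                       # BASIS: delta_hat(q, epsilon) = {q}
    history = [set(states)]
    for a in w:                        # INDUCTION: union of delta(p, a)
        states = set().union(*(delta.get((p, a), set()) for p in states))
        history.append(states)
    return history


def nfa_accepts(w, start="q0", final=frozenset({"q2"})):
    """Accept if the final state set contains at least one accepting state."""
    return bool(nfa_delta_hat(start, w)[-1] & final)


for prefix_len, states in enumerate(nfa_delta_hat("q0", "00101")):
    print("00101"[:prefix_len] or "ε", sorted(states))
print(nfa_accepts("00101"))
```

Tracking the whole set at once is the same idea that underlies the conversion of an N-FSA to a D-FSA: each reachable set of N-FSA states behaves like a single deterministic state.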

The Language of an N-FSA
  • An N-FSA accepts a string $w$ if it is possible to make some sequence of choices of next state, while reading the symbols of $w$, that goes from the start state to any of the accepting states.
      - The fact that other choices on the same input $w$ lead to a non-accepting state, or do not lead to any state at all (i.e., the sequence of states “dies”), does not prevent $w$ from being accepted by the N-FSA as a whole.
  • Formally: if $A = (Q, \Sigma, \delta, q_0, F)$ is an N-FSA, then
      $L(A) = \{w \mid \hat{\delta}(q_0, w) \cap F \neq \emptyset\}$
    That is, $L(A)$ is the set of strings $w$ in $\Sigma^*$ such that $\hat{\delta}(q_0, w)$ contains at least one accepting state.

Example
  • Formally prove that the N-FSA of the previous example accepts the language $L = \{w \mid w \text{ ends in } 01\}$.
  • The proof is a mutual induction on the following three statements, which characterize the three states:
      1. $\hat{\delta}(q_0, w)$ contains $q_0$ for every $w$.
      2. $\hat{\delta}(q_0, w)$ contains $q_1$ if and only if $w$ ends in 0.
      3. $\hat{\delta}(q_0, w)$ contains $q_2$ if and only if $w$ ends in 01.
  • To prove these statements, we need to consider how $A$ (the N-FSA) can reach each state, i.e.,
      - what the last input symbol was, and
      - what state $A$ was in just before reading that symbol.
  • The proof is an induction on $|w|$, the length of $w$, starting from length 0.

Proof (Basis)
BASIS: If $|w| = 0$, then $w = \varepsilon$.
  • Statement (1) follows from the basis definition: $\hat{\delta}(q_0, \varepsilon) = \{q_0\}$.
  • Statement (2) is true because
      - ∊ is the empty string and does not end in 0, and
      - by the basis definition above, $\hat{\delta}(q_0, \varepsilon)$ does not contain $q_1$.
  • Statement (3) is true because
      - ∊ is the empty string and does not end in 01, and
      - by the basis definition above, $\hat{\delta}(q_0, \varepsilon)$ does not contain $q_2$.

Proof (Statement 1)
INDUCTION: Assume $w = xa$, where $a$ is a symbol of the alphabet (i.e., 0 or 1). We assume that statements (1)-(3) are true for $x$, and we need to prove them for $w = xa$. Thus, assume $|w| = n + 1$ and $|x| = n$.
  1. If $q_0 \in \hat{\delta}(q_0, x)$, then, since $\delta(q_0, a)$ contains $q_0$ for both $a = 0$ and $a = 1$, $q_0 \in \hat{\delta}(q_0, w)$. Thus statement (1) holds.

Proof (Statement 2)
INDUCTION: Assume $w = xa$, where $a$ is a symbol of the alphabet (i.e., 0 or 1). We assume that statements (1)-(3) are true for $x$, and we need to prove them for $w = xa$. Thus, assume $|w| = n + 1$ and $|x| = n$.
  2. (If) Assume that $a = 0$ ($w$ ends in 0). By statement (1) applied to $x$, $\hat{\delta}(q_0, x)$ contains $q_0$. Since $\delta(q_0, 0)$ contains $q_1$, $\hat{\delta}(q_0, w)$ contains $q_1$, which proves the (if) part of statement (2).
     (Only if) Assume that we are in state $q_1$, that is, $q_1 \in \hat{\delta}(q_0, w)$. We can transition to that state only from state $q_0$ and only when the input is 0, so $a = 0$ and $w$ ends in 0.

Proof (Statement 3)
INDUCTION: Assume $w = xa$, where $a$ is a symbol of the alphabet (i.e., 0 or 1). We assume that statements (1)-(3) are true for $x$ and prove them for $w = xa$; that is, $|w| = n + 1$ and $|x| = n$.
3. (If) Assume $w$ ends in 01, so $a = 1$ and $x$ ends in 0. By statement (2) applied to $x$, $\hat{\delta}(q_0, x)$ contains $q_1$. Since there is a transition from $q_1$ to $q_2$ on input 1, $\hat{\delta}(q_0, w)$ contains $q_2$, which proves the (if) part of statement (3).
(Only if) Assume $\hat{\delta}(q_0, w)$ contains $q_2$. The only transition into $q_2$ is from $q_1$ on input 1, so $a = 1$ and $\hat{\delta}(q_0, x)$ contains $q_1$. By statement (2) applied to $x$, $x$ ends in 0, and therefore $w$ ends in 01.
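The three statements can also be checked mechanically for short strings. The sketch below is not part of the slides: it assumes the usual three-state N-FSA for "ends in 01" (a self-loop on $q_0$ for both symbols, $q_0 \to q_1$ on 0, $q_1 \to q_2$ on 1), and the names `DELTA`, `delta_hat`, and `check_up_to` are illustrative. A brute-force check is not a proof, but it is a quick sanity test of the induction.

```python
from itertools import product

# Assumed transition table of the "ends in 01" N-FSA (the figure is not in the text).
DELTA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}

def delta_hat(state, w):
    """Set of states reachable from `state` after reading the string w."""
    current = {state}
    for symbol in w:
        nxt = set()
        for p in current:
            nxt |= DELTA.get((p, symbol), set())
        current = nxt
    return current

def statements_hold(w):
    """Check statements (1)-(3) for a single string w."""
    reached = delta_hat("q0", w)
    return (
        "q0" in reached
        and ("q1" in reached) == w.endswith("0")
        and ("q2" in reached) == w.endswith("01")
    )

def check_up_to(max_len):
    """Check the statements for every binary string of length <= max_len."""
    return all(
        statements_hold("".join(bits))
        for n in range(max_len + 1)
        for bits in product("01", repeat=n)
    )
```

For example, `check_up_to(8)` exercises all 511 binary strings of length at most 8.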

An Application Example Text Search Veton Këpuska

Finding Strings in Text
• A common problem in the age of the WWW is the following:
  - Given a set of words, find all documents that contain one (or all) of those words.
  - A search engine (e.g., Google) is a popular example of this process.
• The search engine uses a particular technology called "inverted indexes", where for each word appearing on the WWW a list of all the places where that word occurs is stored.
  - Note that there are over 100,000 different words.
• Inverted indexes do not use FSAs and are somewhat difficult to use (they require a large amount of disk space, and a large amount of time to set up).
• There are a number of related applications that are unsuited for inverted indexes but are good applications for automaton-based techniques. Characteristics that make an application suitable for searches with FSAs are:

Finding Strings in Text
1. The repository on which the search is conducted is rapidly changing.
   a. Daily news articles.
   b. Current prices of items.
2. The documents to be searched cannot be cataloged.
   a. Amazon.com generates its pages dynamically. This information is stored in the database.
   b. Information from Amazon.com must be obtained by querying the system.

N-FSA for Text Search
Problem statement:
• Given a set of words (keywords), we would like to find occurrences of any of these words.
• Design an N-FSA that signals, by entering an accepting state, that it has seen one of the keywords.
• The text is fed to the N-FSA one character at a time.

N-FSA for Text Search
• The FSA can be used for recognizing (we also say accepting) strings in the following way. First, think of the input as being written on a long tape broken up into cells, with one symbol written in each cell of the tape, as in the figure below:

N-FSA for Text Search
1. There is a start state, $q_0$, with a transition to itself on every input symbol.
2. For each keyword $a_1 a_2 \dots a_k$, there are $k$ states, say $q_1, q_2, \dots, q_k$.
   - There is a transition from the start state $q_0$ to state $q_1$ on input $a_1$, from state $q_1$ to state $q_2$ on input $a_2$, and so on.
   - State $q_k$ is an accepting state and indicates that the keyword $a_1 a_2 \dots a_k$ has been found.

Recognition Process
• The machine starts in the start state ($q_0$) and iterates the following process:
1. Check the next letter of the input.
   a. If it matches the symbol on an arc leaving the current state, then
      i. cross that arc,
      ii. move to the next state, and
      iii. advance one symbol in the input.
   b. If we are in the accepting state ($q_f$) when we run out of input, the machine has successfully recognized an instance of the input.
2. If the machine never gets to the final state,
   a. either because it runs out of input,
   b. or because it gets some input that doesn't match an arc,
   c. or because it just happens to get stuck in some non-final state,
   we say the machine rejects or fails to accept the input.

Example
• Imagine that you have become a passionate fan of woodchucks. Desiring more information on this celebrated woodland creature, you turn to your favorite Web browser and type in woodchuck. Your browser returns a few sites.
• You have a flash of inspiration and type in woodchucks. Instead of having to do this search twice, you would rather have typed one search command specifying something like woodchuck with an optional final s.
• Or perhaps you might want to search for all the prices in some document; you might want to see all strings that look like $199 or $25 or $24.99.

Example
• Design an N-FSA to recognize occurrences of the words "web" and "ebay". The design follows the "recipe" given in the previous slides.
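The recipe can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the slide's own construction: the state names (`"web:1"`, `"ebay:2"`, ...) and the functions `build_keyword_nfsa` and `find_keywords` are hypothetical, and keywords are assumed to be non-empty. The N-FSA is simulated by tracking the set of states it could be in after each character.

```python
def build_keyword_nfsa(keywords):
    """Build the keyword N-FSA: q0 plus one chain of states per keyword.

    Returns (delta, accepting), where delta maps (state, symbol) to a set of
    next states and accepting maps each accepting state to its keyword.
    The self-loop of q0 on every symbol is handled in find_keywords.
    """
    delta = {}
    accepting = {}
    for word in keywords:
        prev = "q0"
        for i, ch in enumerate(word, start=1):
            state = f"{word}:{i}"          # hypothetical state name
            delta.setdefault((prev, ch), set()).add(state)
            prev = state
        accepting[prev] = word             # last state of the chain accepts
    return delta, accepting

def find_keywords(text, keywords):
    """Feed text one character at a time; report (end_position, keyword) hits."""
    delta, accepting = build_keyword_nfsa(keywords)
    current = {"q0"}
    hits = []
    for pos, ch in enumerate(text):
        nxt = {"q0"}                       # q0 loops to itself on every symbol
        for state in current:
            nxt |= delta.get((state, ch), set())
        current = nxt
        for state in sorted(current):
            if state in accepting:
                hits.append((pos, accepting[state]))
    return hits
```

With the two keywords of the example, `find_keywords("webay", ["web", "ebay"])` reports "web" ending at index 2 and "ebay" ending at index 4, which shows why the overlapping keywords need non-determinism: after reading "we" the automaton must be in the middle of both chains at once.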

Homework
• Exercise 2.4.2

Implementation Approaches to N-FSA
1. Implement a program that simulates the N-FSA by computing the set of states it is in after reading each input symbol.
2. Convert the N-FSA to an equivalent D-FSA, then simulate the D-FSA directly.
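The second approach can be illustrated with the standard subset construction, in which each D-FSA state is a set of N-FSA states. The sketch below is a minimal version for N-FSAs without ε-transitions; the function names and the dictionary representation of the transition function are assumptions made for the example.

```python
from collections import deque

def nfsa_to_dfsa(delta, start, accepting, alphabet):
    """Subset construction: every D-FSA state is a frozenset of N-FSA states.

    delta maps (state, symbol) to a set of next states; only subsets reachable
    from the start state are generated.
    """
    start_set = frozenset([start])
    dfsa = {}                       # (subset, symbol) -> subset
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            target = frozenset(
                r for p in current for r in delta.get((p, symbol), ())
            )
            dfsa[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset accepts if it contains at least one accepting N-FSA state.
    dfsa_accepting = {s for s in seen if s & set(accepting)}
    return dfsa, start_set, dfsa_accepting

def run_dfsa(dfsa, start, accepting, text):
    """Simulate the D-FSA: exactly one current state at every step."""
    state = start
    for symbol in text:
        state = dfsa[(state, symbol)]
    return state in accepting
```

The first approach corresponds to computing the same subsets on the fly while reading the input, without storing the whole table in advance.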

Direct Implementation Approaches to N-FSA
• A non-deterministic FSA faces the problem of making a (possibly wrong) choice. There are three standard solutions to the problem of non-determinism:
  - Backup: Whenever we come to a choice point, we could put a marker to record where we were in the input and what state the automaton was in. Then, if it turns out that we made the wrong choice, we could back up and try another path.
  - Look-ahead: We could look ahead in the input to help us decide which path to take.
  - Parallelism: Whenever we come to a choice point, we could look at every alternative path in parallel.
• We will focus here on the backup approach and defer discussion of the look-ahead and parallelism approaches to later chapters.

Back-up Approach for N-FSA Recognizer
• The backup approach suggests that we should make choices that might lead to dead ends, knowing that we can always return to unexplored alternative choices.
• There are two keys to this approach:
1. We must know ALL alternatives for each choice point.
2. We must store sufficient information about each alternative so that we can return to it when necessary.

Back-up Approach for N-FSA Recognizer
• When a backup algorithm reaches a point in its processing where no progress can be made, because it
  - runs out of input, or
  - has no legal transitions,
  it returns to a previous choice point, selects one of the unexplored alternatives, and continues from there.
• To apply this notion to the current definition of the FSA, we need to store only two things for each choice point:
  - the state (or node), and
  - the corresponding position on the tape.

FSA With ε-Transitions
• In order to implement the search algorithm we need to introduce another extension to the FSA.
• This new feature is to allow a transition on ε, the empty string.
• This transition does not expand the class of languages that can be accepted by finite automata, but it does give us some added "programming convenience".

Uses of ε-Transitions
• We shall begin with an informal treatment of ε-N-FSAs, using transition diagrams with ε allowed as a label.
• In the examples to follow, think of the automaton as accepting those sequences of labels along paths from the start state to an accepting state. However, each ε-transition is "invisible"; i.e., it contributes nothing to the string along the path.

Uses of ε-Transitions
• The N-FSA below has ε-transitions. This N-FSA accepts decimal numbers consisting of:
1. An optional + or - sign,
2. A string of digits,
3. A decimal point ".", and
4. Another string of digits. Either this string of digits or the string (2) can be empty, but at least one of the two strings of digits must be non-empty.

Example of Keyword Spotter
• N-FSA for recognizing the keywords "web" and "ebay", implemented with ε-transitions.

Another N-FSA for "sheep" language
• An ε-transition defines an arc that causes a transition without consuming an input symbol. Thus, when in state $q_3$, a transition to state $q_2$ is allowed without looking at the input symbol or advancing the input pointer.
• This example shows another kind of non-deterministic behavior: we might not know whether to follow the ε-transition or the ! arc.

The Formal Notation for ε-NFSA
• ε-NFSAs are represented exactly the same way as N-FSAs, with one augmentation that includes the information about transitions on ε.
• An ε-NFSA A is represented by $A = (Q, \Sigma, \delta, q_0, F)$.
• The only difference is reflected in $\delta$, which is now a function that takes as arguments:
1. A state in $Q$, and
2. A member of $\Sigma \cup \{\varepsilon\}$, i.e., an input symbol of the alphabet or the symbol ε.
• Note that ε, the symbol for the empty string, cannot be a member of the alphabet $\Sigma$, to avoid confusion.

Example
• The decimal number example ε-NFSA is represented formally as:
  $E = (\{q_0, q_1, \dots, q_5\}, \{., +, -, 0, 1, \dots, 9\}, \delta, q_0, \{q_5\})$
• The transition function, $\delta$, is defined by the transition table:

State    ε        +, -     .        0, 1, …, 9
q0       {q1}     {q1}     ∅        ∅
q1       ∅        ∅        {q2}     {q1, q4}
q2       ∅        ∅        ∅        {q3}
q3       {q5}     ∅        ∅        {q3}
q4       ∅        ∅        {q3}     ∅
q5       ∅        ∅        ∅        ∅

Search State
• The combination of the node and the tape position specifies the search state of the recognition algorithm.
• To avoid confusion, the state of the automaton is called a node or machine-state.
• Two changes are necessary in the transition table:
1. To represent nodes that have ε-transitions, we need to add an ε column.
2. To accommodate multiple transitions to different nodes on the same input symbol, each cell entry consists of a list of destination nodes rather than a single node.

Epsilon-Closures of ε-NFSA
• Formal definitions of an extended transition function for ε-NFSAs will be introduced next. They lead to the definition of acceptance of strings and languages by these automata.
• In order to do this we need a central definition, the ε-closure of a state. We ε-close a state q by following all transitions out of q that are labeled ε. Those ε-transitions may lead to other states that also have ε-transitions, which must be followed in turn, and so on, eventually finding every state that can be reached from q along any path whose arcs are all labeled ε.
• Formally, we define the ε-closure ECLOSE(q) recursively, as follows:

Epsilon-Closures of ε-NFSA
BASIS:
• State q is in ECLOSE(q).
INDUCTION:
• If state p is in ECLOSE(q), and there is a transition from state p to state r labeled ε, then r is in ECLOSE(q).
• More precisely, if $\delta$ is the transition function of the ε-NFSA involved, and p is in ECLOSE(q), then ECLOSE(q) also contains all the states in $\delta(p, \varepsilon)$.

Example
• For each state of the above ε-NFSA, the ε-closure contains only the state itself, with two exceptions:
  - $\mathrm{ECLOSE}(q_0) = \{q_0, q_1\}$, and
  - $\mathrm{ECLOSE}(q_3) = \{q_3, q_5\}$.
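The recursive definition translates directly into a small worklist routine. The sketch below assumes the transition table of the decimal-number ε-NFSA, with ε-moves from $q_0$ to $q_1$ and from $q_3$ to $q_5$ (consistent with the two closures listed above); using the empty string `""` as the ε label and the names `DELTA` and `eclose` are choices made for this illustration only.

```python
EPS = ""  # assumption: the empty string stands for the ε label

# Assumed transition table of the decimal-number ε-NFSA.
DELTA = {
    ("q0", EPS): {"q1"},
    ("q3", EPS): {"q5"},
    ("q1", "."): {"q2"},
    ("q4", "."): {"q3"},
}
for sign in "+-":
    DELTA[("q0", sign)] = {"q1"}
for digit in "0123456789":
    DELTA[("q1", digit)] = {"q1", "q4"}
    DELTA[("q2", digit)] = {"q3"}
    DELTA[("q3", digit)] = {"q3"}

def eclose(q, delta=DELTA):
    """Return ECLOSE(q): all states reachable from q along ε-labeled arcs."""
    closure = {q}                 # BASIS: q is in ECLOSE(q)
    stack = [q]
    while stack:
        p = stack.pop()
        # INDUCTION: every state in delta(p, ε) also belongs to the closure.
        for r in delta.get((p, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure
```

Tracking which states are already in the closure keeps the loop finite even when the ε-arcs form a cycle.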

Example
$\mathrm{ECLOSE}(1) = \{1, 2, 3, 4, 6\}$

Extended Transitions and Languages for ε-NFSA
• The ε-closure explains the transitions of an ε-NFSA for a given sequence of (non-ε) inputs. This helps define what it means for an ε-NFSA to accept an input.
• Suppose that $E = (Q, \Sigma, \delta, q_0, F)$ is an ε-NFSA.
• $\hat{\delta}$ is the extended transition function. $\hat{\delta}(q, w)$ is the set of states that can be reached from q along a path whose labels, when concatenated, form the string w.
  - Note that ε links do not contribute to w.

Recursive Definition of $\hat{\delta}$
BASIS:
• $\hat{\delta}(q, \varepsilon) = \mathrm{ECLOSE}(q)$.
• If the label of the path is ε, then we can follow only ε-labeled arcs extending from state q. Note that this is exactly what ECLOSE(q) computes.

Recursive Definition of $\hat{\delta}$
INDUCTION:
• Suppose the string w is of the form xa, where a is the last symbol of w. Since a is a member of $\Sigma$, it cannot be ε.
• $\hat{\delta}(q, w)$ is computed as follows:
1. Let $\{p_1, p_2, p_3, \dots, p_k\}$ be $\hat{\delta}(q, x)$. The $p_i$'s are all and only the states that we can reach from q following a path labeled x. This path may end with one or more transitions labeled ε, and may have other ε-labeled transitions as well.
2. Let $\bigcup_{i=1}^{k} \delta(p_i, a)$ be the set $\{r_1, r_2, r_3, \dots, r_m\}$, i.e., all the states that can be reached on input a from the states reachable from q along paths labeled x.
3. Then $\hat{\delta}(q, w) = \bigcup_{j=1}^{m} \mathrm{ECLOSE}(r_j)$. This additional closure step includes all the paths from q labeled w, by considering the possibility that there are additional ε-labeled arcs that we can follow after making a transition on the final symbol a of w.

Example
• Compute $\hat{\delta}(q_0, 5.6)$ (i.e., $w$ = "5.6").
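The computation can be traced with a short program that follows the basis and the three induction steps literally. It is a sketch under assumptions: the transition table is the one of the decimal-number ε-NFSA (with ε-moves $q_0 \to q_1$ and $q_3 \to q_5$), the empty string stands for ε, and `decimal_delta`, `eclose`, `delta_hat`, and `accepts` are illustrative names.

```python
EPS = ""  # assumption: the empty string stands for the ε label

def decimal_delta():
    """Assumed transition table of the decimal-number ε-NFSA."""
    delta = {
        ("q0", EPS): {"q1"}, ("q3", EPS): {"q5"},
        ("q1", "."): {"q2"}, ("q4", "."): {"q3"},
    }
    for sign in "+-":
        delta[("q0", sign)] = {"q1"}
    for digit in "0123456789":
        delta[("q1", digit)] = {"q1", "q4"}
        delta[("q2", digit)] = {"q3"}
        delta[("q3", digit)] = {"q3"}
    return delta

def eclose(states, delta):
    """ε-closure of a set of states."""
    closure, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for r in delta.get((p, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def delta_hat(q, w, delta):
    """Extended transition function, following the recursive definition."""
    current = eclose({q}, delta)              # BASIS: delta_hat(q, ε) = ECLOSE(q)
    for a in w:
        moved = set()
        for p in current:                     # step 2: union of delta(p_i, a)
            moved |= delta.get((p, a), set())
        current = eclose(moved, delta)        # step 3: close the result under ε
    return current

def accepts(w, delta, start="q0", finals=frozenset({"q5"})):
    """w is accepted if delta_hat reaches at least one accepting state."""
    return bool(delta_hat(start, w, delta) & finals)
```

Under this table the intermediate sets are $\{q_0, q_1\}$ for ε, $\{q_1, q_4\}$ after "5", $\{q_2, q_3, q_5\}$ after "5.", and $\{q_3, q_5\}$ after "5.6"; since the last set contains $q_5$, the string is accepted.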

Deterministic & Non-Deterministic FSAs
(Figure panels: Deterministic FSA; Non-Deterministic FSA)

The Transition Table of the ε-NFSA

         Input
State    b        a           !        ε
→q0      {q1}     Ø           Ø        Ø
q1       Ø        {q2}        Ø        Ø
q2       Ø        {q2, q3}    Ø        Ø
q3       Ø        Ø           {q4}     {q2}
*q4      Ø        Ø           Ø        Ø

ε-NFSA Recognition Algorithm

function ND-RECOGNIZE(tape, machine) returns accept or reject
    agenda ← {(Initial state of machine, beginning of tape)}
    current-search-state ← NEXT(agenda)
    loop
        if ACCEPT-STATE?(current-search-state) returns true then
            return accept
        else
            agenda ← agenda ∪ GENERATE-NEW-STATES(current-search-state)
        if agenda is empty then
            return reject
        else
            current-search-state ← NEXT(agenda)
    end

ε-NFSA Recognition Algorithm

function GENERATE-NEW-STATES(current-state) returns a set of search-states
    current-node ← the node the current search-state is in
    index ← the point on the tape the current search-state is looking at
    return a list of search states from transition table as follows:
        (transition-table[current-node, ε], index)
        ∪
        (transition-table[current-node, tape[index]], index + 1)

function ACCEPT-STATE?(search-state) returns true or false
    current-node ← the node search-state is in
    index ← the point on the tape search-state is looking at
    if index is at the end of the tape and current-node is an accept state of machine then
        return true
    else
        return false
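A runnable Python rendering of the pseudocode may make the agenda mechanics easier to follow. It is a sketch, not the slides' implementation: the sheep-language table is assumed to be the one with the ε-arc from $q_3$ back to $q_2$, the empty string stands for ε, the agenda is a Python list used as a stack, and the `visited` set is an addition not present in the pseudocode (it only prevents the same search state from being queued twice).

```python
EPS = ""  # assumption: the empty string stands for the ε label

# Assumed transition table of the sheep-language ε-NFSA (accepting state q4).
SHEEP = {
    ("q0", "b"): ["q1"],
    ("q1", "a"): ["q2"],
    ("q2", "a"): ["q2", "q3"],
    ("q3", "!"): ["q4"],
    ("q3", EPS): ["q2"],
}

def generate_new_states(search_state, tape, table):
    """Search states reachable in one move from (node, index)."""
    node, index = search_state
    # ε-moves keep the tape position unchanged.
    new_states = [(n, index) for n in table.get((node, EPS), [])]
    # Moves on the current input symbol advance the tape by one cell.
    if index < len(tape):
        new_states += [(n, index + 1) for n in table.get((node, tape[index]), [])]
    return new_states

def accept_state(search_state, tape, accepting):
    """True if the tape is exhausted and the node is an accept state."""
    node, index = search_state
    return index == len(tape) and node in accepting

def nd_recognize(tape, table, start, accepting):
    """Return True (accept) or False (reject)."""
    agenda = [(start, 0)]
    visited = set(agenda)          # addition: never queue a search state twice
    while agenda:
        current = agenda.pop()     # NEXT: last in, first out
        if accept_state(current, tape, accepting):
            return True
        for state in generate_new_states(current, tape, table):
            if state not in visited:
                visited.add(state)
                agenda.append(state)
    return False
```

For instance, `nd_recognize("baaa!", SHEEP, "q0", {"q4"})` has to back up at least once, because moving from $q_2$ to $q_3$ on the first possible `a` leads to a dead end when another `a` follows.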

Possible execution of ND-RECOGNIZE

Recognition as Search
• ND-RECOGNIZE accomplishes the task of recognizing strings in a regular language by providing a way to systematically explore all the possible paths through a machine.
• Solutions of this kind are known as state-space search algorithms.
• The key to the effectiveness of such programs is often the order in which the states in the space are considered. A poor ordering of states may lead to the examination of a large number of unfruitful states before a successful solution is discovered.
  - Unfortunately, it is typically not possible to tell a good choice from a bad one, and often the best we can do is to ensure that each possible solution is eventually considered.
  - Note that the ordering of states is left unspecified in ND-RECOGNIZE (the NEXT function).
  - Thus the implementation of the NEXT function is critical to the performance of the algorithm.

Depth-First Search
• Depth-first search corresponds to a Last-In-First-Out (LIFO) strategy.
• NEXT returns the state at the front of the agenda; with newly generated states also placed at the front, the agenda behaves as a stack.
• Pitfall: under certain circumstances, depth-first search can enter an infinite loop.

Breadth-First Search
• Breadth-first search corresponds to a First-In-First-Out (FIFO) strategy.
  - All possible choices are explored at once.
• Pitfalls: As with depth-first search, if the state space is infinite, the search may never terminate. More importantly, due to growth in the size of the agenda, if the state space is even moderately large the search may require an impractically large amount of memory.
  - For larger problems, more complex search techniques such as dynamic programming or A* must be used.
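The difference between the two strategies comes down to which end of the agenda NEXT takes from. The sketch below shows this on a generic state space; the `search` function, its parameters, and the toy graph in the usage note are hypothetical and only serve to make the visiting order visible. The `seen` set is an addition that guards against the infinite-loop pitfall on cyclic spaces.

```python
from collections import deque

def search(start, successors, is_goal, strategy="dfs"):
    """Explore a state space; return the states in the order they were visited.

    strategy="dfs" treats the agenda as a stack (LIFO),
    strategy="bfs" treats it as a queue (FIFO).
    """
    agenda = deque([start])
    seen = {start}                 # addition: avoid revisiting states
    order = []
    while agenda:
        # NEXT: newest state for depth-first, oldest state for breadth-first.
        state = agenda.pop() if strategy == "dfs" else agenda.popleft()
        order.append(state)
        if is_goal(state):
            return order
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                agenda.append(nxt)
    return order
```

With the toy graph `{"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": []}` and goal `"E"`, the depth-first run visits A, C, E, while the breadth-first run visits A, B, C, D, E, which illustrates how the agenda grows level by level in the FIFO case.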

Regular Languages and FSA
• The class of languages that is definable by regular expressions (studied next) is exactly the same as the class of languages that are characterizable by finite-state automata.
  - Those languages are called regular languages.

Definition of the language L(E) of an ε-NFSA
• ε-NFSA: $E = (Q, \Sigma, \delta, q_0, F)$
• $L(E) = \{w \mid \hat{\delta}(q_0, w) \cap F \neq \emptyset\}$
• The language of E is the set of strings w that take the start state to at least one accepting state.

Formal Definition of Regular Languages
• $\Sigma$: the alphabet, i.e., the set of symbols in a language.
• ε: the empty string.
• Ø: the empty set.
• The class of regular languages (or regular sets) over $\Sigma$ is then formally defined as follows:
1. Ø is a regular language.
2. $\forall a \in \Sigma \cup \varepsilon$, $\{a\}$ is a regular language.
3. If $L_1$ and $L_2$ are regular languages, then so are:
   a) $L_1 \cdot L_2 = \{xy \mid x \in L_1, y \in L_2\}$, the concatenation of $L_1$ and $L_2$,
   b) $L_1 \cup L_2$, the union or disjunction of $L_1$ and $L_2$,
   c) $L_1^*$, the * closure of $L_1$.
• All and only the sets of languages that meet the above properties are regular languages.

Regular Languages and FSAs
• All regular languages can be implemented by the three operations which define regular languages:
  - Concatenation,
  - Disjunction/union (also called "|"),
  - * closure.
• Example:
  - The counters (*, +, {n, m}) are just a special case of repetition plus * closure.
  - All the anchors can be thought of as individual special symbols.
  - The square braces [] are a kind of disjunction: [ab] means "a or b", i.e., the disjunction of a and b.

Regular Languages and FSAs
• Regular languages are also closed under the following operations:
  - Intersection: if $L_1$ and $L_2$ are regular languages, then so is $L_1 \cap L_2$, the language consisting of the set of strings that are in both $L_1$ and $L_2$.
  - Difference: if $L_1$ and $L_2$ are regular languages, then so is $L_1 - L_2$, the language consisting of the set of strings that are in $L_1$ but not $L_2$.
  - Complementation: if $L_1$ is a regular language, then so is $\Sigma^* - L_1$, the set of all possible strings that are not in $L_1$.
  - Reversal: if $L_1$ is a regular language, then so is $L_1^R$, the language consisting of the set of reversals of the strings that are in $L_1$.

Regular Expressions and FSA
• Regular expressions are equivalent to finite-state automata (proof: Hopcroft and Ullman 1979).
• The proof is inductive. Each primitive operation of a regular expression (concatenation, union, closure) is handled by one case of the inductive step:

Concatenation
• String the two FSAs next to each other by connecting all the final states of FSA 1 to the initial state of FSA 2 by an ε-transition.

Closure
• Repetition: all final states of the FSA are connected back to the initial state by ε-transitions.
• Zero occurrences case: a direct link from the initial state to the final state.

Union
• Add a single new initial state $q_0$, and add new ε-transitions from it to the former initial states of the two machines to be joined.
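The three constructions can be sketched as functions that glue ε-NFSA fragments together. This is an illustration under assumptions, not the slides' code: the `Nfsa` class, the integer state names, and the helper functions are hypothetical, and the empty string stands for ε. One deliberate deviation: for closure the sketch adds a fresh accepting start state instead of a direct link from the old initial state to the final state, a common variant that stays correct even when the original initial state has incoming arcs.

```python
import itertools

EPS = ""                      # assumption: the empty string stands for ε
_ids = itertools.count()      # source of fresh state names

class Nfsa:
    """An ε-NFSA fragment; edges are (source, label, destination) triples."""
    def __init__(self, start, finals, edges):
        self.start, self.finals, self.edges = start, set(finals), list(edges)

def symbol(a):
    """Machine accepting exactly the one-symbol string a."""
    s, f = next(_ids), next(_ids)
    return Nfsa(s, {f}, [(s, a, f)])

def concat(m1, m2):
    """Link every final state of m1 to the initial state of m2 by ε."""
    links = [(f, EPS, m2.start) for f in m1.finals]
    return Nfsa(m1.start, m2.finals, m1.edges + m2.edges + links)

def union(m1, m2):
    """New initial state with ε-transitions to both former initial states."""
    s = next(_ids)
    links = [(s, EPS, m1.start), (s, EPS, m2.start)]
    return Nfsa(s, m1.finals | m2.finals, m1.edges + m2.edges + links)

def star(m):
    """Closure: ε back-links for repetition; fresh accepting start for zero occurrences."""
    s = next(_ids)
    links = [(s, EPS, m.start)] + [(f, EPS, m.start) for f in m.finals]
    return Nfsa(s, m.finals | {s}, m.edges + links)

def accepts(m, text):
    """Simulate the fragment by tracking ε-closed sets of states."""
    def close(states):
        seen, stack = set(states), list(states)
        while stack:
            p = stack.pop()
            for src, label, dst in m.edges:
                if src == p and label == EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen
    current = close({m.start})
    for ch in text:
        current = close({d for s, lab, d in m.edges if s in current and lab == ch})
    return bool(current & m.finals)
```

As a usage example, the sheep language can be assembled as `concat(symbol("b"), concat(symbol("a"), concat(symbol("a"), concat(star(symbol("a")), symbol("!")))))`, which accepts "baa!" and "baaaa!" but not "ba!".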

Summary
• This chapter introduced the fundamental concept of the finite automaton and the equivalent concept of the regular language. Here is a summary of the main points we covered about these ideas:
• Any regular language can be realized as a finite-state automaton (FSA). Conversely, an automaton implicitly defines a formal language as the set of strings the automaton accepts.
• An automaton can use any set of symbols for its vocabulary, including letters, words, or even graphic images.

Summary
• The behavior of a deterministic automaton (D-FSA) is fully determined by the state it is in.
• A non-deterministic automaton (N-FSA) sometimes has to make a choice between multiple paths to take given the same current state and next input.
• Any N-FSA can be converted to a D-FSA.
• The order in which an N-FSA chooses the next state to explore on the agenda defines its search strategy.
  - The depth-first search or LIFO strategy corresponds to the agenda-as-stack.
  - The breadth-first search or FIFO strategy corresponds to the agenda-as-queue.
  - Larger problems call for more complex techniques such as dynamic programming or A* search.
• Any regular language can be automatically compiled into an N-FSA and hence into a D-FSA.

End Veton Këpuska