Giorgi Japaridze, Theory of Computability
Chapter 1: Regular Languages
1.1.a How a finite automaton works
[Figure: state diagram of an automaton with states q0, q1, q2 and transitions labeled 0 and 1, shown processing the input string 01100.]
1.1.b The language of a machine
[Figure: state diagram of the automaton with states q0, q1, q2.]
L(M), “the language of M”, or “the language recognized by M”: the set of all strings that the machine M accepts.
What is the language recognized by our automaton A?
1.1.c Formal definition of a finite automaton
A (deterministic) finite automaton (DFA) is a 5-tuple (Q, Σ, δ, s, F), where:
• Q is a finite set whose elements are called the states
• Σ is a finite set called the alphabet
• δ is a function of the type Q × Σ → Q called the transition function
• s is an element of Q called the start state
• F is a subset of Q called the set of accept states
1.1.d Our automaton formalized
[Figure: state diagram of the automaton with states q0, q1, q2.]
A = (Q, Σ, δ, s, F)
Q:
Σ:
δ: (table with rows q0, q1, q2 and columns 0, 1)
s:
F:
1.1.e Formal definition of computation
[Figure: the automaton's state diagram, with the input 0 1 1 0 0 written above the matching sequence of states r_1, r_2, …, r_{n+1}.]
M = (Q, Σ, δ, s, F)
M accepts the string u_1 u_2 … u_n iff there is a sequence r_1, r_2, …, r_{n+1} of states such that:
• r_1 = s
• r_{i+1} = δ(r_i, u_i), for each i with 1 ≤ i ≤ n
• r_{n+1} ∈ F
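The acceptance condition above can be sketched in Python. This is only an illustration, not code from the slides: the function name `dfa_accepts`, the dictionary encoding of δ, and the two-state example machine (which accepts the bit strings ending in 1) are all assumptions made for the sketch.

```python
def dfa_accepts(delta, start, accept, string):
    """Return True iff the DFA accepts `string`.

    delta  : dict mapping (state, symbol) -> state (the transition function)
    start  : the start state s
    accept : the set F of accept states
    """
    state = start                       # r_1 = s
    for symbol in string:
        state = delta[(state, symbol)]  # r_{i+1} = delta(r_i, u_i)
    return state in accept              # r_{n+1} must be in F


# Hypothetical example machine (NOT the automaton from the slides):
# it accepts exactly the bit strings that end in 1.
ENDS_IN_ONE = {
    ("a", "0"): "a", ("a", "1"): "b",
    ("b", "0"): "a", ("b", "1"): "b",
}

print(dfa_accepts(ENDS_IN_ONE, "a", {"b"}, "01101"))  # True
print(dfa_accepts(ENDS_IN_ONE, "a", {"b"}, "01100"))  # False
```

Because δ is a function, the sequence of states is fully determined by the input, so a single pass over the string is enough.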
1.1.f Designing finite automata
Task: Design an automaton that accepts a bit string iff it contains an even number of “1”s.
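One possible solution to this task, sketched in Python under the assumption that two states suffice because the machine only has to remember the parity of the 1s read so far. The state names `EVEN` and `ODD` and the function `accepts_even_ones` are illustrative choices, not part of the slides.

```python
# Two states: the parity of the number of 1s seen so far.
EVEN, ODD = "even", "odd"

# Transition function: reading 0 keeps the parity, reading 1 flips it.
DELTA = {
    (EVEN, "0"): EVEN, (EVEN, "1"): ODD,
    (ODD, "0"): ODD,   (ODD, "1"): EVEN,
}


def accepts_even_ones(bits):
    """Return True iff `bits` contains an even number of 1s."""
    state = EVEN                   # start state: zero 1s seen, which is even
    for b in bits:
        state = DELTA[(state, b)]
    return state == EVEN           # EVEN is the only accept state


print(accepts_even_ones("0110"))   # True  (two 1s)
print(accepts_even_ones("01101"))  # False (three 1s)
```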
1.2.a NFAs (Nondeterministic Finite Automata)
[Figure: state diagram of an NFA with states q1, q2, q3 and transitions labeled 1 and 0,1, shown processing the input string 01010.]
1.2.a NFAs (Nondeterministic Finite Automata)
[Figure: the same NFA with states q1, q2, q3.]
What language does this NFA recognize?
1.2.b Formal definition of a nondeterministic finite automaton
An NFA is a 5-tuple (Q, Σ, δ, s, F), where:
• Q is a finite set whose elements are called the states
• Σ is a finite set called the alphabet
• δ is a function of the type Q × Σ → P(Q) called the transition function
• s is an element of Q called the start state
• F is a subset of Q called the set of accept states
1.2.c Example
[Figure: state diagram of an NFA with states 1, 2, 3 and transitions labeled a and b.]
A = (Q, Σ, δ, s, F)
Q:
Σ:
δ: (table with rows 1, 2, 3 and columns a, b)
s:
F:
1.2.d Formal definition of accepting
M = (Q, Σ, δ, s, F)
When M is a DFA: M accepts the string u_1 u_2 … u_n iff there is a sequence r_1, r_2, …, r_{n+1} of states such that:
• r_1 = s
• r_{i+1} = δ(r_i, u_i), for each i with 1 ≤ i ≤ n
• r_{n+1} ∈ F
When M is an NFA: M accepts the string u_1 u_2 … u_n iff there is a sequence r_1, r_2, …, r_{n+1} of states such that:
• r_1 = s
• r_{i+1} ∈ δ(r_i, u_i), for each i with 1 ≤ i ≤ n
• r_{n+1} ∈ F
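The NFA condition asks whether some accepting sequence of states exists. A minimal Python sketch of one way to check this is to track the set of all states the NFA could be in after each symbol. The name `nfa_accepts` and the example machine are assumptions for illustration; the example is a three-state NFA accepting bit strings whose second-to-last symbol is 1, and it is not claimed to be the machine drawn on the slides.

```python
def nfa_accepts(delta, start, accept, string):
    """Return True iff the NFA accepts `string`.

    delta maps (state, symbol) -> set of states; a missing key
    stands for the empty set (no move is possible).
    """
    current = {start}  # every state the NFA could be in right now
    for symbol in string:
        current = {q
                   for p in current
                   for q in delta.get((p, symbol), set())}
    # Some sequence r_1, ..., r_{n+1} ends in F iff the sets intersect.
    return bool(current & accept)


# Hypothetical example: accepts strings whose second-to-last symbol is 1.
NFA_DELTA = {
    ("q1", "0"): {"q1"},
    ("q1", "1"): {"q1", "q2"},  # nondeterministic choice on reading 1
    ("q2", "0"): {"q3"},
    ("q2", "1"): {"q3"},
}

print(nfa_accepts(NFA_DELTA, "q1", {"q3"}, "01010"))  # True
print(nfa_accepts(NFA_DELTA, "q1", {"q3"}, "01001"))  # False
```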
1.2.e What language does this NFA recognize?
[Figure: state diagram of an NFA whose transitions are all labeled 0.]
1.2.f What language does this DFA recognize?
[Figure: state diagram of a DFA with states 1, 2, 3, 4, 5 whose transitions are all labeled 0.]
1.2.g Equivalence of NFAs and DFAs
Two machines are said to be equivalent if they recognize the same language.
Theorem 1.39 Every NFA has an equivalent DFA.
Proof. Consider an NFA N = (Q, Σ, δ, s, F). We need to construct an equivalent DFA D = (Q', Σ, δ', s', F'). We do so using a procedure called the subset construction, described on the next slide.
1.2.h The subset construction
Constructing DFA D = (Q', Σ, δ', s', F') from NFA N = (Q, Σ, δ, s, F):
• Q' = P(Q)
• δ'(R, a) = {q | q ∈ δ(p, a) for some p ∈ R}
• s' = {s}
• F' = {R | R is a subset of Q containing an accept state of N}
D works correctly: at every step in the computation, it enters the state that corresponds to the set of states that N could be in at that point.
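The construction can be sketched in Python as follows. This is an illustrative sketch, not the slides' procedure verbatim: instead of taking all of P(Q) as Q', it builds only the subsets reachable from {s}, which yields the same language. The function name `subset_construction` and the example NFA (accepting strings whose second-to-last symbol is 1) are assumptions.

```python
from collections import deque


def subset_construction(alphabet, delta, start, accept):
    """Build the reachable part of a DFA equivalent to the given NFA.

    delta maps (state, symbol) -> set of NFA states (missing key = empty set).
    Returns (states, dfa_delta, dfa_start, dfa_accept); DFA states are frozensets.
    """
    dfa_start = frozenset({start})             # s' = {s}
    dfa_delta = {}
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:
        R = queue.popleft()
        for a in alphabet:
            # delta'(R, a) = {q | q in delta(p, a) for some p in R}
            target = frozenset(q for p in R
                               for q in delta.get((p, a), set()))
            dfa_delta[(R, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # F' = the subsets that contain at least one accept state of N
    dfa_accept = {R for R in seen if R & accept}
    return seen, dfa_delta, dfa_start, dfa_accept


# Hypothetical example NFA (not the one drawn on the slides).
NFA_DELTA = {
    ("q1", "0"): {"q1"},
    ("q1", "1"): {"q1", "q2"},
    ("q2", "0"): {"q3"},
    ("q2", "1"): {"q3"},
}

states, d, s0, acc = subset_construction("01", NFA_DELTA, "q1", {"q3"})
print(len(states))  # 4 reachable subsets, out of 8 in P(Q)
```

Exploring only reachable subsets has the same effect as building all of P(Q) and then removing the unreachable states.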
1.2.i Example of applying the subset construction
[Figure: the NFA N = (Q, Σ, δ, s, F) with states 1, 2, 3 and transitions labeled a and b.]
Q':
Σ:
δ': (table with columns a, b and one row per subset: ∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3})
s':
F':
Rules of the construction:
• Q' = P(Q)
• δ'(R, a) = {q | q ∈ δ(p, a) for some p ∈ R}
• s' = {s}
• F' = {R | R is a subset of Q containing an accept state of N}
1.2.j The resulting DFA
[Figure: state diagram of the DFA D, whose states include {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}, with transitions labeled a and b.]
1.2.k Removing unreachable states
[Figure: state diagram of the DFA D after removing unreachable states; the remaining states include {1}, {3}, {2, 3}, {1, 2, 3}, with transitions labeled a and b.]
1.2.l Testing in work
[Figure: the NFA N and the DFA D side by side, both run on the input string baa.]
1.3.a Regular operations
Union: L1 ∪ L2 = {x | x ∈ L1 or x ∈ L2}
{Good, Bad} ∪ {Boy, Girl} =
{0, 000, …} ∪ {1, 111, …} =
L ∪ ∅ =
Concatenation: L1 ∘ L2 = {xy | x ∈ L1 and y ∈ L2}
{Good, Bad} ∘ {Boy, Girl} =
{0, 000, …} ∘ {1, 111, …} =
L ∘ ∅ =
Star: L* = {x_1 … x_k | k ≥ 0 and each x_i is in L}
{Boy, Girl}* =
{0, 000, …}* =
∅* =
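For finite languages the three operations can be tried out directly with Python sets of strings. This is a minimal sketch with assumed helper names (`union`, `concat`, `star`). Since L* is infinite for any L containing a nonempty string, `star` takes a hypothetical `max_len` bound and returns only the strings of L* up to that length.

```python
def union(L1, L2):
    """L1 ∪ L2: strings that are in L1 or in L2."""
    return L1 | L2


def concat(L1, L2):
    """L1 ∘ L2: every x from L1 followed by every y from L2."""
    return {x + y for x in L1 for y in L2}


def star(L, max_len):
    """The strings of L* whose length is at most max_len."""
    result = {""}          # k = 0 gives the empty string
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more element of L.
        frontier = {x + y for x in frontier for y in L
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result


print(sorted(concat({"Good", "Bad"}, {"Boy", "Girl"})))
# ['BadBoy', 'BadGirl', 'GoodBoy', 'GoodGirl']
print(sorted(star({"ab"}, 4)))
# ['', 'ab', 'abab']
```

Note that `star(set(), n)` returns `{""}` for any bound n ≥ 0, since k = 0 is always allowed in the definition of star.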
1.3.b Regular expressions
We say that R is a regular expression (RE) iff R is one of the following. The language represented by each expression is given after the colon.
1. a, where a is a symbol of the alphabet Σ: {a}
2. ε: {ε}
3. ∅: ∅
4. (R1) ∪ (R2), where R1 and R2 are RE: the union of the languages represented by R1 and R2
5. (R1) ∘ (R2), where R1 and R2 are RE: the concatenation of the languages represented by R1 and R2
6. (R1)*, where R1 is a RE: the star of the language represented by R1
Conventions:
§ The symbol ∘ is often omitted in RE
§ Some parentheses can be omitted. The precedence order for the operators is: * (highest), ∘ (medium), ∪ (lowest)
1.3.c Regular languages
A language is said to be regular iff it can be represented by a regular expression.
Language: {11}   Expression:
Language: {Boy, Girl, Good, Bad}   Expression:
Language: {ε, 0, 00, 000, 0000, …}   Expression:
Language: {ε, 0101, 01010101, …}   Expression:
Language: {x | x = 0^k where k is a multiple of 2 or 3}   Expression:
Language: {x | x is divisible by 8}   Expression:
Language: {x | x MOD 4 = 3}   Expression:
1.3.d Exercising reading regular expressions
Expression: (Good ∪ Bad)(Boy ∪ Girl)   Language:
Expression: (Tom ∪ Bob)_is_(good ∪ bad)   Language:
Expression:   Language: {Name_is_adjective | Name is an uppercase letter followed by zero or more lowercase letters, and adjective is a lowercase letter followed by zero or more lowercase letters}
Expression: 0*10*   Language:
Expression: (0 ∪ 1)*101(0 ∪ 1)*   Language:
Expression: ((0 ∪ 1))*   Language:
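Expressions like these can be experimented with using Python's standard `re` module, writing `|` for ∪ and plain juxtaposition for ∘. This is a sketch for exploration only: the helper name `in_language` is an assumption, and Python patterns also offer features beyond the six rules of the definition, which are deliberately not used here.

```python
import re


def in_language(pattern, string):
    """Return True iff the whole string matches the pattern."""
    # fullmatch requires the entire string to match, which mirrors
    # membership in the language represented by the expression.
    return re.fullmatch(pattern, string) is not None


# 0*10*
print(in_language("0*10*", "00100"))                # True
print(in_language("0*10*", "0110"))                 # False

# (0 ∪ 1)*101(0 ∪ 1)*
print(in_language("(0|1)*101(0|1)*", "0010100"))    # True
print(in_language("(0|1)*101(0|1)*", "0011"))       # False

# (Good ∪ Bad)(Boy ∪ Girl)
print(in_language("(Good|Bad)(Boy|Girl)", "GoodGirl"))  # True
```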
1.3.e Regular languages and DFA-recognizable languages are the same
Theorem 1.54* A language is regular if and only if some NFA (DFA) recognizes it.
Proof: omitted (but given in the textbook). The textbook describes an algorithm for converting any given regular expression to an equivalent NFA, and an algorithm for converting any given NFA to an equivalent regular expression.
1.4.a The limitations of the power of DFAs
The computing power of finite automata is severely limited by the fact that their memory (the set of states) has a fixed size, while inputs can be arbitrarily long. The memories of real computers are also finite, but they are not fixed, in the sense that we assume one can always supply additional memory if needed. To summarize, DFAs are not as powerful as computers can generally be.
The next slide gives several examples of non-regular languages, i.e. languages that no DFA can recognize. The non-regularity of those languages can be rigorously proven using a tool called the pumping lemma. We omit the pumping lemma in this course (but it is in the textbook). Instead, we will simply rely on intuitive arguments.
Warning: Generally one cannot safely rely on intuition when drawing important conclusions, because intuition can be deceptive. Only rigorous mathematical proofs can be trusted.
1.4.b Non-regular languages
Do the following languages look regular to you?
A = {ww | w ∈ {0, 1}*}
A is not regular. Intuitively, this is so because a DFA processing a long input will have forgotten much of the previously seen part of the input by the time it gets to the middle of the string. But without fully remembering the first half of the string, it is impossible to tell whether the second half coincides with it or not.
B = {0^n 1^n | n ≥ 0}
B is not regular. Intuitively, this is so because a DFA processing a long input 0^n 1^n will be unable to remember exactly how many 0s it has seen by the time the 1s start. But without that information it is impossible to tell whether the remaining 1* part of the input has the same length as the already seen 0* part.
C = {w | w contains the same number of “0”s as “1”s}
C is not regular. The intuitive reason is similar to the one for language B.
D = {w | w contains the same number of “01”s as “10”s}
D is regular. Intuitively, it may appear to you that if C is not regular, then “even more so” D should not be. But you have been warned about the deceptiveness of intuition. The following slide shows a DFA that recognizes D, so D is regular!
1.4.c A DFA recognizing D
D = {w | w contains the same number of “01”s as “10”s}
[Figure: state diagram of a DFA recognizing D, with transitions labeled 0 and 1.]
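One way to see why D is regular: in a bit string, occurrences of “01” and “10” alternate, so the two counts are equal exactly when the string is empty or its first and last symbols coincide. The Python sketch below encodes a five-state DFA based on that observation and checks it by brute force against direct counting. The state names and function names are illustrative assumptions, and this machine is not claimed to be identical to the one drawn on the slide.

```python
from itertools import product

# State "x..y" means: the input so far began with x and currently ends with y.
DELTA = {
    ("start", "0"): "0..0", ("start", "1"): "1..1",
    ("0..0", "0"): "0..0",  ("0..0", "1"): "0..1",
    ("0..1", "0"): "0..0",  ("0..1", "1"): "0..1",
    ("1..1", "1"): "1..1",  ("1..1", "0"): "1..0",
    ("1..0", "1"): "1..1",  ("1..0", "0"): "1..0",
}
# Accept the empty string and strings whose first and last symbols agree.
ACCEPT = {"start", "0..0", "1..1"}


def dfa_accepts_D(w):
    """Run the DFA on the bit string w."""
    state = "start"
    for c in w:
        state = DELTA[(state, c)]
    return state in ACCEPT


def same_counts(w):
    """Direct definition of D: equally many '01' and '10' substrings."""
    return w.count("01") == w.count("10")


def agree_up_to(max_len):
    """Compare the DFA with the definition on all bit strings up to max_len."""
    return all(
        dfa_accepts_D("".join(bits)) == same_counts("".join(bits))
        for n in range(max_len + 1)
        for bits in product("01", repeat=n)
    )


print(agree_up_to(10))  # True
```

The brute-force check is evidence rather than a proof, in keeping with the warning about intuition given earlier in the chapter.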