Properties of Regular Languages
Pratit Santiprabhob
First question
Is the set/family of regular languages broad enough to cover all formal languages? In other words, can finite automata be used to define all formal languages? The answer is definitely NO!
Other questions
- What happens when we perform certain operations (e.g. set operations, concatenation) on regular languages?
- Which questions about regular languages can we answer, e.g. is a given language finite or infinite?
- How can we tell whether a given language is regular or not?
By looking at various properties of regular languages, we shall understand what regular languages can and cannot do!
Topics to be discussed
- Closure properties of regular languages
- Standard representations of regular languages
- Identifying nonregular languages via the pumping lemma
Closure properties
Given two regular languages $L_1$ and $L_2$, is the language $L_3$ that results from a given operation on $L_1$ and $L_2$ regular? If it is, then we say that the set/family of regular languages is closed under that operation!
Closure under simple set operations
Theorem 4.1 If $L_1$ and $L_2$ are regular languages, then so are $L_1 \cup L_2$, $L_1 \cap L_2$, $L_1 L_2$, $\overline{L_1}$ and $L_1^*$. We say that the family of regular languages is closed under union, intersection, concatenation, complementation and star-closure.
Proof idea If $L_1$ and $L_2$ are regular languages, then there exist regular expressions $r_1$ and $r_2$ such that $L_1 = L(r_1)$ and $L_2 = L(r_2)$. By definition, $r_1 + r_2$, $r_1 r_2$ and $r_1^*$ are regular expressions denoting the languages $L_1 \cup L_2$, $L_1 L_2$ and $L_1^*$, respectively. Thus, closure under union, concatenation and star-closure follows immediately.
Closure under complementation
For closure under complementation, let $M = (Q, \Sigma, \delta, q_0, F)$ be a DFA that accepts $L_1$. Then the DFA $M' = (Q, \Sigma, \delta, q_0, Q - F)$ accepts $\overline{L_1}$.
Note that $\delta^*$ is assumed to be a total function; hence, $\delta^*(q_0, w)$ is defined for all $w \in \Sigma^*$. Consequently, either $\delta^*(q_0, w) \in F$, and then $w \in L_1$, or $\delta^*(q_0, w) \in Q - F$, and then $w \in \overline{L_1}$.
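The construction can be sketched in a few lines of Python. This is only an illustration: the dictionary encoding of $\delta$, the helper names `accepts` and `complement`, and the example DFA (strings over {a, b} with an even number of a's) are assumptions made for this sketch and do not come from the slides.

```python
def accepts(delta, start, finals, w):
    """Run a DFA whose total transition function is the dict delta on the string w."""
    q = start
    for symbol in w:
        q = delta[(q, symbol)]
    return q in finals

def complement(states, finals):
    """Final states of the complement DFA: Q - F. Everything else stays the same."""
    return states - finals

# Hypothetical example DFA: accepts strings with an even number of a's.
states = {"even", "odd"}
delta = {
    ("even", "a"): "odd", ("even", "b"): "even",
    ("odd", "a"): "even", ("odd", "b"): "odd",
}
finals = {"even"}
co_finals = complement(states, finals)

# Every string is accepted by exactly one of the two automata.
for w in ["", "a", "ab", "aba"]:
    print(repr(w), accepts(delta, "even", finals, w), accepts(delta, "even", co_finals, w))
```

The sketch relies on `delta` being total, exactly as the note on $\delta^*$ requires; with a partial transition function, swapping final and non-final states would not give the complement.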
Closure under intersection (1)
Let $L_1 = L(M_1)$ and $L_2 = L(M_2)$, where $M_1 = (Q, \Sigma, \delta_1, q_0, F_1)$ and $M_2 = (P, \Sigma, \delta_2, p_0, F_2)$ are DFAs.
Construct $M' = (Q', \Sigma, \delta', (q_0, p_0), F')$ from $M_1$ and $M_2$ such that $Q' = Q \times P$ consists of pairs $(q_i, p_j)$, with $\delta'$ defined so that $M'$ is in state $(q_i, p_j)$ whenever $M_1$ is in state $q_i$ and $M_2$ is in state $p_j$; in other words, $\delta'((q_i, p_j), a) = (q_k, p_l)$ whenever $\delta_1(q_i, a) = q_k$ and $\delta_2(p_j, a) = p_l$.
$F'$ is defined as the set of all $(q_i, p_j)$ such that $q_i \in F_1$ and $p_j \in F_2$. Hence, $w \in L_1 \cap L_2$ if and only if it is accepted by $M'$.
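A minimal sketch of this product construction in Python follows. The dictionary encoding of the transition functions, the function names, and the two small example DFAs are assumptions for illustration only.

```python
from itertools import product

def intersect(states1, delta1, start1, finals1,
              states2, delta2, start2, finals2, alphabet):
    """Product construction: returns (states, delta, start, finals) for L1 ∩ L2."""
    states = set(product(states1, states2))
    delta = {
        ((q, p), a): (delta1[(q, a)], delta2[(p, a)])
        for (q, p) in states
        for a in alphabet
    }
    finals = {(q, p) for (q, p) in states if q in finals1 and p in finals2}
    return states, delta, (start1, start2), finals

def accepts(delta, start, finals, w):
    """Run a DFA with total transition dict delta on the string w."""
    q = start
    for symbol in w:
        q = delta[(q, symbol)]
    return q in finals

# Hypothetical M1: even number of a's. Hypothetical M2: string ends in b.
d1 = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
d2 = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}
states, delta, start, finals = intersect({0, 1}, d1, 0, {0}, {0, 1}, d2, 0, {1}, "ab")

print(accepts(delta, start, finals, "aab"))  # even number of a's and ends in b
print(accepts(delta, start, finals, "ab"))   # odd number of a's
```

Changing the definition of `finals` to "q in finals1 or p in finals2" would give a DFA for the union instead, which is one reason the product automaton is a convenient building block.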
Closure under intersection (2)
Alternatively, we can use De Morgan's law together with the fact that the set/family of regular languages is closed under union and complementation!
Let $L_1$ and $L_2$ be regular languages. Then so are $\overline{L_1}$ and $\overline{L_2}$, and therefore so is $\overline{L_1} \cup \overline{L_2}$. Taking the complement of the resulting language, we obtain $L_1 \cap L_2$, which is also a regular language!
Note $L_1 \cap L_2 = \overline{\overline{L_1} \cup \overline{L_2}}$
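Example 4.1
Show that the set/family of regular languages is closed under difference; i.e. if $L_1$ and $L_2$ are regular, then $L_1 - L_2$ is also regular.
Proof $L_1 - L_2 = L_1 \cap \overline{L_2}$. Since $\overline{L_2}$ is regular by closure under complementation, $L_1 \cap \overline{L_2}$ is regular by closure under intersection.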
Closure under reversal
Theorem 4.2 The set/family of regular languages is closed under reversal.
Proof idea Given a regular language $L$, we can always construct an NFA $N$ with a single final state that accepts $L$. Construct an NFA $N'$ whose initial state is the final state of $N$, whose final state is the initial state of $N$, and whose edges are those of the transition graph of $N$ with their directions reversed! Hence, $N'$ accepts $w^R$ if and only if $N$ accepts $w$.
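The edge-reversal idea can be tried out with a small Python sketch. It assumes an NFA without $\lambda$-transitions, encoded as a set of (source, symbol, target) edges with one initial and one final state; the encoding, the function names and the example NFA for $L(ab^*)$ are illustrative assumptions.

```python
def nfa_accepts(edges, start, final, w):
    """Simulate an NFA (no lambda-moves) given as a set of (source, symbol, target) edges."""
    current = {start}
    for symbol in w:
        current = {t for (s, a, t) in edges if s in current and a == symbol}
    return final in current

def reverse_nfa(edges, start, final):
    """Reverse every edge and swap the roles of the initial and final state."""
    return {(t, a, s) for (s, a, t) in edges}, final, start

# Hypothetical NFA for L(ab*): one a, then any number of b's.
edges = {(0, "a", 1), (1, "b", 1)}
rev_edges, rev_start, rev_final = reverse_nfa(edges, 0, 1)

print(nfa_accepts(edges, 0, 1, "abb"))                       # abb is in L
print(nfa_accepts(rev_edges, rev_start, rev_final, "bba"))   # its reversal is in L^R
```

In general, obtaining a single final state may require $\lambda$-transitions, which this simplified simulator does not handle.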
Homomorphism
Definition 4.1 Suppose $\Sigma$ and $\Gamma$ are alphabets. Then a function $h: \Sigma \to \Gamma^*$ is called a homomorphism.
In words, a homomorphism is a substitution in which a single letter is replaced with a string. The domain of the function $h$ is extended to strings in the obvious fashion: if $w = a_1 a_2 \ldots a_n$, then $h(w) = h(a_1) h(a_2) \ldots h(a_n)$.
If $L$ is a language on $\Sigma$, then its homomorphic image is defined as
$h(L) = \{h(w) \mid w \in L\}$
Example 4.2
Let $\Sigma = \{a, b\}$ and $\Gamma = \{a, b, c\}$, and let $h$ be defined by $h(a) = ab$, $h(b) = bbc$. Then $h(aba) = abbbcab$, and the homomorphic image of $L = \{aa, aba\}$ is $h(L) = \{abab, abbbcab\}$.
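Example 4.2 is easy to reproduce in code. The following Python sketch encodes $h$ as a dictionary from letters to strings; the function name `apply_homomorphism` is a hypothetical helper introduced only for this illustration.

```python
def apply_homomorphism(h, w):
    """Extend h from single letters to strings: h(a1 a2 ... an) = h(a1) h(a2) ... h(an)."""
    return "".join(h[symbol] for symbol in w)

h = {"a": "ab", "b": "bbc"}   # the homomorphism of Example 4.2
L = {"aa", "aba"}

image = {apply_homomorphism(h, w) for w in L}
print(apply_homomorphism(h, "aba"))  # abbbcab
print(sorted(image))                 # ['abab', 'abbbcab']
```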
Example 4.3
If $r$ is a regular expression for a language $L$, then a regular expression for $h(L)$ can be obtained by applying the homomorphism to each symbol of $r$.
Let $\Sigma = \{a, b\}$ and $\Gamma = \{b, c, d\}$, and let $h$ be defined by $h(a) = dbcc$, $h(b) = bdc$. If $L$ is the regular language denoted by $r = (a + b^*)(aa)^*$, then $h(L)$ is the regular language denoted by $h(r) = r_1 = (dbcc + (bdc)^*)(dbccdbcc)^*$.
Closure under homomorphism
Theorem 4.3 Let $h$ be a homomorphism. If $L$ is a regular language, then its homomorphic image $h(L)$ is also regular. The set/family of regular languages is therefore closed under homomorphism.
Proof idea Let $L$ be denoted by some regular expression $r$; $h(r)$ is obtained by substituting $h(a)$ for each symbol $a \in \Sigma$ of $r$. By the definition of a regular expression, the result is clearly another regular expression. We then need to show that for every $w \in L$, the corresponding $h(w)$ is in $L(h(r))$, and that for every $v \in L(h(r))$ there is a $w \in L$ such that $v = h(w)$.
Right quotient
Definition 4.2 Let $L_1$ and $L_2$ be languages on the same alphabet. Then the right quotient of $L_1$ with $L_2$ is defined as
$L_1/L_2 = \{x \mid xy \in L_1 \text{ for some } y \in L_2\}$
To get the right quotient of $L_1$ with $L_2$, we take every string in $L_1$ that has a suffix belonging to $L_2$ and remove that suffix.
Example 4.4
Let $L_1 = \{a^n b^m \mid n \ge 1, m \ge 0\} \cup \{ba\}$ and $L_2 = \{b^m \mid m \ge 1\}$; then $L_1/L_2 = \{a^n b^m \mid n \ge 1, m \ge 0\}$.
Note that strings in $L_2$ consist of one or more b's; hence, the strings in $L_1/L_2$ are obtained from strings of $L_1$ that end with at least one b by removing one or more trailing b's!
In terms of DFAs, we want to locate each state $q$ of $M_1$ (accepting $L_1$) that has a walk labeled $v$ to a final state, where $v \in L_2$. Note that any $x$ such that $\delta^*(q_0, x) = q$ is then a member of $L_1/L_2$. We can mark each such state $q$ as a final state of a new DFA accepting $L_1/L_2$.
Example 4.4 (continued)
[Figure: DFAs for Example 4.4]
Closure under right quotient (1)
Theorem 4.4 If $L_1$ and $L_2$ are regular languages, then $L_1/L_2$ is also regular. The set/family of regular languages is closed under right quotient with a regular language.
Proof idea Let $L_1 = L(M)$, where $M = (Q, \Sigma, \delta, q_0, F)$ is a DFA. Construct another DFA $M' = (Q, \Sigma, \delta, q_0, F')$: for each $q_i \in Q$, determine whether there exists a $y \in L_2$ such that $\delta^*(q_i, y) = q_f \in F$; if so, add $q_i$ to $F'$.
To decide this for each $q_i$, we can create $M_i = (Q, \Sigma, \delta, q_i, F)$ from $M$ by making $q_i$ the initial state, and then check whether $L(M_i) \cap L_2$ is not empty!
Closure under right quotient (2)
We then need to prove that $L(M') = L_1/L_2$. Let $x$ be an element of $L_1/L_2$; then there must be a $y \in L_2$ such that $xy \in L_1$, which implies $\delta^*(q_0, xy) \in F$. So there must be some $q \in Q$ such that $\delta^*(q_0, x) = q$ and $\delta^*(q, y) \in F$. By the aforementioned construction, $q \in F'$, and $M'$ accepts $x$ since $\delta^*(q_0, x) \in F'$.
Closure under right quotient (3)
Conversely, for any $x$ accepted by $M'$, we have $\delta^*(q_0, x) = q \in F'$. Again, by the construction, this implies that there exists a $y \in L_2$ such that $\delta^*(q, y) \in F$. Therefore, $xy \in L_1$ and $x \in L_1/L_2$; hence, $L(M') = L_1/L_2$. Consequently, $L_1/L_2$ is regular.
Example 4.5 (1)
Find $L_1/L_2$ for $L_1 = L(a^* b a a^*)$, $L_2 = L(a b^*)$.
First, construct a DFA that accepts $L_1$.
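Example 4.5 (2)
With $M_i$ the DFA for $L_1$ started in state $q_i$ (assuming $q_0$ is the initial state, $q_1$ is reached after reading the $b$, $q_2$ is the final state and $q_3$ is the trap state):
$L(M_0) \cap L_2 = \emptyset$
$L(M_1) \cap L_2 = \{a\} \neq \emptyset$
$L(M_2) \cap L_2 = \{a\} \neq \emptyset$
$L(M_3) \cap L_2 = \emptyset$
$L(M') = L(a^* b + a^* b a a^*) = L(a^* b a^*) = L_1/L_2$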
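The construction of Theorem 4.4 can be checked mechanically on Example 4.5. The Python sketch below is only illustrative: the state numbering of the DFA for $L_1$ (0 initial, 2 final, 3 trap), the DFA for $L_2$, and the helper names are assumptions, since the original transition graphs are not reproduced here. For each state $q$ it searches the product of the two automata for a string $y \in L_2$ leading from $q$ to a final state.

```python
from collections import deque

ALPHABET = "ab"

# Assumed DFA M for L1 = L(a*baa*); state 3 is a trap state, state 2 is final.
D1 = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 2, (1, "b"): 3,
      (2, "a"): 2, (2, "b"): 3, (3, "a"): 3, (3, "b"): 3}
F1 = {2}

# Assumed DFA for L2 = L(ab*); "t" is a trap state, "f" is final.
D2 = {("s", "a"): "f", ("s", "b"): "t", ("f", "a"): "t", ("f", "b"): "f",
      ("t", "a"): "t", ("t", "b"): "t"}
START2, F2 = "s", {"f"}

def meets_l2(q):
    """True if some y in L2 drives M from state q into F1, i.e. L(M_q) ∩ L2 is nonempty."""
    seen = {(q, START2)}
    queue = deque(seen)
    while queue:
        s, p = queue.popleft()
        if s in F1 and p in F2:
            return True
        for a in ALPHABET:
            nxt = (D1[(s, a)], D2[(p, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def accepts(delta, start, finals, w):
    """Run a DFA with total transition dict delta on the string w."""
    q = start
    for symbol in w:
        q = delta[(q, symbol)]
    return q in finals

# F' of the quotient automaton M': same states and transitions as M, new final states.
quotient_finals = {q for q in range(4) if meets_l2(q)}
print(quotient_finals)  # {1, 2}
```

With the new final states {1, 2}, the automaton accepts exactly the strings containing a single b, which agrees with $L(a^* b a^*)$.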
Standard representations of regular languages
A regular language is given in a standard representation if and only if it is described by
- a finite automaton,
- a regular expression, or
- a regular grammar.
Only when a language is described in a standard representation is it sufficiently well-defined for mathematical manipulation such as in theorems.
Membership test
Theorem 4.5 Given a standard representation of any regular language $L$ on $\Sigma$ and any $w \in \Sigma^*$, there exists an algorithm for determining whether or not $w$ is in $L$.
Proof idea First represent the language by a DFA, then test $w$ to see if it is accepted by the DFA.
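When the language is given as a regular expression, a quick way to experiment with membership is Python's standard `re` module. This is a stand-in for the DFA-based algorithm of the proof idea, not the same algorithm: `re` uses a backtracking matcher and its pattern syntax writes union as `|` rather than `+`. The helper name `member` is hypothetical.

```python
import re

def member(pattern, w):
    """True iff the whole string w matches the regular expression pattern."""
    return re.fullmatch(pattern, w) is not None

# Pattern for L(a*baa*), the language L1 of Example 4.5.
print(member(r"a*baa*", "aaba"))  # True
print(member(r"a*baa*", "ab"))    # False: an a is required after the b
```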
Testing whether a language is empty, finite, or infinite
Theorem 4.6 There exists an algorithm for determining whether a regular language given in a standard representation is empty, finite, or infinite.
Proof idea Represent the language as a DFA and examine its transition graph.
- If there is a simple path from the initial vertex to any final vertex, the language is not empty.
- Find all the vertices that are the base of some cycle. If any of these is on a path from the initial vertex to a final vertex, the language is infinite; otherwise it is finite.
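The graph test in the proof idea might be implemented along the following lines. This Python sketch is one possible realization under stated assumptions: the DFA is a dictionary from (state, symbol) to state, and "useful" states are those lying on some path from the initial state to a final state. The language is infinite exactly when the useful states contain a cycle. The function name `classify` and the example automata are hypothetical.

```python
def classify(delta, start, finals):
    """Return 'empty', 'finite' or 'infinite' for the language of a DFA."""
    succ = {start: set()}
    for (q, _symbol), t in delta.items():
        succ.setdefault(q, set()).add(t)
        succ.setdefault(t, set())
    pred = {}
    for q, targets in succ.items():
        for t in targets:
            pred.setdefault(t, set()).add(q)

    def reachable(sources, edges):
        seen, stack = set(sources), list(sources)
        while stack:
            q = stack.pop()
            for t in edges.get(q, ()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    # States on some path from the initial state to a final state.
    useful = reachable({start}, succ) & reachable(set(finals), pred)
    if not useful:
        return "empty"
    # Peel off states with no useful successor; whatever survives lies on
    # or leads into a cycle of useful states.
    remaining = set(useful)
    changed = True
    while changed:
        changed = False
        for q in list(remaining):
            if not (succ[q] & remaining):
                remaining.discard(q)
                changed = True
    return "infinite" if remaining else "finite"

# Hypothetical DFA for L(a*baa*) (infinite) and for the finite language {ab}.
D_INF = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 2, (1, "b"): 3,
         (2, "a"): 2, (2, "b"): 3, (3, "a"): 3, (3, "b"): 3}
D_FIN = {(0, "a"): 1, (0, "b"): 3, (1, "a"): 3, (1, "b"): 2,
         (2, "a"): 3, (2, "b"): 3, (3, "a"): 3, (3, "b"): 3}

print(classify(D_INF, 0, {2}), classify(D_FIN, 0, {2}), classify(D_FIN, 0, set()))
```

Note that the trap state 3 has a self-loop in both automata but is ignored, because no final state can be reached from it.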
Testing for equality
Theorem 4.7 Given standard representations of two regular languages $L_1$ and $L_2$, there exists an algorithm to determine whether or not $L_1 = L_2$.
Proof idea Using $L_1$ and $L_2$, define $L_3$ as
$L_3 = (L_1 \cap \overline{L_2}) \cup (\overline{L_1} \cap L_2)$
By the closure properties, $L_3$ is regular and there is a DFA $M$ that accepts $L_3$; then Theorem 4.6 can be used to test whether $L_3$ is empty or not. Note that $L_3 = \emptyset$ if and only if $L_1 = L_2$.
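A compact way to run this test is to explore the product of the two DFAs and look for a reachable pair of states on which exactly one automaton accepts; such a pair witnesses a string in $L_3$. The Python sketch below follows that idea rather than building $L_3$ explicitly; the encoding, the function name `equivalent` and the example automata are assumptions for illustration.

```python
from collections import deque

def equivalent(d1, s1, f1, d2, s2, f2, alphabet):
    """True iff two DFAs (total transition dicts) accept the same language."""
    seen = {(s1, s2)}
    queue = deque([(s1, s2)])
    while queue:
        q, p = queue.popleft()
        if (q in f1) != (p in f2):
            return False  # some string is in exactly one language: L3 is nonempty
        for a in alphabet:
            nxt = (d1[(q, a)], d2[(p, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# Hypothetical DFAs: A counts a's modulo 2, B counts a's modulo 4 (b leaves the state unchanged).
dA = {(q, c): (q + (c == "a")) % 2 for q in range(2) for c in "ab"}
dB = {(q, c): (q + (c == "a")) % 4 for q in range(4) for c in "ab"}

print(equivalent(dA, 0, {0}, dB, 0, {0, 2}, "ab"))  # both accept "even number of a's"
print(equivalent(dA, 0, {0}, dB, 0, {0}, "ab"))     # B now requires a multiple of 4
```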
Identifying nonregular languages
- Regular languages can be infinite
- Finite automata have finite memory
- This imposes some restrictions on the structure of a regular language!
- Hence, a language is regular only if, in processing any string, the information that has to be remembered at any stage is strictly limited
Pigeonhole principle
The pigeonhole principle is a simple mathematical observation: when putting $n$ objects into $m$ boxes (pigeonholes), if $n > m$, then at least one box must have more than one object in it.
Example 4.6 (1)
Is the language $L = \{a^n b^n \mid n \ge 0\}$ regular? No; we can prove this by contradiction.
Assume that $L$ is regular; then some DFA $M = (Q, \{a, b\}, \delta, q_0, F)$ exists for $L$.
Look at $\delta^*(q_0, a^i)$ for $i = 1, 2, 3, \ldots$; since there are infinitely many values of $i$ but only a finite number of states in $M$, by the pigeonhole principle there must be some state $q$ such that $\delta^*(q_0, a^n) = q$ and $\delta^*(q_0, a^m) = q$, where $n \neq m$.
Example 4.6 (2)
But since $M$ accepts $a^n b^n$, we must have $\delta^*(q, b^n) = q_f \in F$.
From this we can conclude that $\delta^*(q_0, a^m b^n) = \delta^*(\delta^*(q_0, a^m), b^n) = \delta^*(q, b^n) = q_f \in F$.
This, however, contradicts our original assumption that $M$ accepts $a^m b^n$ only if $n = m$; hence, $L$ cannot be regular!
Note that an automaton with a finite number of internal states cannot differentiate between all prefixes $a^n$ and $a^m$!
Pumping lemma for regular languages (1)
Theorem 4.8 Let $L$ be an infinite regular language. Then there exists some positive integer $m$ such that any $w \in L$ with $|w| \ge m$ can be decomposed as $w = xyz$ with $|xy| \le m$ and $|y| \ge 1$ such that $w_i = x y^i z$ is also in $L$ for all $i = 0, 1, 2, \ldots$
In other words, every sufficiently long string in $L$ can be broken into three parts in such a way that an arbitrary number of repetitions of the middle part yields another string in $L$; the middle part is said to be pumped!
Pumping lemma for regular languages (2)
Proof idea If $L$ is regular, there must be a DFA that accepts/recognizes it. Let such a DFA have states labeled $q_0, q_1, q_2, \ldots, q_n$.
Take a string $w \in L$ such that $|w| \ge m = n + 1$. Consider the sequence of states that the DFA goes through when processing $w$: $q_0, q_i, q_j, \ldots, q_f$.
The sequence has exactly $|w| + 1$ entries; at least one state must be repeated, and the repetition must start no later than the $n$th move. The sequence must look like $q_0, q_i, q_j, \ldots, q_r, \ldots, q_r, \ldots, q_f$.
Pumping lemma for regular languages (3)
There must be some substrings $x$, $y$, $z$ of $w$ such that $\delta^*(q_0, x) = q_r$, $\delta^*(q_r, y) = q_r$, $\delta^*(q_r, z) = q_f$, with $|xy| \le n + 1 = m$ and $|y| \ge 1$.
It then immediately follows that $\delta^*(q_0, xz) = q_f$, $\delta^*(q_0, x y^2 z) = q_f$, $\delta^*(q_0, x y^3 z) = q_f$, and so on.
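The proof idea can be made concrete: run the DFA on a long accepted string, stop at the first repeated state, and cut the string there. The Python sketch below does this for an assumed DFA for $L(a^* b a a^*)$; the state numbering and the helper names are illustrative only.

```python
def accepts(delta, start, finals, w):
    """Run a DFA with total transition dict delta on the string w."""
    q = start
    for symbol in w:
        q = delta[(q, symbol)]
    return q in finals

def pump_decomposition(delta, start, w):
    """Split w into (x, y, z) at the first repeated state on the DFA's run."""
    first_seen = {start: 0}   # state -> number of symbols read when first visited
    q = start
    for i, symbol in enumerate(w, start=1):
        q = delta[(q, symbol)]
        if q in first_seen:
            j = first_seen[q]
            return w[:j], w[j:i], w[i:]
        first_seen[q] = i
    return None  # no state repeats: w is shorter than the number of states

# Assumed DFA for L(a*baa*): 4 states, state 2 final, state 3 a trap.
D = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 2, (1, "b"): 3,
     (2, "a"): 2, (2, "b"): 3, (3, "a"): 3, (3, "b"): 3}

x, y, z = pump_decomposition(D, 0, "baaa")
print((x, y, z))  # ('ba', 'a', 'a')
print(all(accepts(D, 0, {2}, x + y * i + z) for i in range(5)))  # True
```

Because the loop on $y$ returns to the same state $q_r$, every pumped string $x y^i z$ ends in the same state as $w$, which is what the last line checks for a few values of $i$.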
Use of pumping lemma
- The pumping lemma is meant to be used for proving that a given language is nonregular by means of contradiction!
- The lemma is not meant for, and cannot be used for, proving that a given language is regular.
- Note that the pumping lemma is also vacuously true for finite regular languages.
Example 4.7
Use the pumping lemma to show that $L = \{a^n b^n \mid n \ge 0\}$ is not regular.
Assume that $L$ is regular; then the pumping lemma must hold.
Pick the string $w = a^m b^m$, i.e. take $n = m$; since $|xy| \le m$, the substring $y$ must consist of only a's.
Suppose $|y| = k \ge 1$. With $i = 0$, we have $w_0 = a^{m-k} b^m$, which is not in $L$; this contradicts the pumping lemma.
Therefore, $L$ is not regular!
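For small values of $m$, the argument can be sanity-checked by brute force: enumerate every admissible split of $w = a^m b^m$ and confirm that pumping down leaves $L$. The Python sketch below does this; it is an illustration for specific $m$, not a substitute for the proof, and the function names are hypothetical.

```python
def in_L(s):
    """Membership in L = { a^n b^n : n >= 0 }."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

def every_split_fails(m):
    """For w = a^m b^m, check that every split w = xyz with |xy| <= m and
    |y| >= 1 gives a pumped-down string xz (i = 0) that is not in L."""
    w = "a" * m + "b" * m
    for xy_len in range(1, m + 1):
        for y_len in range(1, xy_len + 1):
            x = w[:xy_len - y_len]
            z = w[xy_len:]
            if in_L(x + z):
                return False
    return True

print(all(every_split_fails(m) for m in range(1, 8)))  # True
```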
Notes on the use of pumping lemma
- We know $m$ exists, but don't know its value
- We also know that $w$ can be decomposed into $xyz$, but don't know where the parts are
- Use of specific values of $m$ or $xyz$ is no good!
- The key is the choice of the string $w$ that we start our proof with!
Example 4.8
Show that $L = \{w w^R \mid w \in \Sigma^*\}$ is not regular.
Choose a string in $L$ that begins with at least $m$ a's, for instance $a^m b^m b^m a^m$.
[Figure: the chosen string and its decomposition into $x$, $y$, $z$]
The substring $y$ shall consist of a's only; hence, pumping with $i = 0$ will cause the number of a's on the left to be fewer than that on the right! $w_0$ is then not in $L$.
Example 4.9
Let $\Sigma = \{a, b\}$; show that $L = \{w \in \Sigma^* \mid n_a(w) < n_b(w)\}$ is not regular.
Pick $w = a^m b^{m+1}$, forcing $y$ to be of the form $y = a^k$, $1 \le k \le m$.
Then, pumping up with $i = 2$, we get $w_2 = a^{m+k} b^{m+1}$, which is not in $L$.
Example 4.10
Show that the language $L = \{(ab)^n a^k \mid n > k, k \ge 0\}$ is not regular.
Pick $w = (ab)^{m+1} a^m$.
- If $y = a$ or $y = b$, pump with $i = 0$; the resulting string $w_0$ is not of the form $(ab)^n a^k$, so it is not in $L$.
- If $y = ab$, pump with $i = 0$; the resulting string is $w_0 = (ab)^m a^m \notin L$.
- If $y = (ab)^l$ for some $1 \le l \le m$, the resulting string is $w_0 = (ab)^{m+1-l} a^m \notin L$.
- If $y$ is anything else, the resulting string $w_0$ is either again $(ab)^{m+1-l} a^m$ (when $y = (ba)^l$) or not of the form $(ab)^n a^k$; in both cases it is not in $L$.
Example 4.11
Show that the language $L = \{a^n \mid n \text{ is a perfect square}\}$ is not regular.
Pick $w = a^{m^2}$, making $y$ of the form $y = a^k$, $1 \le k \le m$.
Pump with $i = 0$: $w_0 = a^{m^2 - k}$.
Note that $(m - 1)^2 < m^2 - k < m^2$ (for $m > 1$); therefore, $m^2 - k$ is not a perfect square and $w_0$ cannot be in $L$.
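The bound on $m^2 - k$ in Example 4.11 can be spelled out step by step. The derivation below assumes $m > 1$, which is harmless since the pumping lemma also holds for any larger $m$, and uses only $1 \le k \le m$.

```latex
\begin{align*}
(m-1)^2 &= m^2 - 2m + 1 \\
        &< m^2 - m      && \text{since } m > 1 \\
        &\le m^2 - k    && \text{since } k \le m \\
        &< m^2          && \text{since } k \ge 1
\end{align*}
```

So $m^2 - k$ lies strictly between two consecutive perfect squares and cannot itself be one.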
Example 4.12
Show that the language $L = \{a^n b^k c^{n+k} \mid n \ge 0, k \ge 0\}$ is not regular.
Applying the pumping lemma directly will prove this; alternatively, closure under homomorphism can be utilized.
Take $h(a) = a$, $h(b) = a$, $h(c) = c$; then $h(L) = \{a^{n+k} c^{n+k} \mid n \ge 0, k \ge 0\} = \{a^i c^i \mid i \ge 0\}$, which is known to be not regular. If $L$ were regular, $h(L)$ would be regular as well; hence, $L$ is not regular either.
Example 4.13 (1)
Show that the language $L = \{a^n b^l \mid n \neq l\}$ is not regular.
We need to pick $w = a^{m!} b^{(m+1)!}$, making $y = a^k$, $1 \le k \le m$.
Then we need to pick the number of repetitions $i$ such that $m! + (i - 1)k = (m + 1)!$
Hence, $i = 1 + m \cdot m!/k$.
Since $k \le m$, $k$ divides $m!$ and $i$ is always an integer; we get $w_i = a^{(m+1)!} b^{(m+1)!} \notin L$.
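The arithmetic behind the choice of $i$ can be verified numerically for small $m$. The short Python sketch below checks that $i = 1 + m \cdot m!/k$ is an integer and balances the two exponents; the function name `pump_count` is a hypothetical helper.

```python
from math import factorial

def pump_count(m, k):
    """The i with m! + (i - 1) * k == (m + 1)!, for 1 <= k <= m."""
    assert 1 <= k <= m
    quotient, remainder = divmod(m * factorial(m), k)
    assert remainder == 0  # k <= m, so k divides m!
    return 1 + quotient

# Check every admissible k for a few small values of m.
for m in range(1, 7):
    for k in range(1, m + 1):
        i = pump_count(m, k)
        assert factorial(m) + (i - 1) * k == factorial(m + 1)

print(pump_count(3, 2))  # 10, since 3! + 9 * 2 = 24 = 4!
```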
Example 4.13 (2)
Alternatively, try picking $w = a^m b^{m+m!}$ and pumping with $i = m!/k + 1$.
As yet another alternative, we can use closure properties to help.
Assume that $L$ is regular, and create $L_1$ such that $L_1 = \overline{L} \cap L(a^* b^*)$. By closure under complementation and intersection, $L_1$ would also be regular.
However, observe that $L_1 = \{a^n b^n \mid n \ge 0\}$, which is known to be not regular! Therefore, $L$ cannot be regular.
Assignments to be turned in on July 2, 2007
- Determine whether or not the following language is regular:
$L = \{a^n b^n \mid n \ge 1\} \cup \{a^n b^m \mid n \ge 1, m \ge 1\}$
Prove your answer. (Section 4.3, exercise 6 (a), page 122)