Regular Expressions. CS 154, Omer Reingold
Regular Expressions: Computation as simple, logical description. A totally different way of thinking about computation: what is the complexity of describing the strings in the language?
Inductive Definition of Regexp: Let Σ be an alphabet. We define the regular expressions over Σ inductively: For all σ ∊ Σ, σ is a regexp. ε is a regexp. ∅ is a regexp. If R1 and R2 are both regexps, then (R1R2), (R1 + R2), and (R1)* are regexps.
Precedence Order: * then · then +. Example: R1*R2 + R3 = ((R1*)·R2) + R3
Definition: Regexps Represent Languages. The regexp σ ∊ Σ represents the language {σ}. The regexp ε represents {ε}. The regexp ∅ represents ∅. If R1 and R2 are regular expressions representing L1 and L2, then: (R1R2) represents L1·L2, (R1 + R2) represents L1 ∪ L2, and (R1)* represents L1*.
Regexps Represent Languages: For every regexp R, define L(R) to be the language that R represents. A string w ∊ Σ* is accepted by R (or, w matches R) if w ∊ L(R). Examples: 0, 010, and 01010 match (01)*0; 110101110100100 matches (0+1)*0.
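The matching examples on this slide can be checked mechanically. The sketch below is not part of the lecture: it assumes the slide notation can be translated to Python's `re` syntax by rewriting `+` (union) as `|` and `ε` as an empty group, and the helper names `to_python` and `matches` are hypothetical.

```python
import re

def to_python(regexp: str) -> str:
    """Translate slide notation into Python re syntax (an assumed mapping).

    '+' is union on the slides, which Python writes as '|';
    'ε' becomes an empty non-capturing group.
    """
    return regexp.replace(" ", "").replace("+", "|").replace("ε", "(?:)")

def matches(w: str, regexp: str) -> bool:
    """True if the WHOLE string w is in L(regexp), i.e. w matches the regexp."""
    return re.fullmatch(to_python(regexp), w) is not None

if __name__ == "__main__":
    for w in ["0", "010", "01010", "01"]:
        print(repr(w), matches(w, "(01)*0"))
    print(matches("110101110100100", "(0+1)*0"))
```

`re.fullmatch` is used rather than `re.search` because the slide's notion of "w matches R" requires the entire string to be in L(R), not just some substring.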
Assume Σ = {0, 1}. { w | w has exactly a single 1 }: 0*10*. { w | w contains 001 }: (0+1)*001(0+1)*
Assume Σ = {0, 1}. What language does the regexp ∅* represent? {ε}
Assume Σ = {0, 1}. { w | w has length ≥ 3 and its 3rd symbol is 0 }: (0+1)(0+1)0(0+1)*
Assume Σ = {0, 1}. { w | every odd position in w is a 1 }: (1(0 + 1))*(1 + ε)
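A quick way to gain confidence in a regexp like this one is to compare it against the set description on all short strings. The following is a minimal brute-force sketch, assuming positions are counted from 1 and using Python's `re` with `|` for union and an empty alternative for ε; the names `odd_positions_are_one` and `all_strings` are illustrative only. It is a sanity check up to a length bound, not a proof.

```python
import re
from itertools import product

# (1(0 + 1))*(1 + ε) in Python syntax: '|' is union, the empty alternative is ε.
PATTERN = re.compile(r"(1(0|1))*(1|)")

def odd_positions_are_one(w: str) -> bool:
    """Positions 1, 3, 5, ... (1-based) are indices 0, 2, 4, ... in Python."""
    return all(c == "1" for c in w[0::2])

def all_strings(max_len: int):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

# Strings on which the regexp and the set description disagree.
mismatches = [
    w for w in all_strings(10)
    if (PATTERN.fullmatch(w) is not None) != odd_positions_are_one(w)
]

if __name__ == "__main__":
    print("mismatches:", mismatches)
```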
Assume Σ = {0, 1}. { w | w has equal number of occurrences of 01 and 10 } = { w | w = 1, w = 0, or w = ε, or w starts with a 0 and ends with a 0, or w starts with a 1 and ends with a 1 }. Claim: A string w has equal occurrences of 01 and 10 ⇔ w starts and ends with the same bit. Regexp: 1 + 0 + ε + 0(0+1)*0 + 1(0+1)*1
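The claim on this slide is easy to test exhaustively on short strings. The sketch below assumes occurrences are counted with overlap allowed (so 010 contains one 01 and one 10) and treats ε as trivially satisfying "starts and ends with the same bit"; the function names are hypothetical. It checks the claim and the regexp only up to a length bound.

```python
import re
from itertools import product

# 1 + 0 + ε + 0(0+1)*0 + 1(0+1)*1 in Python syntax (empty alternative = ε).
PATTERN = re.compile(r"1|0||0(0|1)*0|1(0|1)*1")

def occurrences(w: str, pat: str) -> int:
    """Count (possibly overlapping) occurrences of the 2-symbol string pat in w."""
    return sum(1 for i in range(len(w) - 1) if w[i:i + 2] == pat)

def equal_01_10(w: str) -> bool:
    return occurrences(w, "01") == occurrences(w, "10")

def same_first_and_last(w: str) -> bool:
    return w == "" or w[0] == w[-1]

def all_strings(max_len: int):
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

# Strings where the three descriptions of the language do not all agree.
counterexamples = [
    w for w in all_strings(10)
    if not (equal_01_10(w) == same_first_and_last(w)
            == (PATTERN.fullmatch(w) is not None))
]

if __name__ == "__main__":
    print("counterexamples:", counterexamples)
```

The intuition the check supports: each 01 is a switch from 0 to 1 and each 10 a switch back, so the two counts differ by at most one and are equal exactly when the string ends on the bit it started with.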
Theorem: L can be represented by some regexp ⇔ L is regular
First direction: L can be represented by some regexp ⇒ L is regular. Given any regexp R, we will construct an NFA N s.t. N accepts exactly the strings accepted by R. Proof by induction on the length of the regexp R. Base Cases (R has length 1): R = σ, R = ε, or R = ∅; each is accepted by a small NFA.
Induction Step: Suppose every regexp of length < k represents some regular language. Consider a regexp R of length k > 1. Three possibilities for R: R = R1 + R2, R = R1R2, or R = (R1)*. By induction, R1 and R2 represent some regular languages, L1 and L2.
Case R = R1 + R2: L(R) = L(R1 + R2) = L1 ∪ L2, so L(R) is regular by the union theorem.
Case R = R1R2: L(R) = L(R1·R2) = L1·L2, so L(R) is regular by the concatenation theorem.
Case R = (R1)*: L(R) = L(R1*) = L1*, so L(R) is regular by the star theorem.
Therefore: If L is represented by a regexp, then L is regular.
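The induction above translates directly into a recursive program: each base case and each of the three operators gets its own NFA gadget. The sketch below uses a Thompson-style ε-NFA construction, which is one standard way to realize the union, concatenation, and star theorems and is not necessarily the construction used in the lecture. The tuple-based regexp encoding and all names (`NFA`, `build`, `accepts`) are assumptions made for illustration.

```python
from itertools import count

# Regexps as nested tuples (an assumed encoding):
#   ("sym", "a"), ("eps",), ("empty",), ("union", r1, r2), ("cat", r1, r2), ("star", r1)

class NFA:
    def __init__(self):
        self.edges = {}          # (state, label) -> set of states; label None is an ε-move
        self._fresh = count()

    def new_state(self):
        return next(self._fresh)

    def add(self, p, label, q):
        self.edges.setdefault((p, label), set()).add(q)

def build(r, nfa):
    """Add a fragment for r to nfa and return its (start, accept) states."""
    s, t = nfa.new_state(), nfa.new_state()
    kind = r[0]
    if kind == "sym":            # base case R = σ
        nfa.add(s, r[1], t)
    elif kind == "eps":          # base case R = ε
        nfa.add(s, None, t)
    elif kind == "empty":        # base case R = ∅: no path from s to t
        pass
    elif kind == "union":        # branch into either sub-fragment
        for sub in (r[1], r[2]):
            a, b = build(sub, nfa)
            nfa.add(s, None, a)
            nfa.add(b, None, t)
    elif kind == "cat":          # run the first fragment, then the second
        a1, b1 = build(r[1], nfa)
        a2, b2 = build(r[2], nfa)
        nfa.add(s, None, a1)
        nfa.add(b1, None, a2)
        nfa.add(b2, None, t)
    elif kind == "star":         # zero or more passes through the sub-fragment
        a, b = build(r[1], nfa)
        nfa.add(s, None, t)
        nfa.add(s, None, a)
        nfa.add(b, None, a)
        nfa.add(b, None, t)
    else:
        raise ValueError(f"unknown regexp node: {kind}")
    return s, t

def eps_closure(nfa, states):
    """All states reachable from `states` using only ε-moves."""
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for q in nfa.edges.get((p, None), ()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen

def accepts(r, w):
    """Build the NFA for r and simulate it on w by tracking the set of live states."""
    nfa = NFA()
    start, accept = build(r, nfa)
    current = eps_closure(nfa, {start})
    for c in w:
        moved = set()
        for p in current:
            moved |= nfa.edges.get((p, c), set())
        current = eps_closure(nfa, moved)
    return accept in current

# (1(0 + 1))* written in the tuple encoding
R = ("star", ("cat", ("sym", "1"), ("union", ("sym", "0"), ("sym", "1"))))
```

For instance, `accepts(R, "1011")` asks whether 1011 is in L((1(0 + 1))*). The recursion in `build` mirrors the proof: the base cases handle regexps of length 1, and each operator case assumes fragments for the shorter sub-expressions already exist.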
Give an NFA that accepts the language represented by (1(0 + 1))*. [NFA diagram; arcs labeled ε, 1, "0, 1", and ε.] Regular expression: (1(0+1))*
Generalized NFAs (GNFA): L can be represented by a regexp ⇐ L is a regular language. Idea: Transform an NFA for L into a regular expression by removing states and re-labeling the arcs with regular expressions. Rather than reading in just letters from the string on a step, we can read in entire substrings.
Generalized NFA (GNFA): Is aaabcbcba accepted or rejected? Is bcba accepted or rejected? This GNFA recognizes L(a*b(cb)*a).
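NFA: Add unique start and accept states.
NFA: While the machine has more than 2 states: Pick an internal state, rip it out and re-label the arrows with regexps, to account for paths through the missing state. [Diagram: an arc labeled 0 into a state with a self-loop labeled 1, followed by an arc labeled 0, is replaced by a single arc labeled 01*0.]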
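GNFA: While the machine has more than 2 states, rip out an internal state. In general: when q2 is removed, where the arcs q1 → q2, q2 → q2, q2 → q3, and q1 → q3 are labeled R(q1, q2), R(q2, q2), R(q2, q3), and R(q1, q3), the arc from q1 to q3 is re-labeled R(q1, q2)R(q2, q2)*R(q2, q3) + R(q1, q3).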
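The re-labeling rule can be turned into a short state-elimination routine. The sketch below is an illustration under stated assumptions, not the lecture's own code: a GNFA is a dictionary from (source, target) pairs to regexp strings in slide notation, a missing entry stands for ∅, and the names `rip`, `eliminate`, and `to_python` are hypothetical. The example machine is also an assumption: a two-state DFA for "even number of 1s" with a new start state S and accept state A attached by ε-arcs.

```python
import re
from itertools import product

def rip(labels, states, q):
    """Remove state q and re-label arcs to account for paths through it.

    For every remaining pair (p, r) the new label is
    R(p, q) R(q, q)* R(q, r) + R(p, r), dropping parts that are ∅ (missing).
    """
    rest = [s for s in states if s != q]
    loop = labels.get((q, q))
    star = f"({loop})*" if loop is not None else ""
    new = {}
    for p in rest:
        for r in rest:
            into, out = labels.get((p, q)), labels.get((q, r))
            direct = labels.get((p, r))
            via = f"({into}){star}({out})" if into is not None and out is not None else None
            if via is not None and direct is not None:
                new[(p, r)] = f"{via}+{direct}"
            elif via is not None:
                new[(p, r)] = via
            elif direct is not None:
                new[(p, r)] = direct
    return rest, new

def eliminate(labels, states, start, accept):
    """Rip out internal states until only start and accept remain."""
    for q in [s for s in states if s not in (start, accept)]:
        states, labels = rip(labels, states, q)
    return labels.get((start, accept))   # None would mean the language is ∅

def to_python(regexp):
    """Slide notation -> Python re syntax: '+' becomes '|', 'ε' becomes empty."""
    return regexp.replace("+", "|").replace("ε", "")

# Assumed example: DFA with states e (even, start and accepting) and o (odd),
# plus a unique start state S and a unique accept state A.
labels = {
    ("S", "e"): "ε",
    ("e", "e"): "0", ("e", "o"): "1",
    ("o", "o"): "0", ("o", "e"): "1",
    ("e", "A"): "ε",
}
regexp = eliminate(labels, ["S", "e", "o", "A"], "S", "A")

if __name__ == "__main__":
    print(regexp)
```

The resulting expression is heavily parenthesized because every label is wrapped before being concatenated or starred; that keeps the precedence of `+` correct at the cost of readability. Different elimination orders produce different but equivalent regexps.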
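Example: [Diagram: a GNFA N with states q0, q1, q2, q3 and arcs labeled ε, a, b, and "a, b"; ripping out internal states produces the intermediate label a*b.] R(q0, q3) = (a*b)(a+b)* represents L(N).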
Summary: DFAs, NFAs, and Regular Expressions all describe exactly the Regular Languages (for DFAs, by DEFINITION).
Parting thought: Regular Languages can be defined by their closure properties