CS 1110 Nate Brunelle Today: Regular Expressions
Questions?
String.find()
• Takes a string as an argument and, if exactly that string appears, gives its index
• Mystring.find("Purple Elephant")
  – "purple elephant".find("Purple Elephant")
• "the elephant was purple"
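A minimal sketch of why exact matching is limiting. The text in my_string is a made-up example; the other two strings come from the slide. Python's str.find returns -1 when the argument does not appear exactly.

```python
# str.find returns the index of the first exact occurrence, or -1 if absent.
my_string = "I saw a Purple Elephant today"  # hypothetical example text

found = my_string.find("Purple Elephant")
wrong_case = "purple elephant".find("Purple Elephant")
wrong_order = "the elephant was purple".find("Purple Elephant")

print(found, wrong_case, wrong_order)  # 8 -1 -1
```

Neither a change of case nor a change of word order is tolerated, which is the gap regular expressions fill.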
Wildcards
• Match on/find:
  – rugs
  – Rugs
• Will not match on/find:
  – Rugged
  – rugged
• We might want:
  – A way of saying r or R
  – Maybe there's an s
  – Something that's not a letter
• [Rr]ugs?[^a-zA-Z] (r or R, then ug, then an optional s, then one non-letter)
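The pattern above can be tried with Python's re module. This is a sketch: the sample sentences are invented, and the pattern is read as [Rr]ugs?[^a-zA-Z] with no internal spaces.

```python
import re

# Pattern from the slide: r or R, "ug", an optional s, then one non-letter.
rug_pattern = re.compile(r"[Rr]ugs?[^a-zA-Z]")

# Hypothetical sample sentences, chosen to exercise each case.
samples = [
    "We bought two rugs.",
    "Rugs are on sale!",
    "The terrain was rugged.",
    "Rugged boots last longer.",
]

# search looks for the pattern anywhere inside each sentence.
matched = [text for text in samples if rug_pattern.search(text)]
print(matched)
```

Because the last piece demands one non-letter character, "rugs" at the very end of a string (with nothing after it) would not match this pattern.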
Regex Pieces

Operation                  Example        Meaning
Character class            [Rr] or [rR]   R or r
                           [abcd]         Exactly one of a, b, c, or d
                           [\^]           Just a caret (^)
Character range            [a-z]          Exactly one character "between" a and z
                           [a-zA-Z]       Exactly one character "between" a and z or "between" A and Z
                           [0-9]          Any one digit
Negative character class   [^a]           Any one character that's not an a
                           [^a-zA-Z]      Any one character that's not a letter
                           [^^]           Any one character that's not a caret
Optional quantifier        s?             Maybe there's an s: 0 or 1 s
                           [Rr]?          Either one of R or r, or neither
OR                         wx|xyz         One of the strings wx or xyz
Star                       [abc]*         Any number of a's, b's, and c's (including none)
Plus                       [abc]+         At least one of a's, b's, and c's
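A quick way to test the pieces in the table is re.fullmatch, which succeeds only when the whole string fits the pattern. The test strings below are illustrative assumptions, and the lone caret is written escaped as [\^], since an unescaped [^] is not a valid pattern in Python.

```python
import re

# Each entry: (pattern from the table, strings that should match in full,
# strings that should not). The test strings are illustrative assumptions.
cases = [
    (r"[Rr]", ["R", "r"], ["x", "Rr"]),
    (r"[\^]", ["^"], ["a"]),
    (r"[a-zA-Z]", ["q", "Q"], ["7"]),
    (r"[^a-zA-Z]", ["7", "!"], ["q"]),
    (r"[^^]", ["a"], ["^"]),
    (r"[Rr]?", ["", "R"], ["RR"]),
    (r"wx|xyz", ["wx", "xyz"], ["wxyz"]),
    (r"[abc]*", ["", "abcabc"], ["abd"]),
    (r"[abc]+", ["a", "cab"], [""]),
]

def check(pattern, should_match, should_not):
    """Return True when every string behaves as the table predicts."""
    compiled = re.compile(pattern)
    hits = all(compiled.fullmatch(s) for s in should_match)
    misses = not any(compiled.fullmatch(s) for s in should_not)
    return hits and misses

results = {pattern: check(pattern, yes, no) for pattern, yes, no in cases}
print(results)
```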
Regex Pieces, Cont.

Operation           Example      Meaning
Count range         {3,5}        Between 3 and 5 (inclusive) copies of the preceding item
                    [ab]{2,3}    aa, ab, ba, bb, aaa, aab, abb, baa, …
End of text         $            This is some text#
Beginning of text   ^            #This is some text
Word boundary       \b           #This# #some# #text#
Anything            .            Any one character

(In the examples, # marks a position where the piece matches.)

• All UVA computing IDs
• 2-3 letters, number, 1-3 letters
• [a-z]{2,3}[1-9][a-z]{1,3}
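The count range, anchors, word boundary, and dot can be explored in Python as follows. This is a sketch using the slide's sample sentence; the short test strings for [ab]{2,3} and the pattern s.m are made up for illustration.

```python
import re

# [ab]{2,3}: two or three characters, each an a or a b.
count_ok = [s for s in ["a", "ab", "bab", "abab"] if re.fullmatch(r"[ab]{2,3}", s)]

text = "This is some text"

# ^ and $ match positions, not characters, so the matches are empty strings.
start_pos = re.search(r"^", text).start()
end_pos = re.search(r"$", text).start()

# \b matches at every edge between a word character and a non-word character.
boundaries = [m.start() for m in re.finditer(r"\b", text)]

# . matches any one character (except a newline, by default).
dot_hits = re.findall(r"s.m", text)

print(count_ok, start_pos, end_pos, boundaries, dot_hits)
```

Note that the count range must be written without a space after the comma: Python treats {2, 3} as literal text rather than a quantifier.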
Give an Expression to match
• All UVA computing IDs
• 2-3 letters, number, 1-3 letters
• [a-z]? [1-9] [a-z]?
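One way to check a candidate answer is to run it against a few strings. The sketch below assumes the intended answer is the count-range pattern from the previous slide, [a-z]{2,3}[1-9][a-z]{1,3}; the helper name is_computing_id and the sample IDs are hypothetical. Using fullmatch is a choice made here so that the entire string must be an ID.

```python
import re

# Pattern from the slides: 2-3 letters, one digit 1-9, then 1-3 letters.
computing_id = re.compile(r"[a-z]{2,3}[1-9][a-z]{1,3}")

def is_computing_id(candidate):
    """Return True when the whole string has the computing-ID shape."""
    return computing_id.fullmatch(candidate) is not None

# Made-up IDs for illustration only.
print(is_computing_id("abc1de"))   # fits: 3 letters, digit, 2 letters
print(is_computing_id("a1b"))      # too few leading letters
print(is_computing_id("abcd1e"))   # too many leading letters
print(is_computing_id("ab1"))      # no trailing letters
```

Read literally, [a-z]?[1-9][a-z]? would accept "a1b" and reject "abc1de", so the ? quantifier alone cannot express the 2-3 and 1-3 counts.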
What does a for loop look like?
• for [variable] in [collection]:
• Variable: [a-zA-Z]+
• [0, 1, 5, 9]
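The for-loop shape can be turned into a rough pattern. This is only a sketch: the variable part uses the slide's [a-zA-Z]+, while the collection part (.+) and the single literal spaces are simplifying assumptions, not a full description of Python syntax.

```python
import re

# Variable uses the slide's [a-zA-Z]+; the collection part (.+) is an
# assumption that accepts any non-empty text before the final colon.
for_header = re.compile(r"for ([a-zA-Z]+) in (.+):")

m = for_header.fullmatch("for x in [0, 1, 5, 9]:")
variable, collection = m.groups()

# A line that does not start with "for" is rejected.
not_a_loop = for_header.fullmatch("while x in [0, 1, 5, 9]:")

print(variable, collection, not_a_loop)
```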
In Python
1. import re
2. Compile
3. Operate
  – search
    • Similar to string.find()
  – finditer
  – findall
    • 0 parentheses: m.group()
    • 1 paren: m.group(1)
    • 2+ paren: m.groups()
4. Match Object
  – group
  – start
  – end
  – groups
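The four steps can be sketched end to end. The pattern and sample text are made up for illustration; the calls (re.compile, search, finditer, findall, and the match-object methods group, start, end, groups) are part of Python's standard re module.

```python
import re  # 1. import re

# 2. Compile: the pattern here (two letters, then digits) is a made-up example.
pattern = re.compile(r"([a-z]{2})([0-9]+)")
text = "rooms ab12 and cd345"

# 3. Operate: search returns the first match object, or None.
first = pattern.search(text)

# 4. Match object: group, start, end, groups.
whole, begin, stop, parts = first.group(), first.start(), first.end(), first.groups()

# finditer yields one match object per non-overlapping match.
spans = [(m.start(), m.end()) for m in pattern.finditer(text)]

# findall returns strings or tuples depending on the number of groups.
no_groups = re.findall(r"[a-z]{2}[0-9]+", text)    # like m.group()
one_group = re.findall(r"[a-z]{2}([0-9]+)", text)  # like m.group(1)
two_groups = pattern.findall(text)                 # like m.groups()

print(whole, begin, stop, parts, spans, no_groups, one_group, two_groups)
```

Unlike string.find(), search signals "not found" with None rather than -1, so the result should be checked before calling methods on it.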
Phone Numbers
• ([2-9][0-9]{2}-)?[2-9][0-9]{2}-[0-9]{4}|(\([2-9][0-9]{2}\))?[2-9][0-9]{2}-[0-9]{4}
• Things to match:
  – 203-918-8802
  – (203)918-8802
  – 918-8802
• Things to not match:
  – 2039188802
  – 203-188
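The phone-number pattern can be checked against the slide's own examples. This sketch assumes the parentheses around the area code are meant literally and are therefore escaped as \( and \), and it uses fullmatch so the entire string must be a phone number.

```python
import re

# Two alternatives: an optional "NNN-" area code, or an optional "(NNN)" one.
# The escaped \( and \) match literal parentheses (an assumption about intent).
phone = re.compile(
    r"([2-9][0-9]{2}-)?[2-9][0-9]{2}-[0-9]{4}"
    r"|(\([2-9][0-9]{2}\))?[2-9][0-9]{2}-[0-9]{4}"
)

should_match = ["203-918-8802", "(203)918-8802", "918-8802"]
should_not_match = ["2039188802", "203-188"]

accepted = [s for s in should_match if phone.fullmatch(s)]
rejected = [s for s in should_not_match if not phone.fullmatch(s)]

print(accepted)
print(rejected)
```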