Searching using regular expressions
- Slides: 8
Searching using regular expressions
A regular expression is a ‘special text string’ that describes a search pattern. It defines a pattern of characters that is applied to a block of text in order to locate matching text.
Regular expression: ‘e’ (a single literal character)
Text: ‘Jack and Jill went up the hill to fetch a pale of water’
Matches: 5 (the ‘e’ in ‘went’, ‘the’, ‘fetch’, ‘pale’ and ‘water’)
Once it has found the first ‘e’, the regex goes on to find further occurrences.
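The slide's walk-through can be reproduced with a minimal sketch in Python's standard `re` module. The choice of Python and the variable names are assumptions made for illustration; the slides do not name a language.

```python
import re

text = "Jack and Jill went up the hill to fetch a pale of water"

# finditer keeps scanning after the first hit, so it yields every 'e' in turn
positions = [m.start() for m in re.finditer("e", text)]

print(len(positions))  # 5
print(positions[0])    # 15, the 'e' in "went"
```

`re.search` would stop at the first occurrence; `re.finditer` (or `re.findall`) is what gives the "goes on to find further occurrences" behaviour.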
Searching using regular expressions
The special text string can take other forms.
Regular expression: ‘b(a|e)d’ (matches ‘bad’ and ‘bed’)
Text: ‘The bed had been badly put together, but I bedded down for the night all the same’
Matches: 3 (‘bed’, the ‘bad’ in ‘badly’ and the ‘bed’ in ‘bedded’)
This kind of pattern could be used to search web pages and Word documents for important strings.
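A sketch of the alternation example, assuming the slide's pattern is meant as "b, then a or e, then d". In Python's `re` syntax that can be written `b(?:a|e)d`; the non-capturing group `(?:...)` is a Python-specific choice here so that `findall` returns the whole match rather than only the group.

```python
import re

text = ("The bed had been badly put together, "
        "but I bedded down for the night all the same")

# 'a|e' is an alternation: either letter is accepted between 'b' and 'd'
pattern = re.compile(r"b(?:a|e)d")
matches = pattern.findall(text)

print(matches)  # ['bed', 'bad', 'bed']
```

Note that the pattern matches inside longer words ("badly", "bedded"), which is why the slide counts three hits.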
Simple searching
Any single character matches itself, unless it is a metacharacter with a special meaning. Characters that normally function as metacharacters are preceded by a backslash when they need to be interpreted literally.
Example: ‘bed’ matches the substring ‘bed’ in ‘bedded’.
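The difference between a literal character and an escaped metacharacter can be sketched as follows. The price string is a made-up sample, not taken from the slides.

```python
import re

# Literal characters match themselves: 'bed' is found at the start of 'bedded'
span = re.search("bed", "bedded").span()
print(span)  # (0, 3)

price = "Total: $5.00 (was $5x00)"  # hypothetical sample text

# Unescaped, the dot is a metacharacter and also accepts the 'x'
unescaped = re.findall(r"5.00", price)
# Preceded by a backslash, the dot is interpreted literally
escaped = re.findall(r"5\.00", price)
# re.escape adds the backslashes for every metacharacter in a string
auto = re.findall(re.escape("$5.00"), price)

print(unescaped)  # ['5.00', '5x00']
print(escaped)    # ['5.00']
print(auto)       # ['$5.00']
```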
Dot matches almost any character
The dot matches a single character without caring what that character is, unless it is the newline character.
Example: ‘.an’ finds a match in each of the strings ‘can’, ‘flan’, ‘ban’ and ‘ran’ (in ‘flan’ the matched text is ‘lan’).
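A short sketch of the dot's behaviour in Python's `re` module, including the newline exception. The `re.DOTALL` flag shown at the end is specific to Python and is not mentioned in the slides.

```python
import re

words = ["can", "flan", "ban", "ran"]

# '.an' is three characters long: any one character, then 'a', then 'n'
found = {w: re.search(r".an", w).group() for w in words}
print(found)  # {'can': 'can', 'flan': 'lan', 'ban': 'ban', 'ran': 'ran'}

# By default the dot refuses to match a newline...
no_match = re.search(r".an", "\nan")
print(no_match)  # None

# ...unless the DOTALL flag is set
with_flag = re.search(r".an", "\nan", re.DOTALL)
print(with_flag is not None)  # True
```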
Predefined shorthand classes
In predefined classes, the metacharacter ‘\’ escapes the normal meaning of the character that follows it (e.g. \d means treat ‘d’ as a metacharacter, not as a literal character).
Example: x\d matches “x0”, “x1”, “x2”, “x3”, “x4”, “x5” and so on (\d matches a single numeric character).
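As an illustration, the digit class can be tried out in Python. The sample text is invented; the raw-string prefix `r"..."` is a Python convention that stops the language itself from interpreting the backslash before the regex engine sees it.

```python
import re

text = "x0 x5 x9 xy x 1"  # hypothetical sample text

# 'x' must be followed immediately by exactly one digit
hits = re.findall(r"x\d", text)

print(hits)  # ['x0', 'x5', 'x9']
```

"xy" is skipped because "y" is not a digit, and "x 1" is skipped because a space separates the two characters.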
Predefined shorthand classes
Other examples include:
• be\w matches “bed”, “bee”, “beg”, “ben”, “bet” (\w matches a single alphanumeric character (a-z, A-Z, 0-9) or the underscore character _).
• be\W matches “be!”, “be£”, “be$”, “be%” (\W matches a single non-alphanumeric character; \W is short for [^\w]).
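The word and non-word classes can be checked against a few sample strings. This is a sketch in Python; the sample list is an assumption. One caveat: in Python 3, `\w` on text strings also accepts non-ASCII letters unless the `re.ASCII` flag is passed, so it is slightly broader than the a-z, A-Z, 0-9, _ set described on the slide.

```python
import re

samples = ["bed", "be_", "be9", "be!", "be$"]  # hypothetical test strings

# fullmatch requires the whole string to fit the pattern
word_hits = [s for s in samples if re.fullmatch(r"be\w", s)]
nonword_hits = [s for s in samples if re.fullmatch(r"be\W", s)]

# \W and the negated class [^\w] should agree on every sample
same = all(
    bool(re.fullmatch(r"be\W", s)) == bool(re.fullmatch(r"be[^\w]", s))
    for s in samples
)

print(word_hits)     # ['bed', 'be_', 'be9']
print(nonword_hits)  # ['be!', 'be$']
print(same)          # True
```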
Predefined shorthand classes
Other examples include:
• \D matches a single non-numeric character and is equivalent to [^\d].
• \s matches a single whitespace character (such as a space, tab or newline).
• \S matches any single non-whitespace character and is equivalent to [^\s].
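The remaining shorthand classes can be sketched the same way. In Python's `re` module, `\s` covers tabs and newlines as well as the space character; the sample text below is invented to show that.

```python
import re

text = "room 42\tfloor 3\n"  # hypothetical sample with a space, a tab and a newline

# \D+ collects runs of non-digit characters
non_digits = re.findall(r"\D+", text)
# \s matches each whitespace character individually
whitespace = re.findall(r"\s", text)
# \S+ collects runs of non-whitespace characters, i.e. the tokens
tokens = re.findall(r"\S+", text)

print(non_digits)  # ['room ', '\tfloor ', '\n']
print(whitespace)  # [' ', '\t', ' ', '\n']
print(tokens)      # ['room', '42', 'floor', '3']
```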