REGULAR EXPRESSION IN PERL (PART 1)
Thach Nguyen
OBJECTIVE
- What is Regular Expression?
- How to use Regular Expression in Perl
  - Basic tools: simple word matching, using character classes, matching this or that, …
  - Power tools
WHAT IS REGULAR EXPRESSION (REGEX, REGEXP)?
- A big factor behind the fame of Perl
- A string that describes a pattern
- Examples of patterns:
  - Search engines finding webpages (Google)
  - Listing files in a directory (ls *.txt, dir *.*)
  - Searching, extracting parts of strings, search and replace (Microsoft Word)
- Efficient and flexible for manipulating text
- Not as difficult to understand as its reputation suggests
- Constructed from simple concepts (conditionals, loops)
- Once you get used to the terse notation, you're good to go
HOW TO USE REGEX
- Part 1: basics (solve about 98% of your needs)
  - Simple word matching
  - Using character classes
  - Matching this or that
- Part 2: power tools (for the rest)
  - Advanced regex operators
  - Latest innovations
PART 1: THE BASICS
Simple word matching
- The simplest regex: a word, i.e. a string of characters
- It matches any string that contains that word
- Eg: Result: It matches
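The example on this slide did not survive, so the following is only a minimal sketch of simple word matching. The sample string and the variable names `$text` and `$matched` are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# A plain word is already a regex: it matches any string that contains it.
my $text = "Hello World";    # sample string (assumed for illustration)

# Record the outcome so it can be inspected after the test.
my $matched = ($text =~ /World/) ? 1 : 0;

if ($matched) {
    print "It matches\n";
} else {
    print "It doesn't match\n";
}
```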
PART 1: THE BASICS
Simple word matching
- Operators:
  - =~ : returns true if the regex matches
  - !~ : returns true if the regex doesn't match
  - / … / : delimiters enclosing the pattern to search for (a literal string or a variable holding one)
- Eg: $greeting = "World"; if ("Hello World" =~ /$greeting/) { … }
- Other, arbitrary delimiters can be used with the m prefix, eg: m!World! or m{World}
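The operators on this slide can be tried out with a short script. This is a sketch rather than the original slide code: the strings being tested and the result variable names are assumptions.

```perl
use strict;
use warnings;

my $greeting = "World";

# =~ is true when the pattern matches; $greeting is interpolated first.
my $found   = "Hello World" =~ /$greeting/ ? 1 : 0;

# !~ is true when the pattern does NOT match.
my $missing = "Hello World" !~ /Perl/ ? 1 : 0;

# With the m prefix, other delimiters can replace the slashes.
my $bang    = "Hello World" =~ m!World! ? 1 : 0;
my $brace   = "/usr/bin/perl" =~ m{/usr/bin} ? 1 : 0;   # '/' needs no escaping here

print "found=$found missing=$missing bang=$bang brace=$brace\n";
```

Choosing a delimiter that does not occur in the pattern, as in the last line, avoids having to backslash every slash.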
PART 1: THE BASICS
Simple word matching – Additional
- If matching against the default variable $_, the "$_ =~" part can be omitted
  Eg: $_ = "Hello World"; if (/World/) { … }
- If the regex matches in more than one place, the earliest point in the string is matched
  Eg: "Hello World" =~ /o/; # matches 'o' in 'Hello'
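Both points can be checked directly. The sketch below uses Perl's built-in `@-` array (`$-[0]` is the offset where the last successful match started) to show which 'o' was matched; the variable names are illustrative assumptions.

```perl
use strict;
use warnings;

# Matching against the default variable $_: the "$_ =~" part is omitted.
$_ = "Hello World";
my $default_match = /World/ ? 1 : 0;

# When a regex can match in more than one place, the earliest one wins.
my $first_o = -1;
if ("Hello World" =~ /o/) {
    $first_o = $-[0];    # start offset of the match, counted from 0
}

print "default_match=$default_match first_o=$first_o\n";
```

The offset is 4, which is the 'o' in 'Hello' rather than the one in 'World' at offset 7.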
PART 1: THE BASICS
Simple word matching – Special characters
- Metacharacters: {}[]()^$.|*+?
  Use a backslash to include one literally (the backslash itself must also be escaped)
- Escape sequences: ASCII characters (\n, \t, etc.) and arbitrary bytes (octal, hexadecimal)
- Variables: substituted before matching
  Eg: $foo = 'house'; 'cathouse' =~ /cat$foo/; # matches
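A short sketch of the three cases on this slide: escaped metacharacters, escape sequences, and variable interpolation. The test strings are assumptions picked to keep each line self-explanatory.

```perl
use strict;
use warnings;

# A metacharacter must be backslashed to be matched literally.
my $plus  = "2+2=4"    =~ /2\+2/      ? 1 : 0;   # \+ is a literal '+'
my $dot   = "file.txt" =~ /file\.txt/ ? 1 : 0;   # \. is a literal '.'

# Escape sequences stand for ASCII characters or arbitrary bytes.
my $tab   = "1000\t2000" =~ /0\t2/       ? 1 : 0;   # \t is a tab
my $bytes = "cat"        =~ /\143\x61t/  ? 1 : 0;   # octal 143 = 'c', hex 61 = 'a'

# Variables are substituted into the regex before matching.
my $foo   = 'house';
my $house = 'cathouse' =~ /cat$foo/ ? 1 : 0;

print "plus=$plus dot=$dot tab=$tab bytes=$bytes house=$house\n";
```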
PART 1: THE BASICS
Simple word matching – Special characters
- Anchor metacharacters: ^ matches at the beginning of the string, $ matches at the end of the string
- Overall: this is just the surface of regex technology
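The anchors can be illustrated as follows. This is a sketch with an assumed sample word, "housekeeper", and assumed variable names.

```perl
use strict;
use warnings;

# ^ ties the match to the beginning of the string, $ to the end.
my $at_start = "housekeeper" =~ /^keeper/  ? 1 : 0;   # 'keeper' is not at the start
my $at_end   = "housekeeper" =~ /keeper$/  ? 1 : 0;   # 'keeper' is at the end
my $whole    = "keeper"      =~ /^keeper$/ ? 1 : 0;   # the whole string is 'keeper'

# $ also matches just before a newline at the very end of the string.
my $before_nl = "keeper\n" =~ /keeper$/ ? 1 : 0;

print "at_start=$at_start at_end=$at_end whole=$whole before_nl=$before_nl\n";
```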
PART 1: THE BASICS
Using character classes
- A character class: a set of possible characters, any one of which may match at a particular point in the regex
- Denoted by brackets [ … ]
- Eg: /item[0123456789]/; # matches 'item0' or ... or 'item9'
  "abc" =~ /[cab]/; # matches 'a'
- To match 'yes' in a case-insensitive way (yes, Yes, YES, ...):
  /[yY][eE][sS]/
  /yes/i (i: case-insensitive modifier of the matching operation)
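The slide's examples can be run as a small script. The capture group around the class is an addition for illustration, so that the matched character can be inspected; it is not part of the slide's regex.

```perl
use strict;
use warnings;

# Any single character listed in the class may match at that position.
my $digit = "item7" =~ /item[0123456789]/ ? 1 : 0;

# The order inside the class is irrelevant; the earliest position
# in the string still wins, so 'a' is matched here.
my ($first) = "abc" =~ /([cab])/;

# Two ways to match 'yes' regardless of case.
my $by_class    = "YES" =~ /[yY][eE][sS]/ ? 1 : 0;
my $by_modifier = "Yes" =~ /yes/i         ? 1 : 0;

print "digit=$digit first=$first by_class=$by_class by_modifier=$by_modifier\n";
```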
PART 1: THE BASICS
Using character classes – Special characters: -]\^$
These need a backslash to be represented literally inside a class:
- ] : the end of a character class
- $ : scalar variable
- \ : escape sequences
- - : range operator within a character class
- ^ : negated character class (when it is the first character)
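Each special character can be seen at work in a one-line test. The strings and the variable `$x` are assumptions for illustration, not taken from the slide.

```perl
use strict;
use warnings;

my $x = 'bcr';

my $bracket  = "]def" =~ /[\]c]def/ ? 1 : 0;   # \] is a literal ']' inside the class
my $interp   = "rat"  =~ /[$x]at/   ? 1 : 0;   # $x is interpolated: class is [bcr]
my $dollar   = '$at'  =~ /[\$x]at/  ? 1 : 0;   # \$ is a literal '$': class is '$' or 'x'
my $range    = "item5" =~ /item[0-9]/ ? 1 : 0; # - builds the range 0..9
my $negated  = "hat"  =~ /[^a]at/   ? 1 : 0;   # leading ^ negates: any char but 'a'
my $no_match = "aat"  =~ /[^a]at/   ? 1 : 0;   # no non-'a' character precedes 'at'
my $caret    = "^at"  =~ /[a^]at/   ? 1 : 0;   # ^ is literal when it is not first

print "$bracket $interp $dollar $range $negated $no_match $caret\n";
```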
PART 1: THE BASICS
Using character classes – Special characters
Several abbreviations for common character classes:
- \d : a digit, represents [0-9]
- \s : a whitespace character, represents [\ \t\r\n\f]
- \w : a word character (alphanumeric or _), represents [0-9a-zA-Z_]
- \D : negated \d
- \S : negated \s
- \W : negated \w
- . : any character but "\n"
- \b : matches a boundary between a word character and a non-word character: \w\W or \W\w
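A brief sketch of the abbreviations in use. The sample strings are assumptions chosen so that each result is easy to predict.

```perl
use strict;
use warnings;

my $time     = "12:34:56" =~ /\d\d:\d\d:\d\d/ ? 1 : 0;   # hh:mm:ss made of digits
my $spaced   = "a b"      =~ /\w\s\w/         ? 1 : 0;   # word char, whitespace, word char
my $nondigit = "abc"      =~ /\D\D\D/         ? 1 : 0;   # three non-digits

# \b requires a word/non-word boundary on each side of 'cat'.
my $word     = "the cat sat" =~ /\bcat\b/ ? 1 : 0;
my $embedded = "concatenate" =~ /\bcat\b/ ? 1 : 0;       # 'cat' is inside a word

# '.' matches any character except a newline.
my $newline  = "a\nb" =~ /a.b/ ? 1 : 0;

print "$time $spaced $nondigit $word $embedded $newline\n";
```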
PART 1: THE BASICS
Issue: why does '.' match everything but "\n"?
- Most often we work with lines and want to ignore the newline characters when counting and matching characters in a line
- To keep track of newlines: anchors ^ and $, with modifiers /…/s (single line) and /…/m (multiple lines)
- No modifier (//): '.' matches any character except "\n"; ^ matches only at the beginning of the string; $ matches only at the end of the string, or before a newline at the end
- s modifier (//s): treats the string as a single long line; '.' matches any character, even "\n"; ^ and $ behave as with no modifier
- m modifier (//m): treats the string as a set of multiple lines; '.' matches any character except "\n"; ^ and $ match at the start or end of any line within the string
- Both (//sm): treats the string as a single long line, but detects multiple lines; '.' matches any character, even "\n"; ^ and $ match at the start or end of any line within the string
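The four modifier combinations can be compared side by side on a two-line string. This is a sketch: the sample text and the `@rows` table are assumptions for illustration.

```perl
use strict;
use warnings;

# A string holding two lines (sample text, assumed for illustration).
my $x = "There once was a girl\nWho programmed in Perl\n";

# Each row: the regex as written, and whether it matched $x.
my @rows = (
    [ '/^Who/',       $x =~ /^Who/       ? 1 : 0 ],   # "Who" is not at the start of the string
    [ '/^Who/s',      $x =~ /^Who/s      ? 1 : 0 ],   # s does not change ^
    [ '/^Who/m',      $x =~ /^Who/m      ? 1 : 0 ],   # m lets ^ match at the second line
    [ '/^Who/sm',     $x =~ /^Who/sm     ? 1 : 0 ],
    [ '/girl.Who/',   $x =~ /girl.Who/   ? 1 : 0 ],   # '.' does not match "\n"
    [ '/girl.Who/s',  $x =~ /girl.Who/s  ? 1 : 0 ],   # s lets '.' match "\n"
    [ '/girl.Who/m',  $x =~ /girl.Who/m  ? 1 : 0 ],   # m does not change '.'
    [ '/girl.Who/sm', $x =~ /girl.Who/sm ? 1 : 0 ],
);

printf "%-14s %s\n", $_->[0], $_->[1] ? "matches" : "doesn't match" for @rows;
```

Reading the table column by column shows that s only affects '.', while m only affects ^ and $.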
PART 1: THE BASICS
Matching this or that
- Lets a regex match any of several possible words or strings
- Uses the alternation metacharacter |
- Eg: "cats and dogs" =~ /dog|cat|bird/; # matches "cat"
  "cats" =~ /cats|cat|ca|c/; # matches "cats"
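To see which alternative is actually chosen, the sketch below wraps each alternation in a capture group. The parentheses and variable names are additions for illustration; the patterns themselves follow the slide.

```perl
use strict;
use warnings;

# The earliest position in the string wins, even though 'dog' is listed first.
my ($animal) = "cats and dogs" =~ /(dog|cat|bird)/;

# At a given position, alternatives are tried left to right,
# and the first one that matches is taken.
my ($longest_first)  = "cats" =~ /(cats|cat|ca|c)/;
my ($shortest_first) = "cats" =~ /(c|ca|cat|cats)/;

print "animal=$animal longest_first=$longest_first shortest_first=$shortest_first\n";
```

Listing longer alternatives first is therefore a simple way to prefer the longest match.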
QUESTION