Regular Expressions
• A regular expression is a pattern that defines a string or a portion of one. Comparing the pattern against a string either succeeds (a match) or fails (no match). On a match, the function returns something.
• The return value depends on the specific function used and its arguments.

Regular Expressions: Basics
• Basics of the function
  – REFind(reg_expression, string [, start] [, return_sub])
• Compares a regular expression to a string and, if it matches all or part of the string, returns the numeric position in the string where the match starts. If nothing matches, it returns 0.
• The optional start position allows the search to begin anywhere in the string.
• An additional option is to return subexpressions. We'll deal with that a little later.

Regular Expressions: Basics
• Any ASCII character that is not a special character matches itself.
  – A matches A
  – b matches b
• A does not match a unless the NoCase version of the function is used.
  – REFindNoCase()
• This is slightly slower, but only someone totally anal about run times will be able to tell you how much slower. :)

Regular Expressions: Basics
• REFindNoCase('is', 'This is a test') = 3
• REFindNoCase('This', 'This is a test') = 1
• REFind('t', 'This is a test') = 11
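The optional start argument and the difference between the two function variants can be sketched as follows. This is a minimal cfscript illustration, not part of the original slides; the variable names are hypothetical and the sample text is the one used on the slide.

```cfml
<cfscript>
// Sample text taken from the slide; variable names are illustrative only.
sample = "This is a test";

// Case-sensitive: the capital T at position 1 does not match.
firstT = REFind("t", sample);        // 11

// Case-insensitive: the capital T matches.
anyT = REFindNoCase("t", sample);    // 1

// Start searching at position 12 to skip the match at position 11.
nextT = REFind("t", sample, 12);     // 14

// No match at all returns 0.
noMatch = REFind("z", sample);       // 0

writeOutput("#firstT# #anyT# #nextT# #noMatch#");
</cfscript>
```

Note that the returned position is always counted from the start of the string, even when a start position is supplied.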

Regular Expressions: Special Characters
• A period (.) matches any single character.
• A pipe (|) means either what comes before it or what comes after it.
• A caret (^) at the beginning of a RegEx means that the regex will only match if it starts at the beginning of the comparison string.
• A dollar sign ($) at the end of a RegEx means that the regex will only match if it ends at the end of the comparison string.

Regular Expressions: Special Characters
• A backslash (\) means escape the next character if it is a special one.
• If the character after the backslash is not a special one, then it may be an escape sequence.
• Matching a literal backslash (\) is done by escaping it (\\).

Regular Expressions: Special Characters
• REFindNoCase('i.', 'This is a test') = 3
• REFindNoCase('^is', 'This is a test') = 0
• REFindNoCase('^t', 'This is a test') = 1
• REFindNoCase('t$', 'This is a test') = 14
• REFindNoCase('th|te', 'This is a test') = 1
• REFind('th|te', 'This is a test') = 11
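The examples above do not show the backslash in action, so here is a small hedged sketch of escaping. The file name and path are made-up sample values. CFML string literals do not treat the backslash as special, so the pattern reaches the regex engine exactly as typed.

```cfml
<cfscript>
// Hypothetical sample values for illustration.
fileName = "file.txt";
winPath  = "C:\temp";

// An unescaped period matches any character, so the first character matches.
anyChar = REFind(".", fileName);      // 1

// An escaped period matches only a literal period.
realDot = REFind("\.", fileName);     // 5

// A literal backslash is matched by escaping it.
slashPos = REFind("\\", winPath);     // 3

writeOutput("#anyChar# #realDot# #slashPos#");
</cfscript>
```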

Regular Expressions: Escape Sequences
• When certain non-special characters have a backslash (\) before them, they become special.
• \d means any single digit
• REFindNoCase('\d', 'this is 4') = 9
• REFindNoCase('is \d', 'this is 4') = 6

Regular Expressions: Sets
• A character set is a group of characters from which only one is desired.
  – [0123456789] matches any single digit
• Sets can use ranges of characters (think ASCII table).
  – [0-9] matches any single digit
• A dash can be represented in a set by placing it first (i.e. not in a range).
  – [-aeiou] matches a dash or a vowel
• A caret (^) at the beginning of a set negates it (i.e. it matches anything BUT the characters in the set).

Regular Expressions: Sets
• REFindNoCase('[AEIOU]', 'This is a test') = 3
• REFindNoCase('[0-9]', 'this is a test') = 0
• REFindNoCase('[0-9]', 'this is a 4th test') = 11
• REFindNoCase('[-0-9]', 'this-is a test') = 5
• REFindNoCase('[^0-9]', 'this is a test') = 1
• REFindNoCase('[^-]', 'this is a test') = 1
• REFindNoCase('[-^]', 'this-is a ^') = 5

Regular Expressions: Sets
• ColdFusion also includes a number of predefined sets.
• A predefined set is called using a special name surrounded by colons and brackets.
  – [:alpha:]
• Used within a set, it would look like
  – REFindNoCase('[[:alpha:]]', '123abc') = 4
• Can be combined with other characters in a set
  – REFindNoCase('[123[:alpha:]]', '123abc') = 1
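A few more predefined sets in use, as a hedged sketch. The class names [:digit:] and [:alpha:] are standard POSIX classes. The sample strings and variable names are made up for illustration.

```cfml
<cfscript>
// Hypothetical sample input.
sample = "abc123";

// First digit in the string.
firstDigit = REFind("[[:digit:]]", sample);          // 4

// First letter in the string.
firstLetter = REFind("[[:alpha:]]", sample);         // 1

// A leading caret negates the whole set, predefined class included.
firstNonLetter = REFind("[^[:alpha:]]", sample);     // 4

// A leading dash is literal, so this set is "a dash or any letter".
dashOrLetter = REFind("[-[:alpha:]]", "12-ab");      // 3

writeOutput("#firstDigit# #firstLetter# #firstNonLetter# #dashOrLetter#");
</cfscript>
```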

Regular Expressions: Groups
• A group allows a portion of a regular expression to be separated from another portion.
• Also known as subexpressions.
• Uses parentheses to group things together.
  – REFindNoCase('(this|that):', 'find this:') = 6
• More uses later.

Regular Expressions: Modifiers
• A modifier takes the previous character, set or group and says how many times it can or should exist.
• REFindNoCase('ha+', 'hahaha') = 1 (+ means one or more)
• REFindNoCase('ha*', 'hhaha') = 1 (* means zero or more)
• REFindNoCase('ha?', 'hahaha') = 1 (? means zero or one)
• REFindNoCase('ha{2}', 'hahaaha') = 3 ({2} means exactly two)
• REFindNoCase('ha{2,3}', 'hahaaha') = 3 ({2,3} means two to three)
• REFindNoCase('ha{3,}', 'hahaha') = 0 ({3,} means three or more)
• REFindNoCase('(ha)+', 'hahaha') = 1 (the modifier applies to the whole group)

Regular Expressions: Modifiers
• Normal modifiers are greedy, i.e. they want to match as much as they can.
• Using a question mark (?) after a modifier makes it lazy, i.e. it will match as little as possible.
• REFindNoCase('a+', 'baaaa', 1, 1)
  – matches aaaa (position 2, length 4)
• REFindNoCase('a+?', 'baaaa', 1, 1)
  – matches a (position 2, length 1)
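The greedy and lazy results above can be checked with a short sketch. With the fourth argument set to true, the function returns a structure with pos and len arrays instead of a number. The variable names here are illustrative only.

```cfml
<cfscript>
// Sample text from the slide.
sample = "baaaa";

// Greedy: a+ grabs every consecutive a.
greedy = REFindNoCase("a+", sample, 1, true);

// Lazy: a+? stops after the first a.
lazy = REFindNoCase("a+?", sample, 1, true);

// Pull the matched text out using the returned position and length.
greedyText = Mid(sample, greedy.pos[1], greedy.len[1]);   // aaaa
lazyText   = Mid(sample, lazy.pos[1], lazy.len[1]);       // a

writeOutput("#greedyText# / #lazyText#");
</cfscript>
```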

Regular Expressions: Line Modifiers
• A line modifier changes how a regular expression is processed.
• REFind('(?i)This', 'this') = 1
  – (?i) means perform a case-insensitive search
• REFindNoCase('(?x)is a', 'this is a isa') = 11
  – (?x) means ignore spaces in the regular expression
• REFindNoCase('(?m)^line3', 'line1#Chr(10)#Line2#Chr(10)#line3') = 13
  – (?m) means pay attention to the lines: ^ and $ match at the start and end of each line (here the three lines are separated by a single line feed)

Regular Expressions: Returning Structures
• Rather than returning a number, a regular expression function can be set to return a structure.
• The structure contains two keys named pos and len.
• Each holds an array: pos holds the start position of each match and len holds its length.
• The first item always contains the entire match and all others contain matches from subexpressions.
• Use the Mid() function to get easy access to the returned data.

Regular Expressions: Returning Structures
• The start location must be specified and the fourth argument must be set to yes (1, true).
• String = 'this is a finder'
• Test = REFindNoCase('f[aeiou]n.', String, 1, 1)
  – Mid(String, Test.pos[1], Test.len[1]) = "find"
• Test = REFindNoCase('f([aeiou])n.', String, 1, 1)
  – Mid(String, Test.pos[1], Test.len[1]) = "find"
  – Mid(String, Test.pos[2], Test.len[2]) = "i"
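A slightly larger sketch of the same idea, with several subexpressions and a guard for the no-match case (when nothing matches, pos[1] is 0). The sample sentence, the date pattern and the variable names are all hypothetical.

```cfml
<cfscript>
// Hypothetical sample text containing a date.
sample = "Due on 2004-03-15 at noon";

// Three groups: four digits, two digits, two digits.
found = REFind("([0-9]{4})-([0-9]{2})-([0-9]{2})", sample, 1, true);

if (found.pos[1] GT 0) {
    // Item 1 is the entire match; items 2 to 4 are the groups in order.
    whole     = Mid(sample, found.pos[1], found.len[1]);   // 2004-03-15
    yearPart  = Mid(sample, found.pos[2], found.len[2]);   // 2004
    monthPart = Mid(sample, found.pos[3], found.len[3]);   // 03
    dayPart   = Mid(sample, found.pos[4], found.len[4]);   // 15
    writeOutput("#whole#: #yearPart# / #monthPart# / #dayPart#");
} else {
    writeOutput("no date found");
}
</cfscript>
```

Checking pos[1] before calling Mid() matters, because Mid() with a start position of 0 raises an error.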

Regular Expressions: Replacing
• REReplace(string, regex, replace, scope)
• Replaces the regex match in the string with the replace value.
• Scope is one (default) or all.
• Show lots of examples from here on in.

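Regular Expressions: Replacing
• New in MX is the ability to modify the replace values using special escape codes.
• REReplaceNoCase('make upper', '(u.+)', '\u\1') = 'make Upper'
  – \u uppercases the next character; \1 is the text matched by the first group
• REReplaceNoCase('make upper', '(u.+)', '\U\1') = 'make UPPER'
  – \U uppercases everything that follows, until \E or the end of the replace value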
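A hedged sketch pulling the replacing features together: scope, backreferences to groups, and the case-conversion codes. The sample strings and variable names are made up for illustration.

```cfml
<cfscript>
// Scope defaults to one: only the first match is replaced.
firstOnly = REReplace("a-b-c", "-", "+");              // a+b-c

// Scope "all" replaces every match.
everyDash = REReplace("a-b-c", "-", "+", "all");       // a+b+c

// \1 and \2 in the replace value refer to the first and second groups.
swapped = REReplace("John Smith", "([A-Za-z]+) ([A-Za-z]+)", "\2, \1");   // Smith, John

// \U uppercases the text that follows it in the replace value.
shout = REReplaceNoCase("make upper", "(u.+)", "\U\1");   // make UPPER

writeOutput("#firstOnly# | #everyDash# | #swapped# | #shout#");
</cfscript>
```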