Introducing Perl: Practical Extraction and Report Language

Introducing Perl
– First developed by Larry Wall, a linguist working as a systems administrator for NASA in the late 1980s, as a way to make report processing easier.
– Since then it has moved into a large number of roles: automating system administration; acting as glue between different computer systems; serving as the language of choice for CGI programming on the Web.
– Free: source code, documentation, and use.
– Portable: runs on more than 50 operating system platforms.
3/7/2021

Introducing Perl
– Perl is often called the Swiss Army chainsaw of languages: versatile, powerful, and adaptable, like a Swiss Army knife.
– Perl is an interpreted language optimised for:
    – scanning arbitrary text files
    – extracting information from those text files
    – printing reports based on that information
– Perl is intended to be practical (easy to use, efficient, and complete) rather than tiny, elegant, and minimal.
– Perl’s slogan is “There is more than one way to do it”, and its philosophy is to “make easy things easy while making difficult things possible”.

Introducing Perl
A little history:
– December 1987: release 1.0
– October 1994: current major release 5.0
– July 1998: release 5.005
– March 1999: release 5.005_03
– March 2000: release 5.6
– Latest version: ?
Portable:
– Unix: AIX, BSD, HP-UX, IRIX, Linux, Solaris, …
– MS Windows
– Mac OS
– Others: AmigaOS, OS/2, VMS, …

Introducing Perl
Programs, Scripts, Compilers, and Interpreters
– Perl programs are called “scripts”. There is no difference: either way, it is just source code.
– The Perl “compiler” is also called the “interpreter”.
– The source code is compiled into an internal bytecode, which is executed in main memory. The bytecode is rather high level: “sort a list” is one operation.

Introducing Perl
Internet References and Resources
– Home page of Perl: http://www.perl.com
– Perl user groups: http://www.perl.org
– CPAN (Comprehensive Perl Archive Network): http://cpan.org
Getting Perl
– Unix: www.cpan.org/ports
– Windows: http://www.activestate.com/
– Mac OS: http://www.macperl.com/

Writing and Running Perl Programs
Writing Perl scripts:
– Code is plain text; any text editor will do.
– UNIX: Emacs
– Windows: Notepad
Running Perl:
– From the command line: perl myprog.pl
– With warnings enabled: perl -w myprog.pl
– On UNIX, make the first line of your program #!/usr/bin/perl and add execute permission using chmod.

Documentation and online help
– Extensive documentation comes with the standard distribution.
    UNIX: man perl
    Windows: Programs -> ActivePerl -> Online Documentation
– Online: http://www.perl.com/pub/v/documentation
– FAQs: http://www.perl.com/pub/v/faqs
– The Perl Journal: http://www.tpj.com/

Hello World!
In C:
    #include <stdio.h>
    int main(void) { printf("Hello World!\n"); return 0; }
In Java:
    public class Hello {
        public static void main(String[] args) {
            System.out.println("Hello World!");
        }
    }
In Perl (more than one way to do it):
    print "Hello World!\n";
    print("Hello World!\n");
    print "Hello World!", "\n";
    print "Hello ", "World!", "\n";

Some Simple Scripts/One-liners
Example 1: Print lines containing the string ‘Shazzam!’
    #!/usr/bin/perl
    while (<STDIN>) { print if /Shazzam!/ };
– <STDIN> is a bit of Perl magic that delivers the next line of input each time round the loop.
Example 2: The same thing the hard way
    #!/usr/bin/perl
    while ($line = <STDIN>) { print $line if $line =~ /Shazzam!/ };

Some Simple Scripts/One-liners
Example 3: A script with arguments
    #!/usr/bin/perl
    $word = shift;
    while (<>) { print if /$word/ };
– If we put the script in a file called match, we can invoke it as
    match Shazzam!
    match Shazzam! file1 file2
– The shift operator returns the first argument from the command line and moves the others up one position.
– Called with one argument, match reads standard input and prints those lines that contain the word given as the argument.
– Called with two or more arguments, the first argument is the word to be searched for, and the second and subsequent arguments are the names of files that will be searched in sequence for the target word.

Some Simple Scripts/One-liners
Example 4: Error messages
    #!/usr/bin/perl
    die "Need word to search for\n" if @ARGV == 0;
    $word = shift;
    while (<>) { print if /$word/ };
– @ARGV is a special array that holds the command-line parameters. A program is executed as a result of a system command, which consists of the executable program file followed by a command tail, e.g.:
    C:> program param1 param2 ... paramn
– Then $ARGV[0] = "param1", $ARGV[1] = "param2", ..., $ARGV[n-1] = "paramn". Unlike argv in C, @ARGV does not include the program name; that is available in $0.

Some Simple Scripts/One-liners
Example 5: Reverse the order of lines in a file
    #!/usr/bin/perl
    open IN, $ARGV[0] or die "Cannot open $ARGV[0]\n";
    @file = <IN>;
    for ($i = @file - 1; $i >= 0; $i--) { print $file[$i]; }
Do the same in C, Java?
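
The index loop in Example 5 can also be written with the built-in reverse function. The following is a minimal sketch, assuming the lines are already held in an array; the helper name reversed_lines and the sample data are hypothetical and do not come from the slides.

```perl
use strict;
use warnings;

# Hypothetical helper: return the given lines in reverse order.
sub reversed_lines {
    my @lines = @_;
    return reverse @lines;    # in list context, reverse flips the element order
}

my @sample = ("first\n", "second\n", "third\n");
my @out    = reversed_lines(@sample);
print @out;
```

In a real script the array would come from reading a file handle in list context, as Example 5 does with @file = <IN>.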

Variables and Datatypes
– $scalar
– @array
– %hash
– The type of a variable is marked by the type prefix ($, @, or %), which is always used:
    $x = $y + 3;
– Variable names can be arbitrarily long. They consist of the characters a-z, A-Z, and the underscore _ (a name must begin with one of these), plus the digits 0-9. Names are case sensitive: uppercase and lowercase are different.
    $No_of_Students, @StudentList, %StudentRecord_01
– Special control variables have “punctuation” in their names, e.g., the $^O variable, which holds the name of the current operating system.

Variables and Data types (Scalars)
– A $scalar holds a single data item. The data can be a string, a number, or a boolean, depending on context.
– When scalars are used as numbers, they are treated as double-precision floating-point numbers. (Perl also stores integers natively when it can.)
– Scalars are given a value by the assignment operator =. For example:
    $a = "The University of Nottingham, UK";
    $b = 129867445;
    $c = 3.14159;
    $d = 03776;     # octal
    $e = 0x3fff;    # hex

Variables and Data types (Aggregates)
– There are two aggregate datatypes, @array and %hash. Both can hold an unlimited number of scalars (as long as there is memory). There is no explicit declaration, allocation, deallocation, or any other explicit memory management.
– @array: arrays are ordered and are indexed by a number (a scalar in numeric context).
– %hash: hashes are unordered and are indexed by a string (a scalar in string context).

Variables and Data types (Arrays)
– An @array is an aggregate for storing scalars, indexed by a number (a scalar) inside square brackets [].
– The indexing is zero-based.
– Negative indices can also be used; they count from the end of the array.
    @a = ("one", "two", "three");
    print "$a[1] $a[0] $a[-1]\n";
– This will print: two one three

Variables and Data types (Arrays)
– If you enclose the array in double quotes, the scalars of the array are printed separated by spaces.
    @a = ("one", "two", "three");
    print "@a\n";
  will give: one two three
– Note that while an array has the type prefix @, an element of the array is a scalar and therefore has the prefix $:
    @a = ("one", "two", "three");
    $a[3] = "four";
    print "@a $a[0]\n";
  will give: one two three four one

Quoting: Basic
– Inside double quotes, variables and backslash constructs (like \n) are expanded; inside single quotes they are not.
– In other words, single quotes mean that their contents should be taken literally, while double quotes mean that their contents should be interpreted.
    print "This string\nshows up on two lines.";
  will show:
    This string
    shows up on two lines.
    print 'This string \n shows up on only one.';
  will show:
    This string \n shows up on only one.
    @a = ("one", "two", "three");
    print "@a\n", '@a', "\n";
  will show:
    one two three
    @a

Quoting: Basic
– Inside double quotes, the following are the most common backslash constructs:
    \n      The logical newline
    \t      The tabulator
    \$      The dollar
    \@      The at sign
    \xHH    Character encoded in hexadecimal
    \10     Character encoded in octal
    \"      The double quotes
    \\      The backslash itself

Quoting: The qw operator
– There is a special quoting construct for quoting “words”, typically strings consisting only of alphanumeric characters (qw splits its contents on whitespace).
    @a = ("one", "two", "three");
  can be written as
    @a = qw(one two three);
– The qw stands for “quote words”.
– Note also that all the quoting constructs are operators.

Variables and Data types (Arrays, $#)
– The notation $#arrayname returns the index of the last element of the array.
    @a = qw(one two three);
    print "The last index of \@a is $#a.\n";
  will show: The last index of @a is 2.
– The backslash in \@a keeps the array from being interpolated into the string.

Variables and Data types (undef)
– What is in the aggregate elements that have not been assigned a value? The undef value, which is the value of all uninitialized scalars, not just of uninitialized aggregate elements.
– This can be tested using defined and explicitly assigned using the undef() function. Accidental use should always be caught by using the -w switch.
    @a = qw(one two three);
    if (defined $a[1]) { print "Oh, yes\n" };
    if (defined $a[9]) { print "Impossible.\n" };
– Using undef(), a scalar can be “returned” to an uninitialized state.
    @a = qw(one two three);
    undef $a[1];
    if (defined $a[1]) { print "Impossible.\n" }
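
The three cases on this slide can be checked side by side. This is a small sketch; the variable names are illustrative only.

```perl
use strict;
use warnings;

my @a = qw(one two three);

# defined() distinguishes "has a value" from "never assigned".
my $has_second = defined $a[1] ? 1 : 0;     # element exists and has a value
my $has_tenth  = defined $a[9] ? 1 : 0;     # past the end of the array

undef $a[1];                                # back to the uninitialized state
my $after_undef = defined $a[1] ? 1 : 0;

print "$has_second $has_tenth $after_undef\n";   # prints "1 0 0"
```

Testing $a[9] with defined does not extend the array; it still has three elements afterwards.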

Important: -w
– The following script will print “The tenth element is .” and you would waste your time wondering what went wrong:
    @a = qw(one two three);
    print "The tenth element is $a[9].\n";
– But by adding the -w switch you would have seen this: Use of uninitialized value ….
– Using -w is strongly encouraged.
– The -w switch catches not only the use of uninitialized values but also other mistakes and problems, such as using a variable only once (usually indicative of a typo).

Scalar Context: Strings or Numbers
– Numbers in Perl can be manipulated with the usual mathematical operations: addition, multiplication, division, and subtraction. (Multiplication and division are indicated in Perl with the * and / symbols.)
    $a = 5;
    $b = $a + 10;    # $b is now equal to 15.
    $c = $b * 10;    # $c is now equal to 150.
    $a = $a - 1;     # $a is now 4.
– You can also use special operators like ++, --, +=, -=, /=, and *=. These modify a scalar's value in place, without repeating the variable on both sides of the assignment.
    $a = 5;
    $a++;        # $a is now 6; we added 1 to it.
    $a += 10;    # Now it's 16; we added 10.
    $a /= 2;     # And divided it by 2, so it's 8.

Scalar Context: Strings or Numbers
– Strings in Perl don't have quite as much flexibility. About the only basic operator that you can use on strings is concatenation. The concatenation operator is the period. Concatenation and addition are two different things:
    $a = "8";        # Note the quotes. $a is a string.
    $b = $a + "1";   # "1" is a string too.
    $c = $a . "1";   # But $b and $c have different values!
– Remember that Perl converts strings to numbers transparently whenever it's needed, so to get the value of $b, the Perl interpreter converted the two strings "8" and "1" to numbers, then added them. The value of $b is the number 9. However, $c used concatenation, so its value is the string "81".
– Just remember: the plus sign adds numbers and the period puts strings together.
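
The conversion also works in the other direction: a string produced by concatenation can be used as a number again. A short sketch with illustrative variable names:

```perl
use strict;
use warnings;

my $x      = "8";
my $sum    = $x + "1";      # numeric context: both strings become numbers
my $joined = $x . "1";      # string context: the two strings are joined
my $again  = $joined + 1;   # the string "81" is converted back to a number

print "$sum $joined $again\n";   # prints "9 81 82"
```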

Context: Scalar vs. List
– A very pervasive concept is context. Certain constructs and functions behave differently depending on the context in which they are used. For example, the context of the left side of the assignment operator (=) forces the right side to comply:
    @x = qw(abc de f);
    @a = @x;
    $a = @x;
    print "@a: $a\n";
  will print: abc de f: 3
– In scalar context the value of an array is the size of the array ($#array plus one).
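
A common source of confusion is the difference between assigning an array to a scalar and assigning it to a one-element list. The sketch below contrasts the two; the variable names are illustrative.

```perl
use strict;
use warnings;

my @x = qw(abc de f);

my @copy       = @x;    # list context: copies the elements
my $count      = @x;    # scalar context: number of elements
my ($first)    = @x;    # list assignment: takes the first element
my $last_index = $#x;   # index of the last element, i.e. $count - 1

print "@copy: $count ($first, $last_index)\n";   # prints "abc de f: 3 (abc, 2)"
```

The parentheses around $first are what turn the assignment into a list assignment.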

Array versus List
– The difference between an array and a list is that an array has a name and the @ type prefix, while a list is a parenthesis-enclosed, comma-separated entity. In
    @x = qw(abc de f);
  a list is assigned to an array.
Separate Name Space
– The name spaces of scalars, arrays, and hashes are completely separate, because the type prefix explicitly tells which one we are talking about:
    $x = "Tyreytio";
    @x = qw(asd df f);

Hashes (I)
– A hash is an unordered aggregate which holds scalars, the values of the hash, indexed by strings (scalars), the keys of the hash. The index is enclosed in curly brackets.
    %a = qw(Nottingham 0115
            Sheffield  0114
            Leeds      0113);
    print "$a{Leeds}\n";
  will output: 0113

Hashes (II)
– The keys and the values of a hash can be returned by the keys and values functions.
    %a = qw(Nottingham 0115 Sheffield 0114 Leeds 0113);
    @k = keys %a;
    @v = values %a;
    print "@k\n@v\n";
  will possibly (the order is pseudo-random) output:
    Nottingham Sheffield Leeds
    0115 0114 0113

Hashes (III)
– The existence of a key-value pair in a hash can be verified using the exists function.
    %a = qw(Nottingham 0115 Sheffield 0114 Leeds 0113);
    $b = exists $a{Leeds} ? 1 : 0;
    $c = exists $a{Birmingham} ? 1 : 0;
    print "$b $c\n";
  will print: 1 0
– The exists function cares only about the existence of the key: the value is irrelevant.

Hashes (IV)
– The key-value pairs of a hash can be returned iteratively (in a loop) by the each function.
    %a = qw(Nottingham 0115 Sheffield 0114 Leeds 0113);
    while (($k, $v) = each %a) {
        print "$k $v\n";
    }
– This will possibly print (again, the order is pseudo-random):
    Nottingham 0115
    Sheffield 0114
    Leeds 0113
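
Because keys, values, and each return pairs in a pseudo-random order, a report usually sorts the keys first. A minimal sketch, with a hash name (%code) chosen here for illustration:

```perl
use strict;
use warnings;

# The codes are quoted so that the leading zero is kept as part of the string.
my %code = (Nottingham => '0115', Sheffield => '0114', Leeds => '0113');

my @lines;
foreach my $city (sort keys %code) {    # sort gives a stable, alphabetical order
    push @lines, "$city $code{$city}";
}
print "$_\n" for @lines;
```

This prints Leeds, Nottingham, and Sheffield in that order on every run.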

Hashes (V): The => operator
– The => operator is a variant of the , (comma) operator which, as a side effect, forces its left operand to be a bare word, effectively a string constant with implicit single quotes around it. This is a convenient notation that is most often used when specifying the key-value pairs for a hash.
    %b = ('English', 'one');
    %b = (English => 'one');
– These are equivalent.

Hashes (VI)
– Hash elements or groups of hash elements (slices) can be deleted using the delete function.
    %a = (English  => "one",  French   => "un",   German  => "ein",
          Finnish  => "yksi", Japanese => "ichi", Chinese => "yi");
    delete $a{German};
    delete @a{'French', 'Finnish'};
    print join(" ", values %a), "\n";
  will print the values one ichi yi in some order.
– Note the @ prefix when deleting a slice: several keys are named, so the slice syntax @a{...} is used.

Slices of Aggregates
– In addition to accessing the aggregates either as a whole or per element, it is possible to access them by groups of elements. These groups are called slices. The syntax is @ followed by the variable name and the indices, in other words @array[numbers] or @hash{strings}. The slice is the list of scalars at the specified indices.
    %a = (English  => "one",  French   => "un",   German  => "ein",
          Finnish  => "yksi", Japanese => "ichi", Chinese => "yi");
    @s = @a{"German", "English"};
    print "@s\n";
  should result in: ein one
– The order of the returned list is well defined because the order of the indices is well defined.
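
Slices work the same way for reading, deleting, and for arrays. The sketch below puts the three uses together; the hash name %one and the sample array are illustrative.

```perl
use strict;
use warnings;

my %one = (
    English => "one",  French   => "un",   German  => "ein",
    Finnish => "yksi", Japanese => "ichi", Chinese => "yi",
);

my @picked = @one{"German", "English"};   # hash slice; order follows the keys given
delete @one{"French", "Finnish"};         # delete a slice of two keys

my @array  = qw(a b c d e);
my @middle = @array[1 .. 3];              # array slice: elements 1, 2, and 3

my @left = sort keys %one;                # sorted so the output is stable
print "@picked | @middle | @left\n";
# prints "ein one | b c d | Chinese English German Japanese"
```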

Operators
– A fairly standard set of mathematical, logical, and relational operators exists. The only somewhat exceptional one is the power operator **; for example, 2**3 is 8.
– Scalars can be either numbers or strings depending on the context they are used in, and this is exactly what operators do: they force a context on their operands. For example, the + operator forces numeric context on its operands and sums them, while the . (dot) operator forces string context on its operands and concatenates them.
    ($b, $c) = (2, 3);
    $a = $b + $c;
    $d = $b . $c;
    print "$a $d\n";
  will print: 5 23

Operators: Specialties
Perl has these:
– separate sets of comparison operators for string and numeric context
– generalized comparison operators cmp and <=>
– low-precedence and, or, and not (&&, ||, and ! are high precedence)
– string concatenator . (dot) and string/list repeater x
– left-quoting pseudo-comma => and range generator ..
– quoting operators
– file input (and filename globbing) operator <>
– pattern matching, substitution, and binding operators: m, s, =~, and !~

Operators: Boolean
– The && and || operators are short-circuiting as in C: they stop evaluating their operands as soon as the first decisive value is met (the first false for &&, the first true for ||).
– There are also variants of lower precedence: and, or, xor, and not.

Operators: Precedence
– Precedence rules are much like those in C, C++, or Java.
– Some confusion may stem from the fact that when calling functions (either built-in or user defined), the parentheses are not required. In other words, these are equivalent:
    $n = length($header);
    $n = length $header;
– This would be easy enough to comprehend, but things get interesting when the functions have more than one argument, and especially interesting when the number of arguments varies.
– When in doubt, parenthesize.

Operators: Assignments, Valued and Modifiable
– Assignments have values.
    $a += ($b = $c);
  This copies the value of $c to $b and then adds that to $a.
– Assignments are modifiable.
    ($c = $d) += @e;
  This copies the value of $d to $c and then adds the number of elements in @e to $c.
– In other words, the left side of an assignment can be used in a further assignment (this property is often called “lvalue”, left-value).

Operators: Assignments, List Context
– Assignment also works in list context.
    ($a, $b) = ($b, $a);
  This swaps the values of $a and $b.

Operators: Comparing
– For comparing scalars with each other and with literal values, all the usual comparison operators exist. The catch is that there are two sets of comparison operators: one for comparing in string context and one for comparing in numeric context.
– The cmp and <=> operators return -1, 0, or 1 depending on whether the comparands fulfil the less-than, equal-to, or greater-than relation.
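
The two sets of operators give different answers for the same data, which matters most when sorting. A small sketch with illustrative variable names; the comment lines list the standard Perl operators in each set.

```perl
use strict;
use warnings;

# String context:  eq ne lt gt le ge cmp
# Numeric context: == != <  >  <= >= <=>
my $as_strings = "10" lt "9" ? "true" : "false";   # compared character by character
my $as_numbers = 10   <  9   ? "true" : "false";   # compared as numbers

my @n = (10, 9, 100, 1);
my @by_string = sort { $a cmp $b } @n;   # dictionary order
my @by_number = sort { $a <=> $b } @n;   # numeric order

print "$as_strings $as_numbers\n";   # prints "true false"
print "@by_string\n";                # prints "1 10 100 9"
print "@by_number\n";                # prints "1 9 10 100"
```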

Operators: Concatenate and Repeat
– Scalars can be concatenated (as strings) using the . (dot) operator.
    $a = "con";
    $b = "nate";
    $c = $a . "cate" . $b;
  will result in $c being "concatenate".
– Scalars and lists can be repeated using the x operator. The repetition count comes after the operator. To repeat a list, the left operand must be in parentheses.
    $a = "Yes";
    @b = ($a, "No");
    $c = $a x 3;
    @c = (@b) x 2;
    print "@c, $c!\n";
  will print: Yes No Yes No, YesYesYes!

Operators: Range Generator ..
– The .. operator can be used to generate ranges of values (as lists) between two scalar endpoints. This works both for numbers and strings. The rules for “incrementing strings” to generate ranges are the same as for the ++ operator.
    @a = 0 .. 4;
    @b = "aa" .. "zz";
    print "@a @b[0 .. 4] @b[-4 .. -1]\n";
  will result in: 0 1 2 3 4 aa ab ac ad ae zw zx zy zz

Operators: Quoting
– Single quotes do not interpolate (do not expand) variables; double quotes do.
– Double quotes also have several special constructs triggered by the backslash (\), for example \n for a logical newline and \U for uppercasing the value of the following variable.
    if ($need eq 'urgent') { print "\U$task\n" }
    else                   { print "$task\n" }

Operators: Here-Documents (I)
– A quoting mechanism called here-documents (inherited from UNIX shell scripts) makes it easy to include multiline blocks of text.
– The syntax is <<terminator (the terminator can be any “word”: alphanumerics and underscores), followed by the lines of text, and ended by the terminator string alone on a line, starting at the beginning of that line.
    $message = <<HERE_IS_THE_MESSAGE;
    One
    Two
    Three
    HERE_IS_THE_MESSAGE
– This is equivalent to
    $message = "One\nTwo\nThree\n";

Operators: Here-Documents (II)
– The here-document mechanism can be used either in a single-quoted way (variables not interpolated) or in a double-quoted way (variables interpolated).
– If there are no quotes around the terminator after the <<, the double-quoted case is implicitly assumed, so a literal @ or $ must be escaped with a backslash. Explicit single quoting can be achieved by enclosing the terminator in single quotes.
    $x = <<EOF;
    no \@expansions
    EOF
    $y = <<'EOF';
    no @expansions
    EOF
– After this, $x and $y both hold the text “no @expansions” followed by a newline.

Operators: Here-Documents (III)
– If there are scalars or arrays in the here-block that need to be interpolated, double quotes around the terminator make that explicit.
    @x = qw(a bc def);
    $x = <<"EOH";
    This
    @x
    is it.
    EOH
– This leaves $x equal to "This\na bc def\nis it.\n".
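
The two quoting styles can be compared directly. In this sketch the terminator EOH is reused for both here-documents and the variable names are illustrative.

```perl
use strict;
use warnings;

my @x = qw(a bc def);

# Double-quoted terminator: variables are interpolated.
my $interpolated = <<"EOH";
This
@x
is it.
EOH

# Single-quoted terminator: the text is taken literally.
my $literal = <<'EOH';
no @expansions
EOH

print $interpolated, $literal;
```

The single-quoted form is the safe choice for text that contains @ or $ characters, such as e-mail addresses.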

Operators: Input Operator <>
– After having opened a file one gets a handle. The “next item” can be read from the handle using the angle operator <>.
    open(X, "FileName") or die "$0: failed to open FileName: $!\n";
    $line = <X>;
– This will read the first line of the file called “FileName” into $line.
– The $0 is a special variable that contains the name of the script. The $! is a special variable that contains the error message caused by the latest failed system call, such as trying to open a file. In numeric context, $! is the numeric error code.
– The definition of “next item” for files is the next line. Directory handles, opened with opendir, are read with readdir instead, which returns the next filename.

Control Structures
– A very rich selection of flow control structures is available.
– Control structures control blocks, that is, statements enclosed in { }.
    if ($a >= $b + $c) { $a = 0; } else { $a++; }

Control Structures: if, unless, elsif, else
– The if control structure works as might be expected: the following block is executed if the condition is true.
    if ($x <= 100) { … }
– The unless is the logical opposite of if: the block following it is executed if the condition is false.
    unless ($x > 100) { … }
– Further alternatives are chained with elsif and else:
    if ($x < 10) { … } elsif ($x < 20) { … } elsif ($x < 30) { … } else { … }

Control Structures: Boolean
– If you want to use a “bare” scalar as the condition (in Boolean context), note that there is no need for a separate Boolean type. Any scalar will do as the condition test of the control structures. The rules are as follows:
    – Everything is interpreted as a string.
    – Numbers are interpreted as strings.
    – undef is interpreted as the empty string "".
    – The strings "0" and "" are false.
    – All other strings are true.
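
The rules above have a few surprising consequences, for example that the strings "0.0" and "00" are true. A minimal sketch; the helper name truth is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: report how Perl treats a value in Boolean context.
sub truth { return $_[0] ? "true" : "false" }

# The first four values are false; everything after them is true.
my @results = map { truth($_) } (0, "0", "", undef, "0.0", "00", " ", 1, "a");
print "@results\n";
# prints "false false false false true true true true true"
```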

Control Structures: while, until, do
– The if and unless are one-shot; if you want to keep on executing the block of statements, use while and until.
    while ($x < 100) { … }
    until ($x > 102) { … }
– If you want to test the condition after at least one round of execution has been completed, use do.
    do { … } while ($x < 100);
    do { … } until ($x > 182);

Control Structures: for, foreach
– The for can be used as in C to initialize a state, to test for termination of the loop, and to do something at the end of each round.
    for ($i = 0; $i < 100; $i++) { … }
– The foreach can be used to iterate over a list of scalars.
    @a = qw(as we qwe);
    foreach (@a) { print $_, "\n" }
– What happens here is that the default scalar $_ is aliased in turn to each of the values in the list.
– You can use some other variable instead:
    @a = qw(as we qwe);
    foreach $b (@a) { print $b, "\n" }

Control Structures: foreach Aliasing
– The aliasing that takes place in the foreach when iterating over a list is very real: if you change the loop variable, you change the scalar in the list.
    @a = qw(abc def ghi);
    foreach (@a) { if ($_ eq "def") { $_ = "xyz" } }
    print "@a\n";
– This will output: abc xyz ghi
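
Aliasing is convenient for in-place updates, but when the original list must survive, the usual approach is to build a new list. A small sketch of both, with illustrative variable names:

```perl
use strict;
use warnings;

my @words = qw(abc def ghi);

# $_ is an alias for each element, so this changes @words in place.
$_ = uc $_ for @words;

# To leave the original untouched, build a new list instead.
my @original = qw(abc def ghi);
my @changed  = map { ucfirst $_ } @original;

print "@words | @original | @changed\n";
# prints "ABC DEF GHI | abc def ghi | Abc Def Ghi"
```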

Control Structures: while (<handle>)
– A very common construct is while (<handle>), which means “while there are lines in handle, keep reading them into the default scalar, $_”.
    open(X, "Input") or die "$0: failed to open Input: $!\n";
    while (<X>) { print; }
    close X;
– The print without an explicit argument implicitly outputs the default scalar (to the standard output). The above therefore copies (prints) the contents of the file called “Input” to the standard output, and finally closes the file.
– The above loop is shorthand for
    while (defined($_ = <X>)) { print $_; }

Subroutines (I)
– A subroutine is defined with the sub keyword, and adds a new function to your program's capabilities. When you want to use this new function, you call it by name. For instance, here's a short definition of a sub called boo:
    sub boo { print "Boo!\n"; }
    boo();    # Eek!

Subroutines (II)
– In the same way that Perl's built-in functions can take parameters and can return values, your subs can, too.
– Whenever you call a sub, any parameters you pass to it are placed in the special array @_. You can also return a single value or a list by using the return keyword.
    sub multiply {
        my (@ops) = @_;
        return $ops[0] * $ops[1];
    }
    for $i (1 .. 10) {
        print "$i squared is ", multiply($i, $i), "\n";
    }
– The my indicates that the variables are private to that sub, so that any existing value for an @ops array used elsewhere in the program won't get overwritten.
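
Parameters are often unpacked from @_ into named variables, and a sub can return a list as well as a single value. The sketch below shows both; min_max is a hypothetical helper added for illustration.

```perl
use strict;
use warnings;

# Unpack @_ into named lexicals; my keeps them private to the sub.
sub multiply {
    my ($x, $y) = @_;
    return $x * $y;
}

# Hypothetical helper: a sub can also return a list.
sub min_max {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);
}

my $square      = multiply(7, 7);
my ($min, $max) = min_max(31, 14, 15, 92, 65);
print "$square $min $max\n";   # prints "49 14 92"
```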

Formats
– If you want to produce text reports that have a certain layout, a feature called formats allows you to create “pictures” or templates of the required output. The templates contain both the layout and the variables to fill in the layout.
printf, sprintf
– The most common formatted printing tool is printf, and its string-producing twin, sprintf.
    printf "%5.3f [%-5s] [%5s]\n", 1/7, 'abc', 'def';
  will produce: 0.143 [abc  ] [  def]
    $s = sprintf "%5.3f [%-5s] [%5s]\n", 1/7, 'abc', 'def';
    print $s;
– The sprintf returns the formatted output string.
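
Each conversion in the format string controls width, alignment, and precision. The sketch below annotates the conversions used on this slide and adds one more (%03d) for illustration.

```perl
use strict;
use warnings;

# %5.3f : floating-point number, minimum width 5, 3 decimals
# %-5s  : string, left-aligned in 5 columns
# %5s   : string, right-aligned in 5 columns
# %03d  : integer, zero-padded to 3 digits (added here for illustration)
my $row = sprintf "%5.3f [%-5s] [%5s] %03d", 1/7, 'abc', 'def', 7;
print "$row\n";   # prints "0.143 [abc  ] [  def] 007"
```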

Formats: Defining (I)
– Formats are defined using the format keyword.
    format REPORT =
    …
    .
– The REPORT is the name of the filehandle for which the format is to be used. If not specified, STDOUT is assumed.
– The format definition ends with a . (dot) alone at the beginning of a line.

Formats: Defining (II)
    format REPORT =
    …
    .
– The actual format definition, between the format line and the terminating dot, consists of three kinds of lines:
    – comment lines, which begin with #
    – picture lines, which contain format strings
    – argument lines, which supply the arguments for the format strings

Formats: Defining (III)
– The picture lines are printed as-is, except for certain special fields that begin with @ (this is not an array) or ^:
    @<<<<    left-align a string
    @||||    center a string
    @>>>>    right-align a string
    @#.##    right-align and format a number
– The <<<<, ||||, and >>>> can naturally be as wide as required, not just four characters as shown here. The aligning and centering will be adjusted to match the width of the field.
– The usage of the ^-prefixed field is more complex and is explained in more detail in the Perl formats documentation, perlform.

Formats: Defining (IV)
– The argument lines contain enough variables, separated with commas, to fill in the picture lines above them.
    @<<<<< @||||| @###.###
    $month, $project, $budget_millions

Formats: Using
– Using the formats is much easier than defining them. The built-in write function fetches the template and the variables to be used and then outputs the filled-in template.
– The write can be called with or without a filehandle name. With a filehandle name, the format of that filehandle is used. Without a filehandle, the format of the currently selected default output handle is used. The default output filehandle is STDOUT, but this can be changed using the select() built-in function.

Formats: Using
A simple example using formats (the code is indented here for display only; in a script the format lines start in the first column):
    format STDOUT =
    [@<<<<<<] @#.### [@>>>>>>>>]
    $x, $y, $z
    .
    ($x, $y, $z) = ('left', 1/7, 'right');
    write;
This will produce
    [left   ]  0.143 [    right]
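The steps above (define a format, fill the variables, call write) can be sketched end to end as follows. This is only a minimal sketch: the sample rows are hypothetical, and the in-memory filehandle is an assumption made here so that the report can be inspected as a string; the slides write to ordinary filehandles.

```perl
use strict;
use warnings;

# Hypothetical sample data: [month, project, budget in millions].
my @rows = (
    ['Jan', 'Apollo', 1.5],
    ['Feb', 'Gemini', 12.25],
);

# Package variables, so the format below can see them under strict.
our ($month, $project, $budget_millions);

format REPORT =
@<<<<< @||||| @###.###
$month, $project, $budget_millions
.

# Assumption for illustration: collect the output in a string
# by opening the REPORT handle on a scalar reference.
my $report = '';
open(REPORT, '>', \$report) or die "Cannot open in-memory handle: $!";
for my $row (@rows) {
    ($month, $project, $budget_millions) = @$row;
    write REPORT;    # uses the format with the same name as the handle
}
close(REPORT);

print $report;
```

Because the format is named after the filehandle, write REPORT picks it up without any extra setup.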

List Processing
– Several useful functions are available for inspecting and processing lists:
    grep
    map
    sort
    reverse

List Processing
– To find the elements of a list that satisfy some criterion, use the grep function. It has two slightly different syntaxes, which work the same way. The two statements below compute the same result.
    @b = grep($_ > $max, @a);
    @c = grep { $_ > $max } @a;    # Note: no comma after the block
– Inside the expression or the block, the default scalar $_ is aliased to each element of the list (here, an array) in turn. The expression, or the expressions in the block, are evaluated, and if the last value is true, the element is returned.
    @a = (31, 14, 15, 92, 65, 35, 89, 79);
    @b = grep { $_ > 50 } @a;
    print "@b\n";
This will result in
    92 65 89 79
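A runnable sketch of the two grep forms described above, using the slide's data. The scalar-context count at the end is an addition for illustration, not something the slide shows.

```perl
use strict;
use warnings;

my @a   = (31, 14, 15, 92, 65, 35, 89, 79);
my $max = 50;

# Expression form: note the comma after the expression.
my @b = grep($_ > $max, @a);

# Block form: no comma after the block.
my @c = grep { $_ > $max } @a;

# In scalar context grep returns the number of matching elements.
my $count = grep { $_ > $max } @a;

print "@b\n";
print "@c\n";
print "$count\n";
```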

Aside: The Command Line Arguments: @ARGV
– This will echo those command line arguments that are greater than the first argument.
    my $I = shift;
    foreach (@ARGV) { print "$_\n" if $_ > $I }
– Outside any subroutine definition, shift without an explicit array argument processes the @ARGV array, the arguments of the program.

List Processing: map Transform
– To get a copy of a list with the elements transformed through some mapping, use the map function. Like grep, it has two forms: with a block (and no comma) and with an expression (and a comma). The last value is returned for each element, and from those values a new list is constructed.
    @a = qw(this little piggie went to the market);
    @b = map { length } @a;
    print "@b\n";
– This will print
    4 6 6 4 2 3 6
– length without an argument works on $_. The $_ is again an alias, so if you modify it you will modify the elements of the original list.
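As a small sketch of both map forms, assuming nothing beyond the slide's word list: the block form computes lengths, and the expression form builds an upper-cased copy. Neither modifies $_, so the original list is left alone.

```perl
use strict;
use warnings;

my @words = qw(this little piggie went to the market);

# Block form (no comma): length works on $_ by default.
my @lengths = map { length } @words;

# Expression form (with a comma): returns a transformed copy.
my @upper = map(uc($_), @words);

print "@lengths\n";
print "$upper[0] $words[0]\n";
```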

List Processing: foreach Transform
– If you want to modify the elements of a list destructively, don't waste map on it; use foreach.
    @a = 1..5;
    foreach (@a) { $_ **= $_ }
    print "@a\n";
– This will output
    1 4 27 256 3125

List Processing: sort
– One of the most common tasks in computing is arranging information: sorting. It is so important that there is a highly tuned and highly tunable built-in function for it, sort. It has three alternative syntaxes:
    @b = sort subname @a;
    @b = sort { … } @a;
    @b = sort @a;
– The first two are equivalent: they carry the comparison as their first argument, either as the name of a subroutine or as an inlined block.
– The third one has an implied sorting order: stringwise comparison.

List Processing: sort
– The subroutine or the block gets "magically" (meaning that you don't have to care how they appear in there) passed two arguments called $a and $b. The subroutine or the block must then compare them and return something less than zero, zero, or something greater than zero, depending on how the comparands are ordered.
– Examples:
    @b = sort { $a <=> $b } @a;            # sort as numbers
    @c = sort { $b <=> $a } @a;            # sort as descending numbers
    @d = sort { lc($a) cmp lc($b) } @a;    # sort case-insensitively
    # sort the keys of %d by their numeric values
    @e = sort { $d{$a} <=> $d{$b} } keys %d;
– The $a and $b are passed as aliases, so don't modify them.
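The comparison styles listed above can be tried on a small data set. The lists and the %price hash are hypothetical sample data made up for this sketch.

```perl
use strict;
use warnings;

my @nums  = (10, 9, 100, 1);
my @names = qw(pear Apple banana);
my %price = (apple => 3, pear => 1, plum => 2);    # hypothetical data

my @as_strings  = sort @nums;                          # default: stringwise
my @ascending   = sort { $a <=> $b } @nums;            # numeric
my @descending  = sort { $b <=> $a } @nums;            # numeric, descending
my @ignore_case = sort { lc($a) cmp lc($b) } @names;   # case-insensitive
my @by_price    = sort { $price{$a} <=> $price{$b} } keys %price;

print "@as_strings\n";
print "@ascending\n";
print "@descending\n";
print "@ignore_case\n";
print "@by_price\n";
```

The first line of output shows why the default order is rarely what you want for numbers: as strings, "100" sorts before "9".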

List Processing: reverse
– A list can be reversed by the reverse function.
    @a = qw(function reverse the by reversed be can list A);
    @b = reverse @a;
    print "@b.\n";
– Will print:
    A list can be reversed by the reverse function.

String Processing: index, rindex
– To find a substring in a string, use the index function. It returns the offset of the substring (more precisely, the offset of its first occurrence). If the substring cannot be found, -1 is returned.
    print index("foobar", "bar"), "\n";
    print index("foobar", "baz"), "\n";
This will print:
    3
    -1
– index doesn't do any pattern matching, just literal strings. To find the last occurrence of a substring in a string, use the rindex function.
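A short sketch contrasting index and rindex on a string that contains the substring twice. The sample string and the use of the optional third argument of index (a start position) are additions for illustration.

```perl
use strict;
use warnings;

my $s = "foobar foobar";

my $first   = index($s, "bar");                # first occurrence
my $last    = rindex($s, "bar");               # last occurrence
my $next    = index($s, "bar", $first + 1);    # search from a start position
my $missing = index($s, "baz");                # not found

print "$first $last $next $missing\n";
```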

String Processing: substr
– To extract and modify substrings by position and length, the substr function is available.
    $a = "I would like to buy a camel, please.\n";
    print "I like ", substr($a, 22, 5), "s, of course.\n";
    substr($a, 22, 5) = "llama";
    print "But $a";
This will output
    I like camels, of course.
    But I would like to buy a llama, please.
– The offset and length parameters can be negative: a negative offset counts from the end of the string, and a negative length leaves that many characters off the end.
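The negative offset and length can be sketched as follows, reusing the slide's sentence without the trailing newline. The four-argument form of substr at the end is an alternative to assigning to substr; it is shown here as an addition, not as the slide's method.

```perl
use strict;
use warnings;

my $s = "I would like to buy a camel, please.";

my $animal  = substr($s, 22, 5);    # by offset and length
my $tail    = substr($s, -7);       # negative offset: count from the end
my $trimmed = substr($s, 0, -9);    # negative length: leave 9 characters off the end

# Four-argument form: replaces in place and returns what was there before.
my $old = substr($s, 22, 5, "llama");

print "$animal | $tail | $trimmed\n";
print "$old -> $s\n";
```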

String Processing: chomp
– The < > operator leaves the newlines on the lines it returns. To remove the newlines, use the chomp operator:
    while (<STDIN>) {
        chomp;                  # operates on $_
        print STDERR "[$_]\n";
    }
– This will remove the possible newline, and then print the reformatted lines to the standard error stream.

String Processing: split
– With split you can cut a string into a list. The first argument of split is the separator pattern. The second, optional, argument is the string to be split; if it is omitted, $_ is split.
    @b = split(/,/, $a);
    @c = split /:/;
– Because the first argument is a pattern, full understanding of split requires understanding of patterns, and therefore we will revisit split later. There is a commonly used special case, however: using ' ' as the separator.
    @b = split(' ', $a);
– This will split $a on any amount of any whitespace characters, and any leading and trailing empty fields (resulting from leading or trailing whitespace) will be dropped (not unlike the qw operator).

String Processing: join
– You can combine several scalar strings into a single string by using the join function.
    $record = join(",", @field);
– Note: split and join are NOT symmetrical, because the separator of split is a pattern, whereas the separator of join is a literal string.
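The split and join behaviour described above can be sketched together. The input strings are made up for illustration.

```perl
use strict;
use warnings;

# Special case ' ': split on runs of whitespace, ignoring leading whitespace.
my $line  = "  alpha  beta\tgamma \n";
my @words = split(' ', $line);

# join takes a literal separator string.
my $csv = join(",", @words);

# split takes a pattern, so the separator may vary from field to field.
my @loose = split(/\s*,\s*/, "a , b,c");

# Trailing empty fields are dropped; empty fields in the middle are kept.
my @trailing = split(/,/, "a,b,,");
my @middle   = split(/,/, "a,,b");

print "$csv\n";
print scalar(@loose), " ", scalar(@trailing), " ", scalar(@middle), "\n";
```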

String Processing: reverse
– The reverse function applied to a scalar, in scalar context, produces a scalar reversed characterwise.
    $b = reverse("yes or no.");
    print $b, "\n";
This will print
    .on ro sey

Pattern Matching
– Perl has very powerful pattern matching capabilities.
– The basis of pattern matching is the regular expression.
– Regular expressions are a language of their own within Perl.

Matching: The m operator
– The m operator is used to match a pattern against a piece of text, a scalar. It returns true if a match can be found, false if not.
– A strange thing about the m operator is that the m is quite often not written at all; instead the match is written like this:
    /pattern/
– That is, the pattern between forward slashes. The pattern can even contain whitespace, and the whole thing is still understood as a pattern matching operator.
– To better understand how the m operator works and how it can be expressed without the m, we need to take another look at the quoting rules.

Aside: Generalized Quoting (I)
– The 'single quotes', "double quotes", and `backquotes` are in fact just "syntactic sugar" for the real quoting operators q, qq, and qx.
– The syntax of the quoting operators is very flexible, because which characters should delimit the quote depends on its contents (you do not want to use as a delimiter a character that appears in the quote, unless you are fond of using backslashes). The delimiters are selected "dynamically", based on which character follows the quoting operator. The delimiters can either be of the paired kind (like parentheses), or they can be the same character (like the usual quotes). Examples:
    $a = q(this is single-quoted);
    $b = qq[this is double-quoted, @variables expand];
    $c = qx!echo this goes to the operating system!;
– The qw we met earlier is another example of the quoting operators: a quoting construct that produces a list.

Aside: Generalized Quoting (II)
– The matching operator m has one double-quotish argument.
    m{pattern}
    m/pattern/
– The substitution operator s has two double-quotish arguments.
    s/pattern/substitute/
    s(pattern)(substitute)

Matching: Binding
– The string to be matched against is either the default scalar $_ or something else, selected by using the binding operator =~ or its logical opposite, !~.
– By using the default scalar and leaving out the m, we arrive at the very common idiom
    if (/pattern/) { … }
– Written out in full, that would be
    if ($_ =~ m/pattern/) { … }

Matching: The Basic (I)
– The basic set of regular expression matching operators is the triad
    .    any character
    *    repeat
    |    alternative
– With these and the parentheses for grouping, you can already construct perfectly fine regular expressions.
– Alphanumeric characters like a, b, 1, 2 match themselves.
– The . does not really match any character: rather, it matches any non-newline character.

Matching: The Basic (II)
– Regular expressions match substrings, not whole strings, unless anchored to do otherwise.
– In other words, the pattern need not match all of the target string. For example, /q/ will match the string "sesquipedalian" just fine.
– Neither do regular expressions care about "words" (unless instructed to): the input string is just a flat string of characters with no implicit tokenization.

Matching: The Basic (III)
Some examples:
    /apple/            # matches "apple"
    /bana*/            # matches "ban", "bana", "banaa", "banaaaa", …
    /ba(na)*/          # matches "ba", "bana", "banana", "bananana", …
    /(orange|cherry)/  # matches "orange" or "cherry"
    /(pear|plum)*/     # matches e.g. "pear", or "plumpear", …
– Beware of further punctuation characters: many of them have special meanings in regular expressions. Collectively they are called metacharacters. If you want to match a punctuation character as itself, protect it by prefixing it with a backslash: \* matches the literal *

Matching: The Anchors
– If you want to match only at the beginning of the target string, or only at the end, or at both (this is what you might call "match the whole string"), use the anchoring metacharacters.
    ^    beginning
    $    end
– For example, /^abc/ will match abc only at the beginning of the string. The $ matches either at the end of the string (if there is no newline at the end) or before the last newline, if that is the last character of the string. If you want to match a newline, use \n.
– Again, if you need to have these characters matched literally, backslash them: \^, \$
– Anchors are the most often used assertions: they do not themselves extend the match, but they test whether a condition (or a set of conditions) holds.

Matching: Character Classes
– You could use the alternation, |, for single characters, but that would be awfully inefficient. If you have a known set of single characters, use character classes.
– Enclose the set of characters inside square brackets, [ ], and you are done. Inside the brackets the metacharacters lose their specialness: for example the dot, ., is just a dot, no longer match-any-character. For example, [ab*] will match any of a, b, or *.
– If you need to match "any character except these", you can negate the character class by putting ^ as the first thing after the opening bracket. For example, [^abc] will match any character that is not a, b, or c.

Matching: Built-In Character Classes
– For certain commonly used character classes and their negations there are shorthands available.
    \w    any alphanumeric or the underscore
    \d    any decimal digit
    \s    any space character
    \W    non-\w
    \D    non-\d
    \S    non-\s
– These can be used either inside or outside the [] character classes.
– Note that the definition of \w (and \W) depends on your locale. If you want, say, an accented letter to be understood as a \w character, you need to have your locale correctly set up and you must say use locale at the top of your script. (The locale system is a UNIX only feature)

Matching: Further Repetition
– With the basic repetition operator * and alternation (|) with an empty string you can construct repetitions of zero or more, one or more, …, but the expressions can get rather ugly. Use the following handy postfix notations instead:
    ?     not at all or once
    +     once or more
    {}    {n}, {min,max}

Matching: Capturing Sub-matches
– Often you will want to save parts of the match to process them further.
– Parentheses let you do that.
– Each sub-match corresponding to a parenthesized sub-pattern is saved away in a special variable named $1, $2, and so on.
    "thx-1138" =~ /([a-z]+)-(\d+)/
– After this, $1 = "thx" and $2 = "1138".
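A minimal sketch of capturing, assuming the hyphenated form "thx-1138" as the input string. It shows both the $1, $2 variables and the list-context alternative, in which the match returns the captured sub-matches directly.

```perl
use strict;
use warnings;

my $id = "thx-1138";

# Test the match before trusting $1 and $2: after a failed
# match they still hold the values from an earlier success.
my ($letters, $digits);
if ($id =~ /([a-z]+)-(\d+)/) {
    ($letters, $digits) = ($1, $2);
}

# In list context the match returns the sub-matches themselves.
my ($l, $d) = $id =~ /([a-z]+)-(\d+)/;

print "$letters $digits\n";
print "$l $d\n";
```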

Matching: Word Boundary Assertions
– If you need to match at word boundaries, in other words at places where a "word" (a string of alphanumeric characters) begins or ends, you can use the word boundary assertion \b.
    "to be or not to be" =~ /\b(..t)\b/;
– This will save "not" to $1, because that is the only place where word boundaries are three positions apart and there is a t at the third position.
– If you want to match at a non-boundary, use \B.

Matching: Not so fast: Stingy Matching
– The default behaviour of a regular expression is to gobble up as much as possible, that is, to make as long a match as possible. In other words, each repetition operator extends as far as possible. This is called greedy matching.
– Sometimes this is not what you want. If you want to stop as soon as possible, you can use stingy matching. Changing the repetition operators to stingy is as simple as appending a ? to them: *?, ??, +?. Example:
    "(a, b) (c, d)" =~ /(\(.*?\))/
– This will set $1 = '(a, b)', not '(a, b) (c, d)'. The .*? will stop at the first ) instead of continuing to match all the way to the last possible ) at the end of the string.
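The difference between greedy and stingy repetition can be checked side by side with the slide's string:

```perl
use strict;
use warnings;

my $text = "(a, b) (c, d)";

# Greedy: .* runs to the last possible closing parenthesis.
my ($greedy) = $text =~ /(\(.*\))/;

# Stingy: .*? stops at the first closing parenthesis.
my ($stingy) = $text =~ /(\(.*?\))/;

print "greedy: $greedy\n";
print "stingy: $stingy\n";
```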

Matching: Case (In)Sensitiveness
– Often the case of letters may differ from what you would expect. The matching operator can have modifiers after the "closing quote" that affect its behaviour. One possible effect is to ignore any case differences, for example so that Perl and perl are considered equal. The modifier for this is i.
– This
    /pc/i
– will match pc, Pc, pC and PC.

Matching: Non-Capturing Sub-matches
– Sometimes you need to use parentheses just for grouping (to use the | alternation, for example) but you do not want to collect the sub-matches. The (?: …) construct is for this situation: it is exactly like the ordinary parentheses, but the sub-matches are not recorded in $1 etc.
    while (<>) {
        print if /^(?:de|in|im|un)/;
    }
– This does not set $1.
– Note also the special < > construct. Its meaning is: "iterate over all command line arguments (@ARGV), interpret them as filenames, open the files, and iterate over all the lines in those files".

Matching: I want them all: Global Match
– Normally a regular expression matches only once, returning the first successful match.
– However, if you want to keep matching, you can do that using the "global" modifier, /g. The match remembers its position in the scalar, which means that you can use the /g match as the control condition of a repetitive control structure such as
    while (/(\d+)/g) {
        print $1, "\n";
    }
– In addition to the usual Boolean (scalar) behaviour of the matching operator, the /g modifier introduces a new behaviour: in list context all the possible matches are returned.
    @m = /(\d+)/g;
– This will return all the numbers in $_ and assign them to @m.
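Both uses of /g, the scalar-context loop and the list-context match, can be sketched on a made-up sample string. The counting idiom on the last assignment is an addition for illustration.

```perl
use strict;
use warnings;

my $text = "3 apples, 14 pears and 159 plums";    # hypothetical input

# Scalar context: each iteration continues where the last match ended.
my @in_loop;
while ($text =~ /(\d+)/g) {
    push @in_loop, $1;
}

# List context: all matches are returned at once.
my @all = $text =~ /(\d+)/g;

# Counting matches by assigning the list to an empty list first.
my $count = () = $text =~ /\d+/g;

print "@in_loop\n";
print "@all\n";
print "$count\n";
```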

Matching: Iterative Matching
– If you need to match exactly where the previous /g match (if any) left off, you can use the \G assertion.
    while (/\G(\d+)./g) {
        print $1, "\n";
    }
– This will match only as long as the numbers are separated by exactly one character. The offset of a match can be returned (and forged!) with the pos function.
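To see what \G changes, the following sketch runs the same pattern with and without it on a made-up string in which the chain of "number plus one separator" is broken by an extra character.

```perl
use strict;
use warnings;

my $text = "10,20,30;x40";    # hypothetical input

# With \G each match must start exactly where the previous one ended,
# so the loop stops at the "x".
my @anchored;
while ($text =~ /\G(\d+)./g) {
    push @anchored, $1;
}

# Without \G the engine skips ahead and finds a further match
# inside "40": the digit 4 followed by one more character.
my @unanchored = $text =~ /(\d+)./g;

print "@anchored\n";
print "@unanchored\n";
```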

Matching: Multi-line Matching
– Normally, regular expressions match only within one line, but it is possible to match over multiple lines.
– First you need to get a "multi-line" scalar. If you are reading the text to be matched against from a file, the easiest way to do that is to set the special variable $/ as necessary. This variable controls what a "line" is. If you set it to '' (an empty string), whole paragraphs (text blocks separated by empty lines) are read. If you set it to undef, the whole file is read as one big scalar. You will most probably want to use local to localize your change of $/.
– Secondly, you need to decide what you actually mean by multi-line matching. Do you mean that the dot (.) should also match newlines? Or do you mean that ^ and $ should match at the "internal" newlines, not just at the beginning and end of the string?
– For these two interpretations of "multi-line", two different modifiers are available: /s and /m. They can be used simultaneously, as is usual with modifiers: /gims is perfectly fine.
– /m Treat a target string containing newline characters as multiple lines. In this case, the anchors ^ and $ are the start and end of a line; \A and \Z anchor to the start and end of the string, respectively.
– /s Treat a target string containing newline characters as a single string, i.e. the dot matches any character including newline.

Matching: Multi-Line Assertions
– Because /m changes the definitions of ^ and $ but you might still need the old definitions, they are available via alternative assertions:
    \A    beginning of the string
    \Z    end of the string, or before a newline at the end
    \z    end of the string
– The first two are like the old ^ and $; the last one matches only at the very end of the string, even when the string ends with a newline.
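The two meanings of "multi-line" and the string-level anchors can be sketched on a two-line string. The sample text is made up for illustration.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# Without /m, ^ matches only at the start of the string.
my @starts_default = $text =~ /^(\w+)/g;

# With /m, ^ also matches after each internal newline.
my @starts_m = $text =~ /^(\w+)/mg;

# Without /s the dot stops at a newline; with /s it crosses it.
my $dot_default = ($text =~ /first.*second/)  ? 1 : 0;
my $dot_s       = ($text =~ /first.*second/s) ? 1 : 0;

# \Z tolerates a final newline; \z insists on the very end.
my $cap_z_upper = ($text =~ /line\Z/) ? 1 : 0;
my $cap_z_lower = ($text =~ /line\z/) ? 1 : 0;

print "@starts_default | @starts_m\n";
print "$dot_default $dot_s $cap_z_upper $cap_z_lower\n";
```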

Matching: split Revisited
– Now we know enough about matching to review split.
– The first argument of split is a pattern, like the operand of the m operator. For example:
    @f = split(/,\s*/, $s);
– will split on a comma followed by any amount of whitespace.
– Normally the delimiters corresponding to the pattern are thrown away, but if you want to include the delimiters, surround the pattern with parentheses:
    $s = "foo:bar;zap:foo";
    print join(",", split(/([:;])/, $s)), "\n";
– This will split on colons and semicolons, but it will also return those delimiters as list elements:
    foo,:,bar,;,zap,:,foo
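A short sketch of split with and without capturing parentheses in the pattern, using the slide's string. The comma-separated input for the first call is a made-up example.

```perl
use strict;
use warnings;

# Comma followed by any amount of whitespace.
my @f = split(/,\s*/, "a,b,   c");

my $s = "foo:bar;zap:foo";

# Without parentheses the delimiters are thrown away.
my @plain = split(/[:;]/, $s);

# With capturing parentheses the delimiters are returned as elements.
my @with_delims = split(/([:;])/, $s);

print "@f\n";
print "@plain\n";
print join(",", @with_delims), "\n";
```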

Matching: Limits of regular expressions
– Though regular expressions are powerful, there are some (seemingly) simple tasks they cannot do.
– Most importantly, they cannot match "balanced expressions". For example, ordinary mathematical expressions, which can have arbitrarily nested structures, cannot be matched using regular expressions.
– To do that kind of parsing, more than just patterns is needed: the context (for example, how deep are we right now, and inside which structure) needs to be known, and regular expressions do not give that.

Substitution: The s operator
– The substitution operator s has two quoted arguments: the pattern and the substitution. The pattern part is exactly as with the m operator; the substitution is a double-quotish string in which you can use sub-matches like $1 from the pattern side.
– On the pattern side you can also use constructs like \1, \2, and so on, which refer to the same sub-matches as $1, $2. The \1-style references refer to the ongoing match, while $1 there would refer to the sub-matches of the previous match. Do not use the \1-style constructs outside the pattern side of s.
– For example:
    s/(gold|silver|platinum)/precious metal/;
– This will replace in $_ the first occurrence of either gold, silver or platinum with precious metal.

Substitution: Global Substitution
– For the substitution operator, the "global" modifier /g means to substitute all occurrences of the pattern with the substitution, not just the first one.
    s/(gold|silver|platinum)/precious metal/g;
– This will replace in $_ all occurrences of either gold, silver, or platinum with precious metal.
– You will probably want to use \b here to avoid false substitutions like precious metalfish and quickprecious metals.
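The false substitutions mentioned above, and how \b avoids them, can be sketched as follows. The sample sentence is made up, and the copy-then-substitute idiom is used here only to keep the original string intact for comparison.

```perl
use strict;
use warnings;

my $text = "goldfish and gold, quicksilver and silver";    # hypothetical input

# Copy, then substitute on the copy.
(my $naive = $text) =~ s/(gold|silver|platinum)/precious metal/g;

# \b on both sides restricts the match to whole words.
(my $careful = $text) =~ s/\b(gold|silver|platinum)\b/precious metal/g;

print "$naive\n";
print "$careful\n";
```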

Substitution: Evaluating Substitution
– The substitution side of the s operator is already in double-quotish context, so all variables are interpolated, but sometimes you need to evaluate a piece of code. For this you can use the /e modifier.
    s/(\w+)/reverse($1)/eg;
– This will replace all "words" with their reverses.
– The /e even stacks: you can have arbitrarily many of them: /eeee
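A small sketch of /e with two made-up inputs. The explicit scalar in front of reverse is a defensive choice made for this sketch, so that the string-reversing behaviour does not depend on the context in which the replacement is evaluated.

```perl
use strict;
use warnings;

# Replace each word with its characterwise reverse.
my $text = "stressed desserts";
(my $reversed = $text) =~ s/(\w+)/scalar reverse($1)/eg;

# Any expression works: here every number is doubled.
my $prices = "apples 3, pears 14";
(my $doubled = $prices) =~ s/(\d+)/$1 * 2/eg;

print "$reversed\n";
print "$doubled\n";
```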

Translation (I)
– If you want to map characters, you can use the tr operator.
– Like the s operator, tr has two quoted arguments, but unlike with s, both arguments of tr are single-quotish. They are also not regular expressions.
– The first argument is called the search list, and the second argument is the replacement list.
– Both can contain a range, written by simply putting a - between two characters.

Translation (II)
– To complement the search list, add the /c modifier.
– If the translated character is not found in the replacement list (because the replacement list is too short) and the /d modifier is used, the translated character is deleted from the result. If the /s modifier is used, duplicated translated characters are squashed into one.
    # A Caesar cipher, a.k.a. rot13, in $_
    tr/A-MN-Za-mn-z/N-ZA-Mn-za-m/;
    $a =~ tr/ //s;     # squash multiple spaces into one in $a
    tr/0-9/ /c;        # replace all non-digits with a space in $_
– You can count the number of times a character occurs by translating it into itself, because tr returns the number of translations made.
    $bangs = ($text =~ tr/!/!/);
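The tr modifiers above can be tried on a few made-up strings. The /cd line, which deletes every non-digit, is an additional combination shown for illustration.

```perl
use strict;
use warnings;

# rot13: each letter moves 13 places along the alphabet.
(my $rot13 = "Hello, World!") =~ tr/A-MN-Za-mn-z/N-ZA-Mn-za-m/;

# /s squashes runs of the same translated character.
(my $squashed = "a   b    c") =~ tr/ //s;

# /c complements the search list: every non-digit becomes a space.
(my $spaced = "tel: 555-1234") =~ tr/0-9/ /c;

# /c with /d and an empty replacement list deletes every non-digit.
(my $digits = "tel: 555-1234") =~ tr/0-9//cd;

# tr returns the number of characters it translated.
my $shout = "Hey! Stop! Now!";
my $bangs = ($shout =~ tr/!/!/);

print "$rot13\n$squashed\n$spaced\n$digits\n$bangs\n";
```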

File I/O (Section 6 of Notes)
To read from or write to a file, you have to open it. When you open a file, Perl asks the operating system if the file can be accessed: does the file exist if you're trying to read it (or can it be created if you're trying to create a new file), and do you have the necessary file permissions to do what you want? If you're allowed to use the file, the operating system will prepare it for you, and Perl will give you a filehandle. The following statement opens the file log.txt using the filehandle LOGFILE:
    open(LOGFILE, "log.txt");

File I/O (Section 6 of Notes)
    open(LOGFILE, "log.txt") or die "I couldn't get at log.txt";
    $title = <LOGFILE>;
    print "Report Title: $title";
    for $line (<LOGFILE>) {
        print $line;
    }
    close LOGFILE;

File I/O (Section 6 of Notes)
Writing files
You also use open() when you are writing to a file. There are two ways to open a file for writing: overwrite and append. To indicate that you want a filehandle for writing, you put a single > character before the filename you want to use. This opens the file in overwrite mode.
    open(OVERWRITE, ">overwrite.txt") or die "$! error trying to overwrite";
    # The original contents are gone.

File I/O (Section 6 of Notes)
To open it in append mode, use two > characters.
    open(APPEND, ">>append.txt") or die "$! error trying to append";
    # Original contents still there,
    # we're adding to the end of the file
Once our filehandle is open, we can use the humble print statement to write to it. Specify the filehandle you want to write to and a list of values you want to write:
    print OVERWRITE "This is the new content.\n";
    print APPEND "We're adding to the end here.\n";
    print APPEND "And here too.\n";
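The overwrite, append and read steps can be combined into one round trip. This sketch deliberately swaps in the three-argument form of open with lexical filehandles instead of the slides' two-argument form with bareword handles; the temporary directory, the file name and the file contents are assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Work in a throwaway directory that is removed at exit.
my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, "log.txt");

# Overwrite mode: creates the file, or empties an existing one.
open(my $out, '>', $path) or die "Cannot overwrite $path: $!";
print $out "Report for March\n";
print $out "first entry\n";
close($out) or die "Cannot close $path: $!";

# Append mode: existing contents stay, new lines go to the end.
open(my $app, '>>', $path) or die "Cannot append to $path: $!";
print $app "second entry\n";
close($app) or die "Cannot close $path: $!";

# Read mode: first line as the title, the rest as entries.
open(my $in, '<', $path) or die "Cannot read $path: $!";
my $title = <$in>;
chomp $title;
my @lines = <$in>;
chomp @lines;
close($in);

print "Report Title: $title\n";
print "$_\n" for @lines;
```

Keeping the mode in its own argument means a file name that happens to begin with > or < cannot change how the file is opened.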