awk
Alfred Aho, Peter Weinberger, Brian Kernighan

awk?
• Alfred Aho, Peter Weinberger, Brian Kernighan
• Bell Labs, 1977

You think awk is bad?
Winner of the International Obfuscated C Code Contest (one-liner):

    main(int c, char**v){return!m(v[1], v[2]); }m(char*s, char*t) {return*t-42? *s? 63==*t|*s==*t&&m(s+1, t+1): !*t: m(s, t+1)||*s&&m(s+1, t); }

Awk variants
§ awk: the original, from AT&T (1977)
§ nawk: a newer, improved version, from AT&T (1993)
§ gawk: from the Free Software Foundation (1985-88)

Why awk?
• Excellent filter and report writer.
• Processes data arranged in rows and columns (records and fields).
• Easier to use than most conventional programming languages.
• Often considered a pseudo-C interpreter: it understands the same arithmetic operators as C.
• Has string manipulation functions, so it can search for particular strings and modify the output.
• Has associative arrays, which are incredibly useful.

Pattern Action Pairs
    condition { action }
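
A minimal sketch of pattern-action pairs, run from the shell. The sample records and the function name show_pairs are made up for illustration; they are not part of the course files.

```sh
#!/bin/sh
# Hypothetical sample input: a name and a score on each line.
show_pairs() {
    printf 'ann 91\nbob 47\ncat 78\n' |
    awk '
        $2 >= 50 { print $1, "passed" }   # condition plus action
        $2 <  50                          # condition only: the default action prints the line
        END      { print NR, "records" }  # special condition: after all input
    '
}
show_pairs
```

A rule with no action prints the matching line; a rule with no condition runs for every line.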

awk Syntax
    { [ statement ] ... }
    variable = expression
    print [ expression-list ] [ > expression ]
    printf format [ , expression-list ] [ > expression ]
    next
    exit

awk Syntax - more
    if ( conditional ) statement [ else statement ]
    while ( conditional ) statement
    for ( expression ; conditional ; expression ) statement
    for ( variable in array ) statement
    break
    continue

BEGIN … END
    BEGIN { do something before main body }
    condition { action: main body }
    END { do this after main body }

e.g., create a file called fields:

    #!/bin/awk -f
    BEGIN { FS = ":" }
    {
        print "Name:\t", $1
        print "Year:\t", $2, "\tMovie:", $3
    }
    END {
        print "Number of records:\t", NR
        print "Number of fields:\t", NF
    }

    $ chmod 755 fields
    $ fields moviedb
    (or ./fields moviedb if the current directory is not in your $PATH)
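
To see the order in which the three parts run, here is a small sketch that can be pasted into a shell. The two colon-separated records and the name demo_begin_end are invented for this example; the real moviedb file may look different.

```sh
#!/bin/sh
# Hypothetical colon-separated input; not the real moviedb file.
demo_begin_end() {
    printf 'Kim:1999:Heat\nLee:2004:Ray\n' |
    awk '
        BEGIN { FS = ":"; print "start" }   # runs once, before any input is read
              { print $1 "\t" $3 }          # runs once for every record
        END   { print "records:", NR }      # runs once, after the last record
    '
}
demo_begin_end
```

Setting FS inside BEGIN matters: it has to be in place before the first record is split into fields.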

Run as awk or bash script?

    #!/bin/sh
    awk '
    BEGIN { print "Using bash -f" }
    { print $8, "\t", $3 }
    END { print " -- completed --" }
    '

Or

    #!/bin/awk -f
    BEGIN { print "Using awk -f" }
    { print $8, "\t", $3 }
    END { print " -- completed --" }

Arithmetic Operators
    Operator   Type         Meaning
    +          Arithmetic   Addition
    -          Arithmetic   Subtraction
    *          Arithmetic   Multiplication
    /          Arithmetic   Division
    %          Arithmetic   Modulo
    <space>    String       Concatenation

Arithmetic Operators Examples
    Expression   Result
    8+5          13
    8-5          3
    8*5          40
    8/5          1.6
    8%5          3
    8 5          85

What's the output of: x = 2+1*3 8
Same as: (2+(1*3)) "8", which gives "58"
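
The table can be checked directly from the shell. This is only a sketch; the function name demo_operators is an invention for this example. It relies on concatenation binding less tightly than the arithmetic operators.

```sh
#!/bin/sh
demo_operators() {
    awk 'BEGIN {
        print 8 + 5        # addition
        print 8 / 5        # division is floating point, not integer
        print 8 % 5        # modulo
        print 8 5          # two values side by side are concatenated
        x = 2 + 1 * 3 8    # parsed as (2 + (1 * 3)) followed by 8
        print x
    }'
}
demo_operators
```

Because there is no visible concatenation operator, parentheses are a good habit whenever arithmetic and concatenation appear in the same expression.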

    #!/bin/awk -f
    #0 Put this code in a file called avg
    BEGIN { FS = "\t" }

    #1 Expect 1st record = number of students
    NR == 1 {
        print "Number of students: ", $1
        total = 0
        next
    }

    #2 Print each record and add its score to total
    {
        print $1, "\t", $2
        total += $2
    }

    # The first record is the count line, so it is excluded from the average
    END { if (NR > 1) print "Average = ", total/(NR-1) }

    $ cp ~tan/public/scores .
    $ chmod 755 avg
    $ avg scores
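
A self-contained sketch of the same idea, for trying out without the scores file. The input below is invented (a count line followed by tab-separated name and score), and demo_avg is a hypothetical name. It assumes, as the script above does, that the first record is not a score.

```sh
#!/bin/sh
# Hypothetical input: first line is the student count, then name<TAB>score.
demo_avg() {
    printf '3\nann\t90\nbob\t70\ncat\t80\n' |
    awk '
        BEGIN   { FS = "\t" }
        NR == 1 { next }                 # skip the count line
                { total += $2 }          # every other record: add the score
        END     { if (NR > 1) print "Average =", total / (NR - 1) }
    '
}
demo_avg
```

Dividing by NR - 1 rather than NR keeps the count line out of the average.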

    # File: match
    # Counts number of lines matching regex
    BEGIN { total = 0 }
    /^\..*:$/ {
        # line begins with a ".", followed by any number of chars,
        # and ends in a colon
        print "Found: ", $0    # $0 means whole line
        total += 1
    }
    END {
        print "\n-----------------"
        print "#Matches = ", total
    }

    $ chmod 755 match; cp ~tan/public/test .
    $ match test

Comparing Regex
    Operator   Meaning
    ~          Matches
    !~         Doesn't match

    # File: matchregex
    # Counts number of lines where the 1st field matches regex
    BEGIN { total = 0 }
    $1 ~ /^.[0-9]+.*/ {
        # 1st field: any one character, then one or more digits,
        # followed by any number of chars
        print "Found:\t", $0
        total += 1
    }
    END { print "#Matches = ", total }

    $ chmod 755 matchregex
    $ matchregex test
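
A short sketch of the two match operators side by side. The three input lines and the name demo_match are made up for illustration.

```sh
#!/bin/sh
# Hypothetical input: the first field is what gets tested.
demo_match() {
    printf 'a12 first\nbcd second\nx7 third\n' |
    awk '
        $1 ~  /^.[0-9]+/ { print "yes", $1 }   # field matches the regex
        $1 !~ /^.[0-9]+/ { print "no", $1 }    # field does not match
    '
}
demo_match
```

A bare /regex/ tests the whole line ($0); the ~ and !~ operators let a rule test a single field instead.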

Really Weird Syntax!!! Embedding awk in bash

    #!/bin/sh
    # Bash's arguments vs. awk's arguments
    # Find an acronym. File: lookup, DB file: acronyms (copy from ~tanjs/public)
    awk '$1 == find' find="$1" acronyms
    # or
    awk '$1 ~ find' find="$1" acronyms

Parameters passed to awk are specified after the awk script!!
Inside the quotes, $1 is awk's first field; outside the quotes, $1 is the shell script's first argument.

    $ chmod 755 lookup
    $ lookup GOT acronyms
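
The sketch below shows two ways of handing a shell value to awk: a var=value operand placed after the program, as on the slide, and the standard -v option. The function names and the two-line sample table are assumptions for this example, not the course's acronyms file.

```sh
#!/bin/sh
# Hypothetical tab-separated table: abbreviation<TAB>meaning.
sample_table() {
    printf 'AWK\tAho Weinberger Kernighan\nFS\tfield separator\n'
}

# var=value after the program: set when awk reaches that operand,
# so it is not yet visible inside BEGIN. "-" means standard input.
demo_lookup() {
    sample_table |
    awk 'BEGIN { FS = "\t" } $1 == find { print $2 }' find="$1" -
}

# -v var=value: set before the program starts, so BEGIN can see it too.
demo_lookup_v() {
    sample_table |
    awk -v find="$1" 'BEGIN { FS = "\t" } $1 == find { print $2 }'
}

demo_lookup FS
demo_lookup_v AWK
```

Quoting "$1" on the shell side keeps an argument containing spaces in one piece.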

For loop

    #!/bin/awk -f
    BEGIN {
        sum = 0
        for (i = 1; i <= 10; i++) {
            sum += i
            printf "The sum of integers up to %d is %d\n", i, sum
        }
        # now end
        exit
    }

Associative Arrays

    #!/bin/awk -f
    # filename: assArray
    BEGIN { FS = "\t" }
    { acro[$1] = $2 }
    END {
        for ( abbrev in acro )
            print abbrev, acro[abbrev]
    }

    $ assArray acronyms
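
Associative arrays are also handy for counting. A minimal sketch, with invented input and a hypothetical function name demo_assoc: each distinct word becomes a key, and its value is the number of times it was seen.

```sh
#!/bin/sh
# Hypothetical input: one word per line.
demo_assoc() {
    printf 'red\nblue\nred\ngreen\nred\nblue\n' |
    awk '
        { count[$1]++ }                 # key: the word, value: times seen
        END {
            for (word in count)         # iteration order is unspecified
                print word, count[word]
        }
    ' | sort                            # sort only to get a stable display order
}
demo_assoc
```

Keys are created on first use, so no declaration or size is needed.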

bash & awk combo

    #!/bin/sh
    # arraylookup: look for an abbreviation in a file using an associative array
    # Syntax: arraylookup <abbrev> <file>
    assArray "$2" | grep "$1"

    $ arraylookup GOT acronyms