Searching with Wildcards and Advanced Searching with Regular Expressions (Regex)

AntConc Search Wildcards (not Regex)
• * = zero or more characters
• + = zero or one character
• ? = exactly one character
• | = or
• @ = zero or one word
• # = any one word
• You can also use context words through the Advanced Search option.
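The wildcard list above can be approximated with ordinary regular expressions. The sketch below is a hypothetical translator, not AntConc's own implementation: the name wildcard_to_regex, the decision to treat a "character" as \w and a "word" as \w+, and the sample sentence are all assumptions made for illustration. The @ wildcard is left out to keep the example short.

```python
import re

# Hypothetical mapping from the wildcards listed above to regex fragments.
# Assumption: "character" means a word character (\w), "word" means \w+.
WILDCARDS = {
    "*": r"\w*",  # zero or more characters
    "+": r"\w?",  # zero or one character
    "?": r"\w",   # exactly one character
    "|": "|",     # or
    "#": r"\w+",  # any one word
}


def wildcard_to_regex(query):
    """Translate a wildcard query into a whole-word regex (illustrative only)."""
    body = "".join(WILDCARDS.get(ch, re.escape(ch)) for ch in query)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


text = "The color of the sky and the colour of the sea are connected to light."
print(wildcard_to_regex("colo+r").findall(text))    # ['color', 'colour']
print(wildcard_to_regex("connect*").findall(text))  # ['connected']
```

On this sample, colo+r and color|colour return the same two hits.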

Practice 1 (Wildcards)
• Search for all occurrences of color, including the British spelling colour.
• Answers:
• colo+r
• color|colour

Practice 2 (Wildcards, Untagged Corpus)
• You want to know how we use the verb connect with the prepositions to and with.

Practice
• Find which prepositions come before years.
• You need to use regular expressions to answer this question.

Regular Expressions (Regex)
• Regex enables you to represent a string pattern using special symbols.
• You can use regex in AntConc, MonoConc, and WordSmith, and in many programming languages, such as Perl and Java.

• Regex helps us find patterns in a text.
• Regex helps us manipulate patterns in a text (with the help of a programming language).

Four Important Steps
• First, think of an example.
• Second, work out how that example would appear in the corpus.
• Third, convert the example into a regex.
• Fourth, refine your regex as problems come up.

Practice 1 (Untagged Corpus)
• You want to search for both color and colour.
• Answer: \bcolou?r
• \b = word boundary
• ? = the preceding character occurs zero or one time
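As a quick check of the pattern \bcolou?r, here is a minimal sketch using Python's standard re module. The sample sentence is invented for illustration, and the IGNORECASE flag is an assumption that mirrors a case-insensitive concordancer search.

```python
import re

# \b anchors the match at the start of a word; u? makes the "u" optional.
pattern = re.compile(r"\bcolou?r", re.IGNORECASE)

sample = "Color words: colour, colors, discolored, colouring."
print(pattern.findall(sample))
# ['Color', 'colour', 'color', 'colour']
```

Because the pattern has no trailing \b, it also hits the stems of colors and colouring, while discolored is skipped because no word boundary precedes its "c".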

Practice 2 (Untagged Corpus, Regex)
• Find all abbreviations in a text.
• Answer: \b\p{Lu}{2,}
• \p{Lu} = any uppercase letter
• {2,} = the preceding item occurs two or more times
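The abbreviation search can be tried outside a concordancer as well. Python's standard re module does not support \p{Lu}, so this sketch substitutes the ASCII class [A-Z]; that substitution and the sample sentence are assumptions made for illustration.

```python
import re

# ASCII stand-in for \b\p{Lu}{2,}: two or more capital letters at a word start.
abbreviation = re.compile(r"\b[A-Z]{2,}")

sample = "NASA and the EU met UNESCO staff in Washington, DC."
found = abbreviation.findall(sample)
print(found)  # ['NASA', 'EU', 'UNESCO', 'DC']
```

Washington is not reported because its second letter is lowercase.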

Practice 3 (Untagged Corpus, Regex)
• Find likely one-word linking adverbials in a corpus.
• Answer: \p{Lu}\p{Ll}+,
• \p{Ll} = any lowercase letter
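The word "likely" matters here: a capitalized word followed by a comma is only a candidate. The sketch below counts candidates in an invented sample, using [A-Z] and [a-z] as ASCII stand-ins for \p{Lu} and \p{Ll} (an assumption, since Python's standard re module lacks those classes).

```python
import re
from collections import Counter

# Capitalized word followed directly by a comma.
candidate = re.compile(r"[A-Z][a-z]+,")

sample = (
    "However, the results were mixed. Therefore, we repeated the test "
    "in Paris, where conditions differed. However, nothing changed."
)
counts = Counter(match.rstrip(",") for match in candidate.findall(sample))
print(dict(counts))  # {'However': 2, 'Therefore': 1, 'Paris': 1}
```

Paris shows up as a false positive, which is the kind of problem the fourth step (refining the regex) is meant to address.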

Practice 4 (Untagged Corpus, Regex)
• Now do the same thing, but restrict the adverbials to those that appear as the first word in a paragraph.
• For example (from NYTimes): Still, I’d hear whisperings about stuffing with other ingredients, like wild rice or ground meat. There was a whole other turkey-cavity universe out there. And this year, I felt ready to meet it.
• Answer: \n\W+\p{Lu}\p{Ll}+,
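The \n\W+ prefix requires a line break plus at least one more non-word character, so it fits corpora where paragraphs are separated by a blank line or start with indentation. The sketch below assumes blank-line separation; the sample text is invented apart from the opening words of the NYTimes sentence, and [A-Z] and [a-z] are ASCII stand-ins for \p{Lu} and \p{Ll}.

```python
import re

# Paragraphs are separated by a blank line (an assumption about the corpus).
text = (
    "There was a whole other universe out there. However, I waited.\n"
    "\n"
    "Still, I'd hear whisperings about stuffing.\n"
    "\n"
    "Meanwhile, the turkey rested."
)

# The capture group keeps only the adverbial, without the newline or comma.
paragraph_initial = re.compile(r"\n\W+([A-Z][a-z]+),")
print(paragraph_initial.findall(text))  # ['Still', 'Meanwhile']
```

However is skipped because it sits in the middle of a paragraph. A paragraph at the very start of a file would also be skipped, since no line break precedes it.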

Practice 5 (Untagged Corpus, Regex)
• Get the likely linking adverbials that occur at the beginning of a sentence and are composed of 2 to 4 words.
• Answer: \p{Lu}\p{Ll}+\s(\p{Ll}+\s){1,2},
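The slide pattern ends in \s followed by a comma, which suits text where the comma stands apart from the last word. The sketch below is a variant, not the slide's answer: it assumes the comma follows the last word directly, uses {1,3} so that the total is two to four words, and replaces \p{Lu} and \p{Ll} with the ASCII classes [A-Z] and [a-z]. The sample text is invented.

```python
import re

# One capitalized word, then one to three lowercase words, then a comma.
multiword = re.compile(r"\b[A-Z][a-z]+(?:\s[a-z]+){1,3},")

sample = (
    "In other words, the plan failed. On the other hand, costs fell. "
    "However, sales rose. As a result, we hired."
)
found = multiword.findall(sample)
print(found)
# ['In other words,', 'On the other hand,', 'As a result,']
```

The one-word adverbial However is left out because the pattern requires at least one following lowercase word before the comma.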

Practice 6 (Untagged Corpus, Regex)
• Search for the verb “write” with all of its inflections: write, writes, wrote, written, writing.
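The slides give no answer for this exercise, so the pattern below is only one possible solution, offered as a sketch: it factors out the shared stem "wr" and lists the endings as alternatives. The sample sentence is invented.

```python
import re

# Shared stem "wr" plus the five endings; \b on both sides keeps whole words.
write_forms = re.compile(r"\bwr(?:ites?|ote|itten|iting)\b", re.IGNORECASE)

sample = (
    "She writes daily; he wrote once. I have written, am writing, "
    "and will write. A writer rewrites."
)
found = write_forms.findall(sample)
print(found)  # ['writes', 'wrote', 'written', 'writing', 'write']
```

The word boundaries exclude writer and rewrites, which are not inflections of the verb as listed.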

Practice 7 (Tagged Corpus, Regex)
• Find all occurrences of “as” (subordinator).
1. As they watched, a flash of fire appeared.
• As_LLL they_LLL watched_LLL ,_,
• Answer: \bas_\p{Lu}+\s(\w+_\p{Lu}+\s){2,6},
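To see how the pattern behaves on word_TAG text, the sketch below runs it on two invented sentences. The Penn Treebank style tags (IN, PRP, VBD, and so on) are assumptions standing in for the LLL placeholders, [A-Z] replaces \p{Lu}, and [Aa] is used so that sentence-initial "As" matches without a case-insensitive flag.

```python
import re

# "as" + tag, then two to six tagged words, then a comma token.
as_clause = re.compile(r"\b[Aa]s_[A-Z]+\s(?:\w+_[A-Z]+\s){2,6},")

tagged = (
    "As_IN they_PRP watched_VBD ,_, a_DT flash_NN of_IN fire_NN "
    "appeared_VBD ._. He_PRP worked_VBD as_IN a_DT teacher_NN ._."
)
found = as_clause.findall(tagged)
print(found)  # ['As_IN they_PRP watched_VBD ,']
```

The second "as" is not reported because the words after it are followed by a full stop rather than a comma, so it does not look like a fronted clause.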

Practice 8 (Tagged Corpus)
• Find the prepositions that come before years.
• Answer: _IN\s\d\d
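To list the prepositions themselves rather than just locating them, a capture group can be placed in front of the _IN tag. This is a sketch on an invented tagged sentence; the tag names other than IN are assumptions about the tagset.

```python
import re
from collections import Counter

# Capture the word carrying the IN tag when two digits follow the space.
prep_before_year = re.compile(r"(\w+)_IN\s\d\d")

tagged = (
    "She_PRP arrived_VBD in_IN 1999_CD and_CC left_VBD after_IN 2004_CD ,_, "
    "but_CC returned_VBD in_IN 2010_CD with_IN friends_NNS ._."
)
counts = Counter(prep_before_year.findall(tagged))
print(dict(counts))  # {'in': 2, 'after': 1}
```

Since \d\d accepts any two digits, a phrase such as "in 25 minutes" would also match; tightening it to \d{4} is one possible refinement.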

Practice 9 (Tagged Corpus)
• Find all adjectives ending in –ly.

Practice 9 (Tagged Corpus, Regex)
• Find all adjectives ending in –ly and followed by a noun.
• Answer: \b\w*ly_JJ\b\s\w*_NN\w?\b
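A short check of the adjective-plus-noun pattern, written as \b\w*ly_JJ\b\s\w*_NN\w?\b, on an invented tagged sentence. The tags are assumed to follow the Penn Treebank convention, where JJ marks adjectives, NN and NNS nouns, and RB adverbs.

```python
import re

# Word ending in "ly" tagged JJ, then a word tagged NN, NNS, or NNP.
ly_adjective_noun = re.compile(r"\b\w*ly_JJ\b\s\w*_NN\w?\b")

tagged = (
    "The_DT friendly_JJ dog_NN followed_VBD the_DT lonely_JJ "
    "travellers_NNS quickly_RB ._."
)
found = ly_adjective_noun.findall(tagged)
print(found)  # ['friendly_JJ dog_NN', 'lonely_JJ travellers_NNS']
```

The adverb quickly is excluded by its RB tag, and the optional \w after _NN is what lets plural nouns (NNS) through.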