What is String Matching?
• Used by the "find" feature in documents, by spell checkers, and by internet keyword searches
• The basic problem: look for an exact match of a string
• Real search tools are more complicated: searching for ‘string’ also returns ‘String’ and ‘stringbean’
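The ‘string’ / ‘String’ / ‘stringbean’ behaviour can be sketched as a case-insensitive substring test. This is only an illustration; the function name `find_keyword` and the sample word list are made up here, and real search tools do considerably more.

```python
def find_keyword(words, keyword):
    """Return the words that contain keyword, ignoring letter case."""
    needle = keyword.casefold()
    # Substring test, so 'stringbean' matches as well as 'String'.
    return [word for word in words if needle in word.casefold()]


if __name__ == "__main__":
    sample = ["String", "stringbean", "strong", "shoestring"]
    print(find_keyword(sample, "string"))
```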
How do you match strings?
• Finite-state automata
• Brute force
• Knuth-Morris-Pratt (KMP)
• Visualization tool for brute force and KMP: www.dcc.ufmg.br/~cassia/smaa/english/
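A minimal sketch of two of the approaches listed, brute force and KMP, may help make the difference concrete. The function names are chosen for this example only. Brute force re-compares the pattern at every shift; KMP precomputes a failure table so that it never moves backwards in the text.

```python
def brute_force_search(text, pattern):
    """Return every index where pattern occurs in text, trying each shift."""
    n, m = len(text), len(pattern)
    matches = []
    for shift in range(n - m + 1):
        if text[shift:shift + m] == pattern:
            matches.append(shift)
    return matches


def build_failure_table(pattern):
    """failure[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    failure = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_search(text, pattern):
    """Return every index where pattern occurs in text, using KMP."""
    if not pattern:
        # Same convention as the brute-force version: empty pattern matches everywhere.
        return list(range(len(text) + 1))
    failure = build_failure_table(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]  # fall back without re-reading the text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]  # keep going to find overlapping matches
    return matches


if __name__ == "__main__":
    print(brute_force_search("CAGATAAGAGAA", "GA"))
    print(kmp_search("CAGATAAGAGAA", "GA"))
```

Both functions return the same list of positions; they differ only in how much work they repeat after a mismatch.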
Virus Detection
• Detecting a virus amounts to searching for a pattern string in a larger text.
1) Viral signature matching (match the contagious segment)
2) Code enumeration (compare against an old, known file)
3) Checksum methods (e.g., check the size of the file)
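Signature matching and a checksum comparison can be sketched as follows. Everything specific here is an assumption for illustration: the signature names and byte patterns are invented, and SHA-256 is just one possible checksum. Real scanners use far larger databases and multi-pattern search algorithms.

```python
import hashlib

# Hypothetical signature database: name -> byte pattern (made up for this example).
SIGNATURES = {
    "demo-virus-a": b"\xde\xad\xbe\xef",
    "demo-virus-b": b"EVIL_PAYLOAD",
}


def scan_for_signatures(data, signatures=SIGNATURES):
    """Return {signature name: offset of first occurrence} for each signature found."""
    hits = {}
    for name, pattern in signatures.items():
        offset = data.find(pattern)  # exact substring search over bytes
        if offset != -1:
            hits[name] = offset
    return hits


def checksum_changed(data, known_sha256_hex):
    """Return True if data no longer matches the checksum of the known-good file."""
    return hashlib.sha256(data).hexdigest() != known_sha256_hex


if __name__ == "__main__":
    clean = b"ordinary program bytes"
    infected = clean + b"EVIL_PAYLOAD"
    baseline = hashlib.sha256(clean).hexdigest()
    print(scan_for_signatures(infected))
    print(checksum_changed(infected, baseline))
```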
Variation-tolerant matching
• Fast substring matching
• Approximate string matching
  – voice recognition
  – DNA sequencing
Example: x = GATAA and y = CAGATAAGAGAA and k = 1
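One way to read this example is as matching with at most k mismatches: find every position in y where x lines up with no more than k differing characters. That interpretation is an assumption; if k instead bounds insertions and deletions as well (edit distance), a different algorithm is needed. A brute-force sketch under the mismatch reading, with a function name chosen for this example:

```python
def k_mismatch_search(text, pattern, k):
    """Return (position, mismatches) for every alignment of pattern in text
    that differs in at most k characters."""
    n, m = len(text), len(pattern)
    matches = []
    for shift in range(n - m + 1):
        mismatches = 0
        for a, b in zip(text[shift:shift + m], pattern):
            if a != b:
                mismatches += 1
                if mismatches > k:
                    break  # too many differences, abandon this alignment
        if mismatches <= k:
            matches.append((shift, mismatches))
    return matches


if __name__ == "__main__":
    x = "GATAA"
    y = "CAGATAAGAGAA"
    print(k_mismatch_search(y, x, 1))  # [(2, 0), (7, 1)]
```

With k = 1, x matches y exactly at position 2 (GATAA) and with one mismatch at position 7 (GAGAA), counting positions from 0.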
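Summary
• Exact string matching is well suited to tools such as grep and sed
• String matching is used in document word find and in internet keyword searches
• The KMP algorithm improves on brute force: O(n + m) worst-case time versus O(nm) for a text of length n and a pattern of length m, though the gap is often small on typical text
• Approximate string matching and fast substring matching extend to a wider range of practical applications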