Scaled Pattern Matching. Amihood Amir, Ayelet Butman, Moshe Lewenstein. Bar-Ilan University and Johns Hopkins University.
Motivation: Searching for Templates in Aerial Photographs. Input: aerial photo, template. Task: Search for all locations where the template appears in the image.
Model • Low level (pixel level): avoid costly processing. • Asymptotically efficient solutions. • Serial, exact algorithms.
Types of Approximations Local errors: Level of detail, Occlusion, Noise. Results: O(n² log m) mismatches; O(n²k²) edit distance, k errors, AL-88, rectangular patterns; O(n²k√(m log m)√(k log k)) edit distance, k errors, half rectangular patterns, AF-95.
Types of Approximation Orientation. Results: O(n²m⁵) FU-98; O(n²m³) ACL-98. Scaling: Natural scales, results: O(n) 1-d EV-88; O(n² log |Σ|) 2-d ALV-92; O(n²) dictionary AC-96. Real scales, this result: O(n) 1-d, truncation.
It seems daunting, but…
CPM 2003: Morelia, Mexico
Problem inherently inexact What if occurrence is 1½ times bigger? What is the meaning of “½ a pixel”? Solutions until now: Natural Scales Consider only discrete scales: 1, 2, 3, 4, 5, . . .
Definition: Text: n × n. Pattern: m × m. Find all occurrences of the pattern in the text in all discrete sizes.
Discrete exact Scaled Matching T AAAAAA AA AAAAACCAAAAAAAA AA AAAAAAAAACCAA AAACCCAAAAAAA AAACCCAAAACAA AAAAAA AA AAAAACCAC AAAAAA AA P AAA ACA AAA
Discrete exact Scaled Matching. P: ZUY / KVS / XET. P³ (P scaled to 3): rows ZZZUUUYYY / KKKVVVSSS / XXXEEETTT, each row repeated 3 times.
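The discrete scaling on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code; the function name `scale_2d` is hypothetical, and the pattern is assumed to be given as a list of equal-length strings.

```python
def scale_2d(pattern, s):
    """Scale a 2-d pattern by a natural number s.

    Each symbol becomes an s-by-s block: every row is widened by
    repeating each symbol s times, then the widened row is repeated s times.
    """
    scaled = []
    for row in pattern:
        wide = "".join(ch * s for ch in row)  # repeat each symbol s times
        scaled.extend([wide] * s)             # repeat the row s times
    return scaled


P = ["ZUY", "KVS", "XET"]
P3 = scale_2d(P, 3)
for row in P3:
    print(row)
```

The result has 9 rows of 9 symbols; an exact scaled occurrence of P at scale 3 is an exact occurrence of `P3` in the text.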
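Idea: Fix a scale s. Divide the n × n text into s × s squares (s-blocks), n/s per side. Constant amount of work for each square (s-block).
Algorithm time Time for scale s: O((n/s)²) = O(n²/s²), one unit of work per s-block. Total time: Σ_s n²/s² = n² · Σ_s 1/s², where Σ_s 1/s² converges to a constant, making the total time O(n²).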
Problem: Real scales. Was open even for strings… How do we define? Example: aabcccbb. Scaled to 2: aaaabbccccccbbbb. Scaled to 1½: aaabccccbbb (truncate the ½b and the ½c).
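The truncation rule in this example can be checked with a short sketch. It assumes that scaling a run of length r by α yields a run of length ⌊αr⌋, which is what the example shows; the name `scale_string` is hypothetical, and exact rationals are used to avoid floating-point rounding.

```python
import math
from fractions import Fraction
from itertools import groupby


def scale_string(pattern, alpha):
    """Scale every run of equal symbols by alpha, truncating fractions."""
    alpha = Fraction(alpha)
    out = []
    for sym, run in groupby(pattern):
        r = len(list(run))                       # run length in the pattern
        out.append(sym * math.floor(r * alpha))  # truncated scaled length
    return "".join(out)


print(scale_string("aabcccbb", 2))
print(scale_string("aabcccbb", Fraction(3, 2)))
```

With α = 3/2 the run lengths 2, 1, 3, 2 become 3, 1, 4, 3, matching the string on the slide.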
Formally: Denote a^r = aa…a (r times). Problem Definition 1: Input: Pattern P = a_1^(r_1) a_2^(r_2) … (written as runs) and Text T. Output: All text locations where a_1^⌊α·r_1⌋ a_2^⌊α·r_2⌋ … appears for some α ≥ 1.
Remark: α ≥ 1 means we only scale “up”. Reasons: Avoid the conceptual problem of loss of resolution. From “far enough” away everything looks the same. By our definition, for α < 1/m every run truncates to length 0, so there is a match at every text location.
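Simplify definition Definition 2: Look for the scaled pattern in the text so that it covers whole runs of the text, i.e. the text symbol before the occurrence differs from its first symbol and the text symbol after it differs from its last. Example: P=aabcccbbbb. Match by definition 2: daaabccccbbbbbbe. Match by definition 1 but not by def 2: daaaabccccbbbbbbbe.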
Why are definitions equivalent? Split text and pattern into a symbol part Ts, Ps and a length part TL, PL. Example: P=aabcccbbbb, Ps=abcb, PL=2134. T=daaabccccbbbbbbe, Ts=dabcbe, TL=131461.
Time for split: O(n+m). Finding Ps in Ts: O(n+m) (e.g. KMP). HARD PART: Finding PL in TL.
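The split into symbol part and length part is a run-length encoding. A minimal sketch follows; the name `split_runs` is hypothetical, and Python's built-in `str.find` stands in for KMP when locating Ps in Ts.

```python
from itertools import groupby


def split_runs(s):
    """Return (symbol part, length part) of the run-length encoding of s."""
    symbols, lengths = [], []
    for sym, run in groupby(s):
        symbols.append(sym)
        lengths.append(len(list(run)))
    return "".join(symbols), lengths


Ps, PL = split_runs("aabcccbbbb")        # "abcb", [2, 1, 3, 4]
Ts, TL = split_runs("daaabccccbbbbbbe")  # "dabcbe", [1, 3, 1, 4, 6, 1]

# Candidate alignment: where the symbol part of P occurs in that of T.
i = Ts.find(Ps)
print(i, TL[i:i + len(PL)])
```

Only the aligned lengths `TL[i:i+len(PL)]` then have to be compared against `PL`, which is the hard part the slide refers to.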
Definitions are Equivalent Claim: Solving def 2 in time O(f(n)) ⇒ solving def 1 in time O(f(n)). Why? - Find the pattern without its first and last runs, by def 2, in time O(f(n)). - For each match verify the 1st and last symbol in constant time in Ts and TL. Total time: O(f(n)+n)=O(f(n)).
Naïve algorithm for matching PL in TL: For each text location, position the pattern starting at that location and calculate the interval [t/p, (t+1)/p) for each resulting <text, pattern> pair. This is the interval of possible scales since: (t/p)·p = t, and for every α < t/p, ⌊αp⌋ < t; ((t+1)/p)·p = t+1, and for every α ≥ (t+1)/p, ⌊αp⌋ > t.
Check intersection: If the intersection of all intervals is not empty then there is a match. Time: O(nm). Example: PL: 2 1 2 3 2, TL: 2 4 2 4 7 4 5 3. At location 1 the first two intervals are [1, 3/2) and [4, 5). The intersection is empty, thus no scaled match at location 1. But…
Check intersection (cont.) Same example, PL: 2 1 2 3 2, TL: 2 4 2 4 7 4 5 3. At location 2 the intervals are [2, 5/2), [2, 3), [2, 5/2), [7/3, 8/3), [2, 5/2). The intersection is [7/3, 5/2), thus there is a scaled match at location 2.
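The naïve O(nm) check can be sketched as follows. The function names are hypothetical, locations are 0-indexed (the slides count from 1), and the text lengths 2 4 2 4 7 4 5 3 are an assumption: they are the values implied by the intervals listed in the example. The constraint α ≥ 1 is enforced by starting the lower bound at 1.

```python
from fractions import Fraction


def scale_interval(t_lens, p_lens, i):
    """Intersect [t/p, (t+1)/p) over all aligned pairs at location i.

    Returns (lo, hi) describing the scales alpha >= 1 with lo <= alpha < hi,
    or None if no such scale exists. Assumes p_lens is non-empty.
    """
    lo, hi = Fraction(1), None
    for t, p in zip(t_lens[i:i + len(p_lens)], p_lens):
        lo = max(lo, Fraction(t, p))
        upper = Fraction(t + 1, p)
        hi = upper if hi is None else min(hi, upper)
    return (lo, hi) if lo < hi else None


def naive_scaled_match(t_lens, p_lens):
    """All 0-indexed locations where p_lens has a real-scaled match."""
    last = len(t_lens) - len(p_lens)
    return [i for i in range(last + 1)
            if scale_interval(t_lens, p_lens, i) is not None]


PL = [2, 1, 2, 3, 2]
TL = [2, 4, 2, 4, 7, 4, 5, 3]
print(naive_scaled_match(TL, PL))
print(scale_interval(TL, PL, 1))
```

Exact rationals are used so that half-open interval endpoints such as 7/3 and 5/2 are compared without rounding error.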
Improvement – Parameterized Matching Introduced: Baker 1994. Motivation: “copying” code.
Parameterized Matching Input: two strings s and t, |s|=|t|, over alphabets Σs and Σt. s parameterize-matches t if there exists a bijection π: Σs → Σt such that π(s) = t. Example: π(a)=x, π(b)=y maps ababb to xyxyy.
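A parameterized match between two equal-length sequences can be tested by building the mapping in both directions. This is a minimal sketch with a hypothetical function name; the strings in the usage lines are illustrative.

```python
def p_match(s, t):
    """True if some bijection on symbols maps sequence s onto sequence t."""
    if len(s) != len(t):
        return False
    forward, backward = {}, {}
    for a, b in zip(s, t):
        # setdefault records the first pairing and returns the stored one,
        # so any later conflicting pairing is detected in either direction.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


print(p_match("ababb", "xyxyy"))
print(p_match("ab", "xx"))
```

Keeping both dictionaries is what makes the mapping a bijection: the second call fails because two different symbols would map to the same target. The function works on lists of integers as well, which is how it applies to PL and TL.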
Parameterized Matching Claim (AFM-94): For Σ that can be sorted in linear time (e.g. Σ={1, ..., n}), parameterized matching can be done in time O(n).
The reduction Lemma: There exists α ≥ 1 for which PL matches TL at location i scaled to α only if PL p-matches TL at i. Proof: Assume PL does not p-match TL at location i. The possible situations are:
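Possibility 1: PL has the same number b at two positions where TL has a and c ≠ a; w.l.o.g. c ≥ a+1. For c = a+1 (smallest possible): the intervals [a/b, (a+1)/b) and [(a+1)/b, (a+2)/b) do not intersect.
Possibility 2: TL has the same number a at two positions where PL has b and c ≠ b; w.l.o.g. c ≥ b+1. Intersection not empty only if (a+1)/(b+1) > a/b, i.e. ab+b > ab+a, i.e. b > a. But this can never happen if α ≥ 1.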
Algorithm for Real Scaled String Matching Let { P_i1, P_i2, ..., P_ij } be the different numbers in PL. 1. P-match PL in TL. 2. For each match, check the intersection of intervals between P_i1, ..., P_ij and the corresponding symbols in TL. End Algorithm
Example: PL = 2 3 2 Pi 1=2 Pi 2=3 p-matches TL = 5 6 5 6 10 7 scaled match
Important Fact: P_i1 + ... + P_ij ≤ m, and j different positive integers sum to at least j(j+1)/2. So there are at most O(√m) different P_ik's. Time: O(n) for parameterized matching (Σ={1, 2, …, n}). O(√m) verification for each location. Total: O(n√m).
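The two-step algorithm can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the p-match filter here is a naïve O(m) check per location rather than the O(n) parameterized matching of AFM-94, all function names are hypothetical, locations are 0-indexed, and the sample lengths are the ones implied by the intersection example earlier in the deck.

```python
from fractions import Fraction


def p_matches_at(t_lens, p_lens, i):
    """Naive check that p_lens p-matches t_lens at location i."""
    forward, backward = {}, {}
    for t, p in zip(t_lens[i:i + len(p_lens)], p_lens):
        if forward.setdefault(p, t) != t or backward.setdefault(t, p) != p:
            return False
    return True


def real_scaled_match(t_lens, p_lens):
    """Step 1: filter by p-match. Step 2: verify distinct values only."""
    # Offset of the first occurrence of each different number in p_lens.
    first = {}
    for k, p in enumerate(p_lens):
        first.setdefault(p, k)
    reps = list(first.items())  # (pattern value, offset) pairs, j of them

    hits = []
    for i in range(len(t_lens) - len(p_lens) + 1):
        if not p_matches_at(t_lens, p_lens, i):
            continue
        # Equal pattern values face equal text values in a p-match, so
        # one interval per different pattern value is enough.
        lo = max([Fraction(1)] + [Fraction(t_lens[i + k], p) for p, k in reps])
        hi = min(Fraction(t_lens[i + k] + 1, p) for p, k in reps)
        if lo < hi:
            hits.append(i)
    return hits


print(real_scaled_match([2, 4, 2, 4, 7, 4, 5, 3], [2, 1, 2, 3, 2]))
```

Verification touches only the j different numbers of PL, which is where the O(√m) per-location bound comes from.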
Tighter analysis: Upper bound the number of possible p-matches. Lemma: Let |P|=m, |T|=n, and let { P_i1, P_i2, ..., P_ij } be the different numbers in PL. Then there are at most 2n/j p-matches of PL in TL. Meaning: Since verification time is O(j) per p-match, the lemma implies that the total verification time is O((2n/j) · j) = O(n).
Proof of Lemma: Consider the 1st appearance of each of P_i1, ..., P_ij in PL. In a p-match, these are aligned with TL elements a_1, a_2, ..., a_j.
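Lemma's proof (cont.) Let x be the total number of p-matches in the text. In each p-match, the j text elements that match the 1st occurrences of the P_ik's are different positive integers, so they sum to at least j(j+1)/2 ≥ j²/2. The sum of all text elements that match 1st occurrences of P_ik's in the pattern, over all p-matches, is therefore ≥ (xj²)/2. But: There are overlaps! How many?
Lemma's proof (cont.) For each text location, at most j matches will count it. Therefore the total count without overlaps is ≥ (xj²/2)/j = x·j/2. Clearly x·j/2 ≤ n, thus x ≤ 2n/j.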
Open Problem: Give a 1-d algorithm linear in the run-length compressed text and pattern.