Advanced Algorithms (4995-2): Nearest Neighbor Search. Alex Andoni, 4/23/20
Find pairs of similar images. How should we measure similarity?
Measuring similarity. [Figure: images encoded as bit vectors (e.g. 000000 011100 010100 000100 011111) and as sets of points; distance label: Earth-Mover Distance. Image courtesy of Kristen Grauman.]
Nearest Neighbor Search (NNS)
Preamble: How to check for an exact match? Dictionary problem: hashing. NNS for 1-dimensional vectors? Preprocess: sort the points. Query: perform binary search. Measures: query time and space.
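The 1-dimensional case can be sketched in a few lines. This is a minimal illustration, not code from the slides: the function names `preprocess` and `nearest_1d` and the sample values are hypothetical. Sorting once and then binary searching gives a logarithmic number of comparisons per query with linear space.

```python
from bisect import bisect_left


def preprocess(points):
    """Preprocess: sort the points once."""
    return sorted(points)


def nearest_1d(sorted_points, q):
    """Query: binary search for q, then compare the two neighbors around it."""
    if not sorted_points:
        raise ValueError("empty point set")
    i = bisect_left(sorted_points, q)
    # Only the points just below and just above the insertion position
    # can be the nearest neighbor.
    candidates = sorted_points[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda p: abs(p - q))


# Hypothetical sample data.
points = preprocess([7.3, 1.1, 5.0, 3.5, 2.0])
print(nearest_1d(points, 3.0))
```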
High-dimensional case. Underprepared: no preprocessing. Overprepared: store an answer for every possible query. Table comparing query time and space for: no indexing; full indexing; best indexing?; a little better indexing?
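The "no indexing" extreme is a plain linear scan. A minimal sketch, assuming Euclidean distance and points stored as tuples (both are assumptions made here for illustration; `linear_scan` is a hypothetical name). Each query touches all n points and all d coordinates, so it costs on the order of n·d operations and needs no extra space beyond the data.

```python
import math


def linear_scan(points, q):
    """Exact nearest neighbor with no preprocessing: compare q to every point."""
    if not points:
        raise ValueError("empty point set")
    return min(points, key=lambda p: math.dist(p, q))


# Hypothetical sample data.
data = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
print(linear_scan(data, (0.9, 1.2)))
```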
Relaxed problem: Approximate Near Neighbor Search. [Figure labels: similar; not similar; kind of… either way.]
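The three figure labels can be read as three distance regimes. The sketch below assumes the standard formulation with a radius r and an approximation factor c > 1; these symbols and the function name `classify` are assumptions, since the slide's own notation did not survive. Points within r count as similar, points beyond c·r as not similar, and for points in between either answer is acceptable.

```python
import math


def classify(p, q, r, c):
    """Label p relative to query q under the assumed (c, r) formulation."""
    d = math.dist(p, q)
    if d <= r:
        return "similar"
    if d > c * r:
        return "not similar"
    return "either way"  # the gap between r and c*r


print(classify((0.0, 0.0), (1.5, 0.0), r=1.0, c=2.0))
```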
Approach: Locality Sensitive Hashing [Indyk-Motwani '98]. Randomized; several hash tables. How to construct good maps?
Locality sensitive hash functions
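One concrete locality sensitive family for bit vectors is coordinate (bit) sampling: hash a vector to the values of a few randomly chosen coordinates, so vectors that differ in few positions are likely to agree. A minimal sketch, assuming Hamming space and bit strings as input; `make_bit_sampling_hash`, `dim`, and `k` are hypothetical names.

```python
import random


def make_bit_sampling_hash(dim, k, rng):
    """Return g(x) = (x[i1], ..., x[ik]) for k coordinates drawn at random."""
    coords = [rng.randrange(dim) for _ in range(k)]

    def g(x):
        return tuple(x[i] for i in coords)

    return g


rng = random.Random(0)  # fixed seed so the sketch is reproducible
g = make_bit_sampling_hash(dim=6, k=3, rng=rng)
print(g("011100"), g("010100"))
```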
Full algorithm
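A minimal sketch of how the pieces might fit together, assuming Hamming space and bit-sampling hashes: build several hash tables, each keyed by k sampled coordinates, insert every point into every table, and at query time inspect only the buckets the query falls into. The class name `HammingLSH` and its parameters are hypothetical, and this is an illustration rather than the algorithm exactly as presented on the slide.

```python
import random
from collections import defaultdict


def hamming(x, y):
    """Number of positions where x and y differ."""
    return sum(a != b for a, b in zip(x, y))


class HammingLSH:
    def __init__(self, dim, k, num_tables, seed=0):
        rng = random.Random(seed)
        # One independent set of k sampled coordinates per table.
        self.coords = [[rng.randrange(dim) for _ in range(k)]
                       for _ in range(num_tables)]
        self.tables = [defaultdict(list) for _ in range(num_tables)]

    def _key(self, t, x):
        return tuple(x[i] for i in self.coords[t])

    def insert(self, x):
        for t, table in enumerate(self.tables):
            table[self._key(t, x)].append(x)

    def query(self, q):
        """Return the closest point among colliding candidates, or None."""
        best = None
        for t, table in enumerate(self.tables):
            for x in table.get(self._key(t, q), ()):
                if best is None or hamming(x, q) < hamming(best, q):
                    best = x
        return best


index = HammingLSH(dim=6, k=2, num_tables=4)
for point in ["000000", "011100", "010100", "000100", "011111"]:
    index.insert(point)
print(index.query("011100"))
```

A stored point always collides with itself in every table, so querying with an exact copy returns that point; for other queries the result depends on the random coordinates and may be `None`.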
Analysis of LSH Scheme. [Plot: collision probability as a function of distance.]
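To make the collision-probability curve concrete, here is a sketch for the bit-sampling family over Hamming space (an assumption; the slide's own hash family is not recoverable). One sampled coordinate agrees with probability 1 − dist/dim, and k independent coordinates all agree with that value raised to the k-th power, so the curve falls as distance grows and sharpens as k increases.

```python
def collision_probability(dist, dim, k=1):
    """Pr[g(x) = g(y)] for k sampled coordinates when x, y differ in dist of dim bits."""
    return (1 - dist / dim) ** k


for dist in range(0, 7, 2):
    print(dist, collision_probability(dist, dim=6, k=2))
```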
Analysis: Correctness
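A sketch of the usual correctness calculation, under assumed notation: if a near point collides with the query under a single hash function with probability at least p1, then it collides in one table (k concatenated functions) with probability at least p1^k, and it is missed by all tables with probability at most (1 − p1^k) raised to the number of tables. The names `p1`, `k`, and `num_tables` are assumptions.

```python
def success_probability(p1, k, num_tables):
    """Lower bound on the chance that a near point shares a bucket with the query in some table."""
    per_table = p1 ** k
    return 1 - (1 - per_table) ** num_tables


print(success_probability(p1=0.9, k=10, num_tables=5))
```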
Analysis: Runtime
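The runtime analysis usually comes down to a parameter choice. The sketch below uses the standard setting from the LSH literature; the slide's own formulas did not survive, so the symbols are assumptions: p1 and p2 are the single-function collision probabilities for near and far points, k is chosen so that a far point collides in a table with probability about 1/n, and the number of tables is about n^ρ with ρ = log(1/p1) / log(1/p2).

```python
import math


def lsh_parameters(n, p1, p2):
    """Return (rho, k, num_tables) for n points under the assumed standard setting."""
    rho = math.log(1 / p1) / math.log(1 / p2)
    # k concatenated hashes drive the far-point collision probability to about 1/n.
    k = math.ceil(math.log(n) / math.log(1 / p2))
    # Enough tables that a near point collides in at least one with constant probability.
    num_tables = math.ceil(1 / p1 ** k)
    return rho, k, num_tables


print(lsh_parameters(n=1000, p1=0.5, p2=0.25))
```

With ρ < 1 the number of tables, and hence the number of buckets probed per query, grows sublinearly in n.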
LSH maps (Euclidean space) [Datar-Indyk-Immorlica-Mirrokni '04, A.-Indyk '06]. Table columns: Space, Time, Exponent. Even better LSH maps? NO: example of isoperimetry [Motwani-Naor-Panigrahy '06, O'Donnell-Wu-Zhou '11].
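For Euclidean space, the scheme of Datar et al. projects a point onto a random Gaussian direction, adds a random shift, and cuts the line into segments of width w. A minimal sketch of that map (the A.-Indyk '06 construction is different and not shown); `make_euclidean_hash`, `w`, and the sample vectors are hypothetical choices for illustration.

```python
import math
import random


def make_euclidean_hash(dim, w, rng):
    """Return h(x) = floor((a . x + b) / w) with Gaussian a and uniform shift b."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, w)

    def h(x):
        projection = sum(ai * xi for ai, xi in zip(a, x))
        return math.floor((projection + b) / w)

    return h


rng = random.Random(1)  # fixed seed so the sketch is reproducible
h = make_euclidean_hash(dim=3, w=4.0, rng=rng)
print(h((1.0, 2.0, 3.0)), h((1.1, 2.0, 2.9)))
```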
Some other LSH algorithms. Example texts: "To be or not to be" and "To search or not to search", over the vocabulary (be, not, or, search, to). As sets: {be, not, or, to} and {not, or, to, search}. As binary vectors: …11101… and …01111…. As count vectors: …21102… and …01122…. Figure label: sketch.
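The three representations of the example texts can be reproduced directly. A minimal sketch; the function name `representations` and the fixed vocabulary order are assumptions made for illustration.

```python
from collections import Counter


def representations(text, vocabulary):
    """Return the set, binary-vector, and count-vector views of a text."""
    words = text.lower().split()
    counts = Counter(words)
    as_set = set(words)
    binary = [1 if w in as_set else 0 for w in vocabulary]
    count_vec = [counts[w] for w in vocabulary]
    return as_set, binary, count_vec


vocab = ["be", "not", "or", "search", "to"]
print(representations("To be or not to be", vocab))
print(representations("To search or not to search", vocab))
```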
A data-dependent LSH [A.-Razenshteyn-Shekel-Nosatzki '17]. [Indyk-Motwani '98]: [Figure labels: coor 3, coor 2, coor 5, coor 7.]
NNS for other distances? Earth Mover Distance (Wasserstein): given two sets A, B of points in a metric space, EMD(A, B) = min-cost bipartite matching between A and B. Applications in image vision.
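For tiny inputs the definition can be evaluated directly by trying every matching. The sketch below assumes equal-size sets, Euclidean ground distance, and total (unnormalized) matching cost; all three are assumptions, and `emd_bruteforce` is a hypothetical name. It takes factorial time, so it only illustrates the definition; a polynomial-time matching algorithm would be used in practice.

```python
import math
from itertools import permutations


def emd_bruteforce(A, B):
    """Min-cost perfect matching between A and B by exhaustive search."""
    if len(A) != len(B):
        raise ValueError("this sketch assumes |A| == |B|")
    return min(
        sum(math.dist(a, b) for a, b in zip(A, perm))
        for perm in permutations(B)
    )


A = [(0.0, 0.0), (0.0, 1.0)]
B = [(1.0, 0.0), (1.0, 1.0)]
print(emd_bruteforce(A, B))
```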
Tool: metric embeddings
Table (columns: Metric, Upper bound). Rows: edit distance, e.g. edit(banana, ananas) = 2; Ulam (edit distance between permutations), e.g. edit(1234567, 7123456) = 2; Block edit distance.
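The two edit-distance examples can be checked with the textbook dynamic program. A minimal sketch, assuming unit-cost insertions, deletions, and substitutions; `edit_distance` is a hypothetical name.

```python
def edit_distance(s, t):
    """Minimum number of insertions, deletions, and substitutions turning s into t."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        cur = [i]
        for j, ct in enumerate(t, start=1):
            cur.append(min(
                prev[j] + 1,               # delete cs
                cur[j - 1] + 1,            # insert ct
                prev[j - 1] + (cs != ct),  # match or substitute
            ))
        prev = cur
    return prev[-1]


print(edit_distance("banana", "ananas"))
print(edit_distance("1234567", "7123456"))
```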
Table (columns: Metric, Upper bound, Lower bounds). Rows: Ulam (edit distance between permutations); Block edit distance: 4/3 [Cor 03].