Manifold learning and pattern matching with entropic graphs
- Slides: 34
Manifold learning and pattern matching with entropic graphs Alfred O. Hero Dept. EECS, Dept Biomed. Eng. , Dept. Statistics University of Michigan - Ann Arbor hero@eecs. umich. edu http: //www. eecs. umich. edu/~hero
Multimodality Face Matching
Clustering Gene Microarray Data Cy 5/Cy 3 hybridization profiles
Image Registration
Vehicle Classification • 128 x 128 images of three vehicles over 1 deg increments of 360 deg azimuth at 0 deg elevation Truck T 62 HMMV Courtesy of Center for Imaging Science, JHU • The 3(360)=1080 images evolve on a lower dimensional imbedded manifold in R^(16384)
Image Manifold
What is manifold learning good for? • Interpreting high dimensional data • Discovery and exploitation of lower dimensional structure • Deducing non-linear dependencies between populations • Improving detection and classification performance • Improving image compression performance
Random Sampling on a Manifold
Classifying on a Manifold Class A Class B
Background on Manifold Learning • Manifold intrinsic dimension estimation – – Local KLE, Fukunaga, Olsen (1971) Nearest neighbor algorithm, Pettis, Bailey, Jain, Dubes (1971) Fractal measures, Camastra and Vinciarelli (2002) Packing numbers, Kegl (2002) • Manifold Reconstruction – – Isomap-MDS, Tenenbaum, de Silva, Langford (2000) Locally Linear Embeddings (LLE), Roweiss, Saul (2000) Laplacian eigenmaps (LE), Belkin, Niyogi (2002) Hessian eigenmaps (HE), Grimes, Donoho (2003) • Characterization of sampling distributions on manifolds – Statistics of directional data, Watson (1956), Mardia (1972) – Statistics of shape, Kendall (1984), Kent, Mardia (2001) – Data compression on 3 D surfaces, Kolarov, Lynch (1997)
Sampling on a Domain Manifold 2 D manifold Embedding Sampling distribution Assumption: Sampling A statistical sample is a conformal mapping
Alpha-Entropy and Divergence • Alpha-entropy • Alpha-divergence • Other alpha-dissimilarity measures – Alpha-Jensen difference – Alpha geometric-arithmetic (GA) divergence
MST and Geodesic MST • For a set of points in ddimensional Euclidean space, the Euclidean MST with edge power weighting gamma is defined as • edge lengths of a spanning tree over • pairwise distance matrix of complete graph • When the matrix is constructed from geodesic distances between points on , e. g. using ISOMAP, we obtain the Geodesic MST
A Planar Sample and its Euclidean MST
Convergence of Euclidean MST Beardwood, Halton, Hammersley Theorem:
Key Result for GMST Ref: Costa&Hero: TSP 2003
Special Cases • Isometric embedding (ISOMAP) • Conformal embedding (C-ISOMAP)
Remarks • Result holds for many other combinatorial optimization algorithms (Costa&Hero: 2003) – – K-NNG Steiner trees Minimal matchings Traveling Salesman Tours • a. s. convergence rates (Hero&etal: 2002) • For isometric embeddings Jacobian does not have to be estimated for dimension estimation
Joint Estimation Algorithm • Assume large-n log-affine model • Use bootstrap resampling to estimate mean MST length and apply LS to jointly estimate slope and intercept from sequence • Extract d and H from slope and intercept
Random Samples on a Swiss Roll • Ref: Grimes and Donoho (2003)
Bootstrap Estimates of GMST Length
loglog. Linear Fit to GMST Length
Dimension and Entropy Estimates • From LS fit find: • Intrinsic dimension estimate • Alpha-entropy estimate (nats)
Dimension Estimation Comparisons
Practical Application • Yale face database 2 – Photographic folios of many people’s faces – Each face folio contains images at 585 different illumination/pose conditions – Subsampled to 64 by 64 pixels (4096 extrinsic dimensions) • Objective: determine intrinsic dimension and entropy of a face folio
GMST for 3 Face Folios
GMST for 3 Face Folios
Yale Face Database Results Ref: Costa&Hero 2003 • GMST LS estimation parameters – ISOMAP used to generate pairwise distance matrix – LS based on 25 resamplings over 26 largest folio sizes • To represent any folio we might hope to attain – factor > 600 reduction in degrees of freedom (dim) – only 1/10 bit per pixel for compression – a practical parameterization/encoder?
Conclusions Advantages of Geodesic Entropic Graph Methods • Characterizing high dimension sampling distributions – Standard techniques (histogram, density estimation) fail due to curse of dimensionality – Entropic graphs can be used to construct consistent estimators of entropy and information divergence – Robustification to outliers via pruning • Manifold learning and model reduction – Standard techniques (LLE, MDS, LE, HE) rely on local linear fits – Entropic graph methods fit the manifold globally – Computational complexity is only n log n
Summary of Algorithm • Run ISOMAP or C-ISOMAP algorithm to generate pairwise distance matrix on intrinsic domain of manifold • Build geodesic entropic graph from pairwise distance matrix – MST: consistent estimator of manifold dimension and process alpha-entropy – K-NNG: consistent estimator of information divergence between labeled vectors • Use bootstrap resampling and LS fitting to extract rate of convergence (intrinsic dimension) and convergence factor (entropy) of entropic graph
Swiss Roll Example Uniform Samples on 3 D Imbedding of Swiss Roll
Geodesic Minimal Spanning Tree GMST over Uniform Samples on Swiss Roll
Geodesic MST on Imbedded Mixture GMST on Gaussian Samples on Swiss Roll
Classifying on a Manifold Class A Class B
- State testing and testability tips
- Graphs that enlighten and graphs that deceive
- Speed and velocity
- End behavior chart
- Simple classification and manifold classification
- Template matching pattern recognition
- Image search reverse
- What is apatri
- Graph pattern matching algorithm
- Rabin karp algorithm pseudocode
- Pattern matching
- Subsequence pattern matching
- Cuadro comparativo e-learning b-learning m-learning
- Specular manifold sampling
- Steam tracing
- Testing the manifold hypothesis
- Collision manifold
- A typical map compares the vacuum in the intake manifold to
- A typical map compares the vacuum in the intake manifold to
- Manifold ug
- One dimensional manifold
- Manifold clustering
- Exhaust manifold design
- Genesys manifold system
- Manifold gis tutorial
- Hydraulic manifold design calculations
- Split manifold plug-in type
- Multiport fuel injection system
- William l hamilton
- Fingerprint minutiae types
- Pattern and pattern classes in image processing
- L
- Nfrequent
- Cm bishop pattern recognition and machine learning
- Inductive and analytical learning