Manifold learning and pattern matching with entropic graphs

Alfred O. Hero
Dept. EECS, Dept. Biomed. Eng., Dept. Statistics
University of Michigan - Ann Arbor
hero@eecs.umich.edu
http://www.eecs.umich.edu/~hero

Multimodality Face Matching

Clustering Gene Microarray Data
Cy5/Cy3 hybridization profiles

Image Registration

Vehicle Classification
• 128 x 128 images of three vehicles over 1 deg increments of 360 deg azimuth at 0 deg elevation
  – Vehicles: Truck, T62, HMMV
  – Courtesy of Center for Imaging Science, JHU
• The 3(360) = 1080 images evolve on a lower dimensional imbedded manifold in R^(16384)

Image Manifold

What is manifold learning good for?
• Interpreting high dimensional data
• Discovery and exploitation of lower dimensional structure
• Deducing non-linear dependencies between populations
• Improving detection and classification performance
• Improving image compression performance

Random Sampling on a Manifold

Classifying on a Manifold
[Figure labels: Class A, Class B]

Background on Manifold Learning
• Manifold intrinsic dimension estimation
  – Local KLE, Fukunaga, Olsen (1971)
  – Nearest neighbor algorithm, Pettis, Bailey, Jain, Dubes (1979)
  – Fractal measures, Camastra and Vinciarelli (2002)
  – Packing numbers, Kegl (2002)
• Manifold Reconstruction
  – Isomap-MDS, Tenenbaum, de Silva, Langford (2000)
  – Locally Linear Embeddings (LLE), Roweis, Saul (2000)
  – Laplacian eigenmaps (LE), Belkin, Niyogi (2002)
  – Hessian eigenmaps (HE), Grimes, Donoho (2003)
• Characterization of sampling distributions on manifolds
  – Statistics of directional data, Watson (1956), Mardia (1972)
  – Statistics of shape, Kendall (1984), Kent, Mardia (2001)
  – Data compression on 3D surfaces, Kolarov, Lynch (1997)

Sampling on a Domain Manifold
[Figure labels: 2D manifold, Embedding, Sampling distribution, Sampling, A statistical sample]
• Assumption: the embedding is a conformal mapping

Alpha-Entropy and Divergence
• Alpha-entropy
• Alpha-divergence
• Other alpha-dissimilarity measures
  – Alpha-Jensen difference
  – Alpha geometric-arithmetic (GA) divergence
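
The formulas for these quantities did not survive on this slide. As a hedged reconstruction, the standard Rényi-type forms are given below for densities f and g, order alpha in (0, 1), and mixing weights p and q = 1 - p. The notation is an assumption and may differ from what the original slide showed.

```latex
\begin{align*}
H_\alpha(f) &= \frac{1}{1-\alpha} \log \int f^{\alpha}(x)\, dx \\
D_\alpha(f \,\|\, g) &= \frac{1}{\alpha-1} \log \int f^{\alpha}(x)\, g^{1-\alpha}(x)\, dx \\
\Delta H_\alpha(p; f, g) &= H_\alpha(p f + q g) - \bigl[\, p\, H_\alpha(f) + q\, H_\alpha(g) \,\bigr] \\
D^{GA}_\alpha(f, g) &= \frac{1}{\alpha-1} \log \int \bigl[\, p f(x) + q g(x) \,\bigr]^{\alpha} \bigl[\, f^{p}(x)\, g^{q}(x) \,\bigr]^{1-\alpha} dx
\end{align*}
```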

MST and Geodesic MST
• For a set of points in d-dimensional Euclidean space, the Euclidean MST with edge power weighting gamma is defined as the spanning tree that minimizes the sum of edge lengths raised to the power gamma, where
  – the edge lengths are those of a spanning tree over the points
  – the lengths are taken from the pairwise distance matrix of the complete graph
• When the matrix is constructed from geodesic distances between points on the manifold, e.g. using ISOMAP, we obtain the Geodesic MST
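
The power-weighted MST length described above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the function name `mst_length` and the three-point example are hypothetical, and SciPy's `minimum_spanning_tree` is assumed as the MST solver. Passing a geodesic distance matrix instead of a Euclidean one would give the GMST length.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def mst_length(dist_matrix, gamma=1.0):
    """Sum of edge lengths raised to the power gamma over the MST.

    dist_matrix is the pairwise distance matrix of the complete graph.
    Zero entries are treated as missing edges, so points must be distinct.
    """
    # For gamma > 0 the map x -> x**gamma is increasing, so the tree that
    # minimizes total edge length also minimizes the power-weighted total.
    tree = minimum_spanning_tree(np.asarray(dist_matrix, dtype=float))
    return float(np.sum(tree.data ** gamma))


# Hypothetical example: three collinear points at x = 0, 1, 3.
points = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
euclidean = squareform(pdist(points))
print(mst_length(euclidean, gamma=1.0))
print(mst_length(euclidean, gamma=2.0))
```

In this example the MST uses the edges of length 1 and 2 and skips the edge of length 3.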

A Planar Sample and its Euclidean MST

Convergence of Euclidean MST
Beardwood, Halton, Hammersley Theorem
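
The statement of the theorem is missing from this slide. A hedged reconstruction of its usual form follows, for i.i.d. samples X_1, ..., X_n in R^d with density f, d >= 2, and 0 < gamma < d. The symbols L_gamma (power-weighted MST length) and beta_{d,gamma} (a constant that does not depend on f) are assumed notation.

```latex
\[
\lim_{n \to \infty} \frac{L_\gamma(X_1, \ldots, X_n)}{n^{(d-\gamma)/d}}
  = \beta_{d,\gamma} \int f^{(d-\gamma)/d}(x)\, dx
  \qquad \text{(a.s.)}
\]
% With \alpha = (d-\gamma)/d, the right-hand side equals
% \beta_{d,\gamma} \exp\bigl( (1-\alpha)\, H_\alpha(f) \bigr),
% which links the MST length to the alpha-entropy of f.
```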

Key Result for GMST
Ref: Costa & Hero, TSP 2003

Special Cases
• Isometric embedding (ISOMAP)
• Conformal embedding (C-ISOMAP)

Remarks
• Result holds for many other combinatorial optimization algorithms (Costa & Hero, 2003)
  – K-NNG
  – Steiner trees
  – Minimal matchings
  – Traveling Salesman Tours
• a.s. convergence rates (Hero et al., 2002)
• For isometric embeddings, the Jacobian does not have to be estimated for dimension estimation

Joint Estimation Algorithm
• Assume large-n log-affine model
• Use bootstrap resampling to estimate mean MST length and apply LS to jointly estimate slope and intercept from the resulting sequence
• Extract d and H from slope and intercept
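
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure from the talk: the names `fit_log_affine` and `mst_length` are hypothetical, each resample is taken as n distinct points drawn without replacement, and the test data are uniform points in the unit square with Euclidean distances (a geodesic distance matrix could be passed instead). Only the dimension is extracted, because recovering the entropy from the intercept also needs a constant that is not given here.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def mst_length(dist_matrix, gamma=1.0):
    """Power-weighted length of the MST of a complete graph."""
    tree = minimum_spanning_tree(dist_matrix)
    return float(np.sum(tree.data ** gamma))


def fit_log_affine(dist_matrix, sizes, n_resamples=8, gamma=1.0, seed=0):
    """Least-squares fit of log(mean MST length) against log(n).

    Returns (slope, intercept, dimension), with dimension = gamma / (1 - slope),
    which follows from assuming slope = (d - gamma) / d.
    """
    rng = np.random.default_rng(seed)
    total = dist_matrix.shape[0]
    mean_lengths = []
    for n in sizes:
        lengths = []
        for _ in range(n_resamples):
            # Assumption: resample n distinct points (no replacement), so that
            # no zero distances appear in the sub-matrix.
            idx = rng.choice(total, size=n, replace=False)
            lengths.append(mst_length(dist_matrix[np.ix_(idx, idx)], gamma))
        mean_lengths.append(np.mean(lengths))
    slope, intercept = np.polyfit(np.log(sizes), np.log(mean_lengths), 1)
    dimension = gamma / (1.0 - slope)
    return slope, intercept, dimension


# Hypothetical test data: 800 uniform points in the unit square.
samples = np.random.default_rng(1).random((800, 2))
distances = squareform(pdist(samples))
slope, intercept, dimension = fit_log_affine(distances, [100, 200, 400, 800])
print(slope, intercept, dimension)
```

Rounding `dimension` to the nearest integer gives the dimension estimate; for planar data it should land near 2.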

Random Samples on a Swiss Roll
• Ref: Grimes and Donoho (2003)

Bootstrap Estimates of GMST Length

Log-Log Linear Fit to GMST Length

Dimension and Entropy Estimates
• From LS fit find:
  – Intrinsic dimension estimate
  – Alpha-entropy estimate (nats)
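
The numerical estimates are missing from this slide. As a hedged sketch of how a slope and intercept map to the two estimates, assume the log-affine model below, where a-hat and b-hat are the fitted slope and intercept and beta_{d,gamma} is the constant from the convergence theorem. The symbols are assumed notation, and the constant has to be known or approximated separately.

```latex
\begin{align*}
\log L_n &\approx a \log n + b,
  \qquad a = \frac{d-\gamma}{d},
  \qquad b = \log \beta_{d,\gamma} + \frac{\gamma}{d}\, H_\alpha \\
\hat{d} &= \operatorname{round}\!\left( \frac{\gamma}{1-\hat{a}} \right) \\
\hat{H}_\alpha &= \frac{\hat{d}}{\gamma} \left( \hat{b} - \log \beta_{\hat{d},\gamma} \right)
\end{align*}
```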

Dimension Estimation Comparisons

Practical Application
• Yale face database 2
  – Photographic folios of many people’s faces
  – Each face folio contains images at 585 different illumination/pose conditions
  – Subsampled to 64 by 64 pixels (4096 extrinsic dimensions)
• Objective: determine intrinsic dimension and entropy of a face folio

GMST for 3 Face Folios

Yale Face Database Results
Ref: Costa & Hero, 2003
• GMST LS estimation parameters
  – ISOMAP used to generate pairwise distance matrix
  – LS based on 25 resamplings over 26 largest folio sizes
• To represent any folio we might hope to attain
  – factor > 600 reduction in degrees of freedom (dim)
  – only 1/10 bit per pixel for compression
  – a practical parameterization/encoder?

Conclusions
Advantages of Geodesic Entropic Graph Methods
• Characterizing high dimension sampling distributions
  – Standard techniques (histogram, density estimation) fail due to curse of dimensionality
  – Entropic graphs can be used to construct consistent estimators of entropy and information divergence
  – Robustification to outliers via pruning
• Manifold learning and model reduction
  – Standard techniques (LLE, MDS, LE, HE) rely on local linear fits
  – Entropic graph methods fit the manifold globally
  – Computational complexity is only O(n log n)

Summary of Algorithm
• Run ISOMAP or C-ISOMAP algorithm to generate pairwise distance matrix on intrinsic domain of manifold
• Build geodesic entropic graph from pairwise distance matrix
  – MST: consistent estimator of manifold dimension and process alpha-entropy
  – K-NNG: consistent estimator of information divergence between labeled vectors
• Use bootstrap resampling and LS fitting to extract rate of convergence (intrinsic dimension) and convergence factor (entropy) of entropic graph
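
The first step of the summary, building a geodesic pairwise distance matrix, can be approximated in the ISOMAP style by shortest paths over a k-nearest-neighbor graph. The sketch below is an assumption-laden illustration, not the ISOMAP or C-ISOMAP code used in the talk: `geodesic_distances` is a hypothetical name, the neighborhood size k is a free parameter, and the neighbor graph is assumed to be connected.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path
from scipy.spatial import cKDTree


def geodesic_distances(points, k=5):
    """Approximate geodesic distances by shortest paths on a k-NN graph."""
    n = points.shape[0]
    # Query k + 1 neighbors because each point is returned as its own
    # nearest neighbor at distance zero.
    dist, idx = cKDTree(points).query(points, k=k + 1)
    rows = np.repeat(np.arange(n), k)
    graph = csr_matrix(
        (dist[:, 1:].ravel(), (rows, idx[:, 1:].ravel())), shape=(n, n)
    )
    # directed=False lets a path use an edge in either direction, which
    # symmetrizes the k-NN graph.
    return shortest_path(graph, method="D", directed=False)


# Hypothetical example: 200 evenly spaced points on a unit semicircle,
# a 1D manifold embedded in the plane.
angles = np.linspace(0.0, np.pi, 200)
semicircle = np.column_stack([np.cos(angles), np.sin(angles)])
geo = geodesic_distances(semicircle, k=2)
print(geo[0, -1], np.linalg.norm(semicircle[0] - semicircle[-1]))
```

The two printed values compare the approximate geodesic distance between the endpoints (close to the arc length pi) with their Euclidean distance (2), which is the gap the GMST exploits on curved manifolds.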

Swiss Roll Example
Uniform Samples on 3D Imbedding of Swiss Roll

Geodesic Minimal Spanning Tree
GMST over Uniform Samples on Swiss Roll

Geodesic MST on Imbedded Mixture
GMST on Gaussian Samples on Swiss Roll

Classifying on a Manifold
[Figure labels: Class A, Class B]