- Slides: 40
Dependence • Dependence = NOT Independent • Only 1 way to be independent VS Massachusetts Institute of Technology
Estimating Dependency and Significance for High-Dimensional Data Michael R. Siracusa*, Kinh Tieu*, Alexander T. Ihler§, John W. Fisher*§, Alan S. Willsky§ * Computer Science and Artificial Intelligence Laboratory § Laboratory for Information and Decision Systems Massachusetts Institute of Technology
Applications
Problem Statement Given N i.i.d. observations of K sources, determine whether the K sources are independent by 1. calculating some dependency measure and 2. estimating the significance of this measurement.
Hypothesis Test • Two hypotheses: • Assuming we know the distributions: • Given N i.i.d. observations:
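The equations on this slide did not survive extraction. As a hedged sketch only, an independence test of this kind is commonly written as a likelihood ratio test between a factorized model and a joint model. The symbols $H_0$, $H_1$, $x_k^{(i)}$ (the $i$-th observation of source $k$), and the threshold $\gamma$ are notation assumed here, not taken from the slide.

```latex
\begin{align*}
H_0 &: (x_1,\dots,x_K) \sim \prod_{k=1}^{K} p(x_k) && \text{(independent)} \\
H_1 &: (x_1,\dots,x_K) \sim p(x_1,\dots,x_K)       && \text{(dependent)} \\[4pt]
\frac{1}{N}\sum_{i=1}^{N}
  \log\frac{p\bigl(x_1^{(i)},\dots,x_K^{(i)}\bigr)}{\prod_{k=1}^{K} p\bigl(x_k^{(i)}\bigr)}
  &\;\underset{H_0}{\overset{H_1}{\gtrless}}\; \gamma
\end{align*}
```

When the data really are drawn from the joint distribution, the expected value of this normalized statistic is the KL divergence between the joint and the product of its marginals, which for K = 2 is the mutual information.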
Factorization Test • Two factorizations: • But we don't know the distributions: • Our best approximation (like a generalized likelihood ratio, GLR): • Given N i.i.d. observations:
Factorization Test (cont.) • Given N i.i.d. observations: • True joint distribution • True independent distribution
Factorization Test (cont.) • For the 2-variable case • For the 2-variable Gaussian case • In general: Questions: • How do we do density estimation? • Can we compute this value when x is high-dimensional? • How do we decide between F1 and F0?
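For the 2-variable Gaussian case, the mutual information has the well-known closed form I = -1/2 log(1 - rho^2), where rho is the correlation coefficient. The sketch below is an illustration under that Gaussian assumption, not the slide's own formula (which was lost in extraction); the function name and the test value rho = 0.75 are chosen here for the example.

```python
import numpy as np

def gaussian_mutual_information(x, y):
    """Plug-in mutual information (in nats) for two scalar variables,
    assuming they are jointly Gaussian."""
    rho = np.corrcoef(x, y)[0, 1]          # sample correlation coefficient
    return -0.5 * np.log(1.0 - rho ** 2)   # closed form for a bivariate Gaussian

# Illustrative check on synthetic data (rho_true is an assumed example value).
rng = np.random.default_rng(0)
rho_true = 0.75
cov = [[1.0, rho_true], [rho_true, 1.0]]
samples = rng.multivariate_normal([0.0, 0.0], cov, size=20000)

mi_hat = gaussian_mutual_information(samples[:, 0], samples[:, 1])
mi_true = -0.5 * np.log(1.0 - rho_true ** 2)
print(f"estimated MI = {mi_hat:.3f} nats, closed form = {mi_true:.3f} nats")
```

If the data are not Gaussian, this quantity only captures linear correlation and can miss dependence entirely, which is one motivation for sample-based density estimates.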
Sample Based Density Estimates
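One common way to build a sample-based dependency measure is to plug a kernel density estimate into an entropy estimate and combine entropies into mutual information. The sketch below is a minimal illustration of that idea, not the authors' estimator: the Gaussian kernel, the rule-of-thumb bandwidth, the resubstitution entropy estimate, and all function names are assumptions made for this example.

```python
import numpy as np

def _as_2d(a):
    """Return samples as an (n, d) float array."""
    a = np.asarray(a, dtype=float)
    return a.reshape(-1, 1) if a.ndim == 1 else a

def kde_log_density(points, queries, bandwidth):
    """Log of an isotropic Gaussian kernel density estimate built on
    `points`, evaluated at `queries`."""
    n, d = points.shape
    diff = queries[:, None, :] - points[None, :, :]        # shape (m, n, d)
    sq_dist = np.sum(diff ** 2, axis=2) / bandwidth ** 2   # shape (m, n)
    log_kernel = -0.5 * sq_dist - 0.5 * d * np.log(2 * np.pi) - d * np.log(bandwidth)
    peak = log_kernel.max(axis=1, keepdims=True)           # stable log-mean-exp
    log_mean = peak + np.log(np.mean(np.exp(log_kernel - peak), axis=1, keepdims=True))
    return log_mean.ravel()

def kde_entropy(samples):
    """Resubstitution entropy estimate: minus the mean log KDE at the samples."""
    samples = _as_2d(samples)
    n, d = samples.shape
    bandwidth = (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))   # rule of thumb, unit-variance data
    return -np.mean(kde_log_density(samples, samples, bandwidth))

def kde_mutual_information(x, y):
    """I(x; y) estimated as H(x) + H(y) - H(x, y), after standardizing each column."""
    x, y = _as_2d(x), _as_2d(y)
    x = (x - x.mean(axis=0)) / x.std(axis=0)
    y = (y - y.mean(axis=0)) / y.std(axis=0)
    return kde_entropy(x) + kde_entropy(y) - kde_entropy(np.hstack([x, y]))

# Illustrative comparison on synthetic data.
rng = np.random.default_rng(0)
n = 400
x = rng.normal(size=n)
mi_dep = kde_mutual_information(x, x + 0.5 * rng.normal(size=n))  # dependent pair
mi_ind = kde_mutual_information(x, rng.normal(size=n))            # independent pair
print(f"dependent: {mi_dep:.3f} nats, independent: {mi_ind:.3f} nats")
```

The pairwise-distance array grows as n^2 times the dimension, and kernel estimates degrade quickly as the dimension increases, so this sketch is only practical for small, low-dimensional samples.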
High-Dimensional Data • From the Data Processing Inequality:
High-Dimensional Data (cont.) • Sufficiency: • For high-dimensional data: maximize the left side of the bound • Gaussian with linear projections: closed-form solution (eigenvalue problem), Kullback 68 • Nonparametric: gradient descent, Ihler and Fisher 03
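For the Gaussian case with linear projections, one standard closed-form instance is canonical correlation analysis (CCA): the pair of one-dimensional projections with the largest correlation also has the largest Gaussian mutual information, and it is found by an eigenvalue (here, singular value) decomposition. Whether this matches the slide's exact formulation is an assumption; the function names, the regularizer `reg`, and the synthetic data are hypothetical.

```python
import numpy as np

def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

def top_canonical_pair(x, y, reg=1e-8):
    """Return projection vectors a, b and the top canonical correlation rho.

    x: (n, dx) array, y: (n, dy) array. `reg` is a small ridge term added
    for numerical stability (an assumption of this sketch).
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    cxx = x.T @ x / n + reg * np.eye(x.shape[1])
    cyy = y.T @ y / n + reg * np.eye(y.shape[1])
    cxy = x.T @ y / n
    wx, wy = inv_sqrt(cxx), inv_sqrt(cyy)
    # SVD of the whitened cross-covariance; equivalent to the CCA eigenproblem.
    u, s, vt = np.linalg.svd(wx @ cxy @ wy)
    return wx @ u[:, 0], wy @ vt[0], s[0]

# Synthetic example: a shared scalar latent hidden in one coordinate of each view.
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
x = rng.normal(size=(n, 5))
y = rng.normal(size=(n, 5))
x[:, 0] += 2.0 * z
y[:, 3] += 2.0 * z

a, b, rho = top_canonical_pair(x, y)
projected_corr = np.corrcoef(x @ a, y @ b)[0, 1]
gaussian_mi = -0.5 * np.log(1.0 - rho ** 2)   # MI of the projected pair under a Gaussian model
print(f"rho = {rho:.3f}, Gaussian MI of projections = {gaussian_mi:.3f} nats")
```

Because the projection is fit on the same samples used to measure correlation, `rho` is optimistically biased in small samples, which is one reason the significance of the resulting statistic needs to be assessed separately.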
Swiss Roll (figure panels: 3D data; PCA 2D projection; max. KL 2D optimization)
Significance
More significance
Synthetic Data High-dimensional observations, low-dimensional latent variable, dependency via distracter noise in the high-dimensional space • M: controls the number of dimensions over which the dependency information is uniformly distributed • D: controls the total dimensionality of our K observations
Experiments • 100 trials with samples of dependent data • 100 trials with samples of independent data • Each trial gives a statistic and a significance pf
Gaussian Data
Gaussian, rho = 0.75
3D Ball Data
Significance: Permutations good
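A permutation test is a standard way to attach significance to a dependency statistic: shuffling one variable relative to the other destroys any dependence while preserving both marginals, so the shuffled statistics approximate the distribution under independence. The sketch below assumes a two-variable setting and uses absolute correlation as a placeholder statistic; the function names and parameters are hypothetical, and any other dependency measure could be passed in instead.

```python
import numpy as np

def abs_correlation(x, y):
    """Placeholder dependency statistic (an assumption of this sketch)."""
    return abs(np.corrcoef(x, y)[0, 1])

def permutation_p_value(x, y, statistic, n_permutations=500, seed=0):
    """Estimate how often independent (permuted) data give a statistic
    at least as large as the observed one."""
    rng = np.random.default_rng(seed)
    observed = statistic(x, y)
    null_stats = np.empty(n_permutations)
    for i in range(n_permutations):
        # Permuting y breaks the pairing with x but keeps both marginals.
        null_stats[i] = statistic(x, y[rng.permutation(len(y))])
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (1 + np.sum(null_stats >= observed)) / (1 + n_permutations)
    return observed, p_value

# Illustrative use on synthetic data.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
stat_dep, p_dep = permutation_p_value(x, x + rng.normal(size=200), abs_correlation)
stat_ind, p_ind = permutation_p_value(x, rng.normal(size=200), abs_correlation)
print(f"dependent: stat = {stat_dep:.3f}, p = {p_dep:.3f}")
print(f"independent: stat = {stat_ind:.3f}, p = {p_ind:.3f}")
```

The smallest p-value this procedure can report is 1 / (n_permutations + 1), so the number of permutations bounds the resolution of the significance estimate.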
Multi-camera
Conclusions • Nice general framework • Permutations allow us to draw independent samples • We have shown cases where Gaussian assumptions fail and PCA is a poor choice for dimensionality reduction
Future Work • More experiments on real data • Better optimization procedure
Applications • What vision problems can we solve with accurate measures of dependency? - Data association, correspondence - Feature selection - Learning structure • We will specifically discuss: - Correspondence (for multi-camera tracking) - Audio-visual association
Audio-Visual Association Useful for: • Speaker localization - Helps improve human-computer interaction - Helps source separation • Automatic transcription of archival video - Who is speaking? - Are they seen by the camera?
Multi-camera Tracking
Hypotheses Camera X Camera Y VS
Maximal Correspondence
Distributions of Transition Times (axis label: transition time)
Discussion and Future Work • Dependence underlies various vision-related problems. • We studied a framework for measuring dependence. • Measure significance (how confident are you?) • Make it more robust.
Math (oh no!) For the 2-variable case
Outline • Applications: (for computer vision) • Problem Formulation: (Hypothesis Testing) • Computation: (Non-parametric entropy estimation) • Curse of Dimensionality: (Informative Statistics) • Correspondence: (Markov Chain Monte Carlo)
Previous Talks • Greg: Model dependence between features and class • Kristen: Model dependence between features and a scene • Ariadna: Model dependency between intra-class features • Wanmei: Dependency between protocol signal and voxel response • Chris: Audio and video dependence with events • Antonio: Contextual Dependence • Corey: “Inferring Dependencies”