Latent Dirichlet Allocation
David M. Blei, Andrew Y. Ng, Michael I. Jordan
Journal of Machine Learning Research 3 (2003), pp. 993-1022
Presented by: Aditya M. Prakash & Arsenios N. Tsokas
Advanced Machine Learning, Fall 2017
The Information Retrieval Problem: Text Corpora Modelling
- Given a collection of documents, find a short representation of its members.
- Preserve the statistical relationships necessary for further analysis.
- Applies to text modelling and to discrete data in general.
The bag-of-words assumption: The TF-IDF scheme
The bag-of-words assumption, after TF-IDF: A Simple Approach
- Cosine similarity of document vectors
- Example: WikiLeaks Iraq SIGACT ("significant action") reports from December 2006 [J. Stray, J. Burgess, 2010]
- Threshold of 0.6 for edges
- Clusters with dominant words appear.
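The approach on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the analysis from the cited example: the four toy documents are hypothetical, the TF-IDF weighting (relative term frequency times log inverse document frequency) is only one common variant, and treating "threshold 0.6" as "similarity of at least 0.6" is an assumption.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the reports mentioned on the slide are not reproduced.
docs = {
    "d1": "ied attack on convoy near checkpoint",
    "d2": "ied attack on convoy near checkpoint reported",
    "d3": "detainee transferred to police station",
    "d4": "detainee transferred to police station overnight",
}

def tfidf_vectors(docs):
    """Return {doc_id: {term: tf * idf}} under the bag-of-words assumption."""
    tokens = {d: text.split() for d, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for words in tokens.values() for term in set(words))
    vectors = {}
    for d, words in tokens.items():
        tf = Counter(words)
        vectors[d] = {t: (c / len(words)) * math.log(n_docs / df[t])
                      for t, c in tf.items()}
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_edges(vectors, threshold=0.6):
    """Connect every pair of documents whose similarity reaches the threshold."""
    ids = sorted(vectors)
    return [(a, b) for i, a in enumerate(ids) for b in ids[i + 1:]
            if cosine(vectors[a], vectors[b]) >= threshold]

vectors = tfidf_vectors(docs)
edges = similarity_edges(vectors, threshold=0.6)
```

On this toy corpus the thresholded graph has two connected pairs, one per theme, which is the small-scale analogue of the clusters with dominant words mentioned above.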
The bag-of-words assumption, after TF-IDF: Latent Semantic Indexing (LSI)
- Apply singular value decomposition to the TF-IDF matrix.
- Achieves significant compression.
- Captures some aspects of synonymy and polysemy.
- Probabilistic LSI (p-LSI), or the aspect model, is a significant step forward.
- Each document is modelled as a probability distribution over topics.
- Neither LSI nor p-LSI provides a generative probabilistic model of corpora.
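To make the LSI step concrete, here is a small sketch that computes a rank-2 truncated SVD of a term-document matrix. Several things are assumptions made for illustration: the matrix holds raw counts standing in for TF-IDF weights, the five terms and four documents are hypothetical, and the SVD is computed with plain power iteration and deflation so that the example needs only the standard library (a numerical library would normally be used instead).

```python
import math

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

def truncated_svd(A, k, iters=200):
    """Top-k singular triplets (sigma, u, v) via power iteration with deflation."""
    R = [[float(x) for x in row] for row in A]  # residual matrix
    triplets = []
    for _ in range(k):
        Rt = transpose(R)
        v = [1.0] * len(R[0])
        for _ in range(iters):
            w = matvec(Rt, matvec(R, v))  # one power-iteration step on R^T R
            n = norm(w)
            if n == 0.0:
                break
            v = [x / n for x in w]
        u = matvec(R, v)
        sigma = norm(u)
        if sigma == 0.0:
            break
        u = [x / sigma for x in u]
        triplets.append((sigma, u, v))
        # Deflate: remove the component that was just found.
        R = [[R[i][j] - sigma * u[i] * v[j] for j in range(len(v))]
             for i in range(len(u))]
    return triplets

# Rows: car, automobile, engine, flower, petal. Columns: four documents.
A = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
triplets = truncated_svd(A, k=2)
# Coordinates of each document in the 2-dimensional latent space.
doc_coords = [[sigma * v[j] for sigma, _, v in triplets] for j in range(len(A[0]))]
raw_docs = transpose(A)
```

Documents 1 and 2 use different words ("car" versus "automobile") and share only "engine", so their raw cosine similarity is 0.5. In the rank-2 latent space they land on essentially the same point, which is the sense in which LSI can capture synonymy.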
Latent Variable Models: The Unigram Model
Latent Variable Models: The Mixture of Unigrams
Latent Variable Models: Probabilistic Latent Semantic Indexing (p-LSI)
Latent Variable Models: Latent Dirichlet Allocation (LDA)
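The generative process behind LDA can be sketched as follows: draw topic proportions for the document from a Dirichlet, then for each word draw a topic from those proportions and a word from that topic's word distribution. This is an illustrative sketch under stated assumptions: the vocabulary, `alpha` and `beta` values are hypothetical, and the document length is fixed rather than drawn from a Poisson as in the paper.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(alpha, beta, vocab, n_words, rng):
    """Generate one document; returns (theta, topic assignments, words)."""
    theta = sample_dirichlet(alpha, rng)           # per-document topic proportions
    topic_ids = list(range(len(alpha)))
    assignments, words = [], []
    for _ in range(n_words):
        z = rng.choices(topic_ids, weights=theta)[0]   # topic for this word
        w = rng.choices(vocab, weights=beta[z])[0]     # word given the topic
        assignments.append(z)
        words.append(w)
    return theta, assignments, words

# Hypothetical two-topic model over a four-word vocabulary.
vocab = ["gene", "dna", "ball", "team"]
alpha = [0.5, 0.5]
beta = [
    [0.5, 0.5, 0.0, 0.0],   # topic 0 emits only "gene" and "dna"
    [0.0, 0.0, 0.5, 0.5],   # topic 1 emits only "ball" and "team"
]
rng = random.Random(0)
theta, assignments, words = generate_document(alpha, beta, vocab, 20, rng)
```

Because each word gets its own topic draw, a single document can mix several topics, which is the key difference from the mixture of unigrams.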
Latent Variable Models: A comparison
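For reference, the four models being compared can be written side by side. The notation is assumed to follow the paper: a document is a sequence of N words w = (w_1, ..., w_N), z denotes a topic, theta the per-document topic proportions, and alpha, beta the corpus-level parameters.

```latex
\begin{align*}
\text{Unigram:} \quad
  & p(\mathbf{w}) = \prod_{n=1}^{N} p(w_n) \\
\text{Mixture of unigrams:} \quad
  & p(\mathbf{w}) = \sum_{z} p(z) \prod_{n=1}^{N} p(w_n \mid z) \\
\text{p-LSI:} \quad
  & p(d, w_n) = p(d) \sum_{z} p(w_n \mid z)\, p(z \mid d) \\
\text{LDA:} \quad
  & p(\mathbf{w} \mid \alpha, \beta) = \int p(\theta \mid \alpha)
    \left( \prod_{n=1}^{N} \sum_{z_n} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta) \right) d\theta
\end{align*}
```

The mixture of unigrams assigns one topic to a whole document, p-LSI ties the topic proportions to a training-document index d, and LDA treats the proportions as a random variable drawn from a Dirichlet.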
Gaussian Mixture Formulation
Gaussian Mixtures: EM Algorithm for parameter inference
EM for Gaussian Mixtures: EM Algorithm for parameter inference
3. M Step: Re-estimate the parameters using the current responsibilities by maximizing the expected complete-data log-likelihood.
4. Check for convergence.
EM for Gaussian Mixtures [Figure panels: Iteration 1, E step; Iteration 1, M step; Final clustering]
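The EM loop for a Gaussian mixture can be sketched as below. This is a minimal one-dimensional version written for illustration: the data, the initial parameters, the variance floor and the convergence tolerance are all assumptions, and the slides may have used a multivariate formulation.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, means, variances, weights, max_iters=200, tol=1e-8):
    """EM for a 1-D Gaussian mixture; returns (means, variances, weights, log-lik)."""
    means, variances, weights = list(means), list(variances), list(weights)
    prev_ll = -math.inf
    ll = prev_ll
    for _ in range(max_iters):
        # E step: responsibility of each component for each data point.
        resp, ll = [], 0.0
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            ll += math.log(total)
            resp.append([j / total for j in joint])
        # Convergence check on the log-likelihood under the current parameters.
        if ll - prev_ll < tol:
            break
        prev_ll = ll
        # M step: re-estimate parameters from the current responsibilities.
        for k in range(len(means)):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)   # floor to avoid a collapsing component
            weights[k] = nk / len(data)
    return means, variances, weights, ll

# Hypothetical data: two well-separated groups around 1.0 and 5.0.
data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
means, variances, weights, log_lik = em_gmm_1d(
    data, means=[0.0, 6.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
```

With these starting values the two component means move to the centres of the two groups and the mixing weights settle near one half each.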
Inference for LDA
Variational Inference
- Because the posterior is intractable, an approximate distribution q is assumed; it yields a lower bound on the log-likelihood, which is then made as tight as possible.
Application of Jensen's Inequality
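The bound obtained from Jensen's inequality can be written out as follows. The notation is assumed to follow the paper: the variational distribution factorises as q(theta, z | gamma, phi) = q(theta | gamma) prod_n q(z_n | phi_n), with free parameters gamma and phi.

```latex
\begin{align*}
\log p(\mathbf{w} \mid \alpha, \beta)
  &= \log \int \sum_{\mathbf{z}}
     \frac{p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta)\, q(\theta, \mathbf{z})}
          {q(\theta, \mathbf{z})} \, d\theta \\
  &\ge \int \sum_{\mathbf{z}} q(\theta, \mathbf{z})
     \log p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta) \, d\theta
     - \int \sum_{\mathbf{z}} q(\theta, \mathbf{z}) \log q(\theta, \mathbf{z}) \, d\theta \\
  &= \mathrm{E}_q\!\left[ \log p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta) \right]
     - \mathrm{E}_q\!\left[ \log q(\theta, \mathbf{z}) \right]
   = L(\gamma, \phi; \alpha, \beta)
\end{align*}
```

The gap between the two sides is the KL divergence between q and the true posterior, so maximising L over gamma and phi is equivalent to minimising that divergence:

```latex
\log p(\mathbf{w} \mid \alpha, \beta)
  = L(\gamma, \phi; \alpha, \beta)
  + \mathrm{D}\!\left( q(\theta, \mathbf{z} \mid \gamma, \phi)
      \,\|\, p(\theta, \mathbf{z} \mid \mathbf{w}, \alpha, \beta) \right)
```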
Variational EM Algorithm
Variational EM Algorithm: E-Step
The multinomial update is akin to a Bayesian update of the parameters after observing the words.
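A sketch of the per-document E-step, assuming the coordinate-ascent updates from the paper: phi_ni is proportional to beta_{i, w_n} * exp(digamma(gamma_i)), and gamma_i = alpha_i + sum_n phi_ni. The toy `alpha`, `beta` and word ids are hypothetical, the fixed iteration count replaces a proper convergence test, and `digamma` is a small stdlib-only helper written for this example (a numerical library would normally supply it).

```python
import math

def digamma(x):
    """Digamma for x > 0: shift x up by recurrence, then use the asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (result + math.log(x) - 0.5 / x
            - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252)))

def e_step(word_ids, alpha, beta, iters=50):
    """Variational E-step for one document; returns (gamma, phi)."""
    k, n = len(alpha), len(word_ids)
    phi = [[1.0 / k] * k for _ in range(n)]     # initial topic responsibilities
    gamma = [a + n / k for a in alpha]          # initial Dirichlet parameters
    for _ in range(iters):
        # exp(E_q[log theta_i]) up to a factor that cancels when normalising.
        weight = [math.exp(digamma(g)) for g in gamma]
        for pos, w in enumerate(word_ids):
            unnorm = [beta[i][w] * weight[i] for i in range(k)]
            total = sum(unnorm)
            phi[pos] = [u / total for u in unnorm]
        gamma = [alpha[i] + sum(row[i] for row in phi) for i in range(k)]
    return gamma, phi

# Hypothetical two-topic model over a four-word vocabulary (all entries > 0).
alpha = [0.5, 0.5]
beta = [
    [0.45, 0.45, 0.05, 0.05],
    [0.05, 0.05, 0.45, 0.45],
]
word_ids = [0, 1, 0, 1, 2]   # mostly words favoured by topic 0
gamma, phi = e_step(word_ids, alpha, beta)
```

Each row of `phi` is a posterior-like distribution over topics for one word, and `gamma` accumulates those rows on top of the prior `alpha`, so its entries always sum to the sum of `alpha` plus the document length.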
Variational EM Algorithm: M-Step
LDA Smoothing
- A large vocabulary leads to sparsity issues.
- New documents are likely to contain words that were not observed in the training documents.
- Maximum likelihood estimates of the multinomial parameters assign zero probability to such words, and hence to the new documents.
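The zero-probability problem and the effect of a Dirichlet prior can be seen on a single multinomial. This is a deliberate simplification, not the full smoothed LDA model: it uses one word distribution instead of per-topic distributions, and the vocabulary, training words and value of `eta` are hypothetical.

```python
import math
from collections import Counter

vocab = ["topic", "model", "word", "corpus", "latent"]
training_words = "topic model word topic model corpus".split()
counts = Counter(training_words)   # "latent" never appears in training

def mle(counts, vocab):
    """Maximum likelihood estimate: relative frequencies."""
    total = sum(counts[w] for w in vocab)
    return {w: counts[w] / total for w in vocab}

def dirichlet_smoothed(counts, vocab, eta):
    """Posterior mean under a symmetric Dirichlet(eta) prior on the multinomial."""
    total = sum(counts[w] for w in vocab)
    return {w: (counts[w] + eta) / (total + eta * len(vocab)) for w in vocab}

def log_prob(doc, probs):
    """Log-probability of a document under a bag-of-words multinomial."""
    return sum(math.log(probs[w]) if probs[w] > 0 else -math.inf for w in doc)

new_doc = ["latent", "topic", "model"]
mle_probs = mle(counts, vocab)
smoothed_probs = dirichlet_smoothed(counts, vocab, eta=0.1)
mle_score = log_prob(new_doc, mle_probs)            # -inf: one unseen word
smoothed_score = log_prob(new_doc, smoothed_probs)  # finite
```

A single unseen word drives the maximum likelihood score to minus infinity, while the smoothed estimate gives every vocabulary item positive probability and still sums to one.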
Applications: Document Modeling
LDA applied to document modeling
Applications: Document Classification
- Train a separate LDA model for each class to obtain generative models for classification.
- Use LDA for dimensionality reduction before applying a machine learning algorithm (e.g. SVM).
Applications: Collaborative Filtering
Making automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating).