Topic Significance Ranking for LDA Generative Models
- Slides: 17
Loulwah AlSumait, James Gentle, Daniel Barbará, Carlotta Domeniconi
ECML PKDD, Bled, Slovenia, September 7-11, 2009
Agenda
- Introduction
- Junk/Insignificant topic definitions
- Distance measures
- 4-phase Weighted Combination Approach
- Experimental results
- Conclusions and future work
Latent Dirichlet Allocation (LDA) Model, Blei, Ng, & Jordan (2003)
- Probabilistic generative model
- Hidden variables (topics) are associated with the observed text
- Dirichlet priors on document and topic distributions
- Generative process
- Inference process
  - Input: K
  - Output: Φ, θ
  - Exact inference is intractable
  - Approximation approaches
[Figure: LDA plate diagram with nodes z_i and w_i and plates N_d, D, and K]
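The generative process named on this slide can be illustrated with a short simulation. This is a minimal sketch of standard LDA sampling, not code from the presentation: the function name `generate_corpus`, the fixed document length `doc_len`, and the prior values `alpha` and `beta` are assumptions chosen for illustration.

```python
import numpy as np

def generate_corpus(K, W, D, doc_len, alpha=0.5, beta=0.5, seed=0):
    """Sample a toy corpus from the LDA generative process (hypothetical helper).

    K: number of topics, W: vocabulary size, D: number of documents.
    alpha, beta: symmetric Dirichlet priors (assumed values).
    """
    rng = np.random.default_rng(seed)
    # Topic-word distributions: one Dirichlet draw per topic, shape (K, W).
    phi = rng.dirichlet(np.full(W, beta), size=K)
    # Document-topic distributions: one Dirichlet draw per document, shape (D, K).
    theta = rng.dirichlet(np.full(K, alpha), size=D)
    docs = []
    for d in range(D):
        # Draw a topic assignment z_i for every token position.
        z = rng.choice(K, size=doc_len, p=theta[d])
        # Draw each word w_i from the distribution of its assigned topic.
        words = np.array([rng.choice(W, p=phi[k]) for k in z])
        docs.append(words)
    return phi, theta, docs

phi, theta, docs = generate_corpus(K=3, W=20, D=4, doc_len=15)
print(phi.shape, theta.shape, len(docs))
```

Inference runs in the opposite direction: given only the words and K, it estimates Φ and θ, which is why approximation approaches are needed.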
Topic Significance Ranking
- The setting of K has a critical effect on the inferred topics
- Most previous work examines the topics manually
- Quantify the semantic significance of topics
  - How different is the topic distribution from junk/insignificant topic distributions?
Topic Significance Ranking
- Example: 20 Newsgroups
The Volgenau School of Information Technology and Engineering, Department of Computer Science
Junk/Insignificant Topic Definitions
- Uniform Distribution Over Words
  - Uniformity of a topic: [equation lost in extraction]
- Vacuous Semantic Distribution
  - $p(w_i \mid k) = \phi_{ik}$
  - Vacuousness of a topic: [equation lost in extraction]
- Background Distribution
  - Background of a topic: [equation lost in extraction]
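The three junk/insignificant (J/I) reference distributions can be sketched as follows. This is an illustration under stated assumptions, not the authors' definitions: the slide's equations did not survive, so the vacuous distribution is assumed here to be the topic-weighted word marginal, the background distribution is assumed to be uniform over documents, and each topic's distribution over documents is assumed to be a normalised column of θ. The names `junk_distributions`, `phi`, and `theta` are hypothetical.

```python
import numpy as np

def junk_distributions(phi, theta):
    """Build J/I reference distributions (hypothetical helper, assumed forms).

    phi:   (K, W) array, phi[k, i] = p(w_i | k); each row sums to 1.
    theta: (D, K) array, theta[d, k] = p(k | d); each row sums to 1.
    """
    K, W = phi.shape
    D = theta.shape[0]
    # Uniform distribution over words: every word is equally likely.
    w_uniform = np.full(W, 1.0 / W)
    # Assumed vacuous distribution: p(w_i) = sum_k p(w_i | k) p(k),
    # with p(k) estimated as the mean topic proportion over documents.
    p_k = theta.mean(axis=0)
    w_vacuous = p_k @ phi
    # Assumed background distribution: uniform over documents.
    d_background = np.full(D, 1.0 / D)
    # Assumed per-topic distribution over documents: normalised columns of theta.
    topic_over_docs = (theta / theta.sum(axis=0, keepdims=True)).T  # (K, D)
    return w_uniform, w_vacuous, d_background, topic_over_docs

rng = np.random.default_rng(0)
phi = rng.dirichlet(np.ones(6), size=3)    # 3 topics, 6 words
theta = rng.dirichlet(np.ones(3), size=5)  # 5 documents
w_uniform, w_vacuous, d_background, topic_over_docs = junk_distributions(phi, theta)
print(w_vacuous.sum(), topic_over_docs.shape)
```

A topic's uniformity, vacuousness, and background values would then be its distance from the corresponding reference distribution.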
Distance Measures
- Symmetric KL-Divergence
  - Uniformity, Background, W-Vacuous
- Cosine Dissimilarity
  - Uniformity, W-Vacuous, Background
- Correlation Coefficient
  - Uniformity, W-Vacuous, Background
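The three measures listed on this slide can be sketched in a few lines. The exact formulas used in the presentation are not shown, so the details here are assumptions: the symmetric KL-divergence averages the two directions and adds a small `eps` to avoid taking the log of zero, and the correlation coefficient returns 0.0 when either vector is constant (as a uniform distribution is), where Pearson correlation is otherwise undefined.

```python
import numpy as np

def symmetric_kl(p, q, eps=1e-12):
    """Average of KL(p||q) and KL(q||p); eps smoothing is an assumption."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return 0.5 * (np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def cosine_dissimilarity(p, q):
    """One minus the cosine of the angle between the two vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 1.0 - float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))

def correlation_coefficient(p, q, tol=1e-12):
    """Pearson correlation; returns 0.0 for a constant vector (assumed convention)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    pc, qc = p - p.mean(), q - q.mean()
    denom = np.linalg.norm(pc) * np.linalg.norm(qc)
    if denom < tol:
        return 0.0
    return float(pc @ qc / denom)

topic = np.array([0.7, 0.2, 0.1, 0.0])
uniform = np.full(4, 0.25)
print(symmetric_kl(topic, uniform), cosine_dissimilarity(topic, uniform))
```

Larger KL and cosine values mean the topic is further from the junk distribution, while a larger correlation means it is closer, so the measures need to be brought to a common orientation before they are combined.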
Topic Significance Ranking: Multi-Criteria Weighted Combination
- 4 phases
  - Standardization procedure
    - Transform distances into standardized measures
      - Scores
      - Weights
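The slide does not spell out the standardization procedure, so the following is only one plausible sketch. It assumes a min-max rescaling to produce scores in [0, 1] and a sum normalisation to produce weights that add up to 1; both function names and both formulas are assumptions, not the presentation's method.

```python
import numpy as np

def minmax_score(dist):
    """Rescale distances over topics to [0, 1] (assumed scoring rule)."""
    dist = np.asarray(dist, dtype=float)
    span = dist.max() - dist.min()
    if span == 0:
        # All topics are equally distant: no topic stands out.
        return np.zeros_like(dist)
    return (dist - dist.min()) / span

def sum_weight(dist):
    """Normalise distances over topics to sum to 1 (assumed weighting rule)."""
    dist = np.asarray(dist, dtype=float)
    total = dist.sum()
    if total == 0:
        return np.full_like(dist, 1.0 / len(dist))
    return dist / total

# Hypothetical distances of three topics from one junk distribution.
distances = [2.0, 4.0, 6.0]
print(minmax_score(distances))
print(sum_weight(distances))
```

Either form puts distances computed with different measures on a comparable scale, which is what makes a later weighted combination meaningful.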
Topic Significance Ranking: 4 phases (Continued)
- Intra-Criterion Weighted Combination
  - Combine standardized measures of each J/I definition
    - Uniformity scores: S1_Uk, S2_Uk
    - W-Vacuous scores: S1_Vk, S2_Vk
    - Background scores: S1_Bk, S2_Bk
- Inter-Criteria Weighted Combination
  - Combine J/I scores and weights
- Topic Rank (TSR)
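The two combination phases can be sketched as nested weighted averages. This is a simplification under stated assumptions: the presentation's actual combination rule and weight values are not recoverable from the slide, so a plain linear combination is swapped in, and the names `combine`, `rank_topics`, `intra_w`, and `inter_w` as well as all numbers below are hypothetical.

```python
import numpy as np

def combine(measures, weights):
    """Weighted average of M per-topic vectors; measures has shape (M, K)."""
    measures = np.asarray(measures, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return weights @ measures / weights.sum()

def rank_topics(std_measures, intra_w, inter_w):
    """Combine within each criterion, then across criteria, then rank topics."""
    criteria = sorted(std_measures)
    # Intra-criterion phase: one combined score vector per J/I definition.
    per_criterion = np.array(
        [combine(std_measures[c], intra_w[c]) for c in criteria]
    )
    # Inter-criteria phase: merge the per-criterion scores into one vector.
    final = combine(per_criterion, [inter_w[c] for c in criteria])
    # Highest combined score first.
    order = np.argsort(-final, kind="stable")
    return final, order

# Hypothetical standardized measures (two per criterion) for three topics.
std_measures = {
    "uniformity": [[0.9, 0.1, 0.5], [0.8, 0.2, 0.4]],
    "vacuous": [[1.0, 0.0, 0.5], [0.7, 0.1, 0.6]],
    "background": [[0.6, 0.2, 0.4], [0.9, 0.3, 0.5]],
}
intra_w = {c: [1.0, 1.0] for c in std_measures}
inter_w = {c: 1.0 for c in std_measures}
final, order = rank_topics(std_measures, intra_w, inter_w)
print(order)
```

Keeping the two phases separate lets the weights within one J/I definition be tuned independently of how much that definition counts in the final rank.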
Experimental Results: Simulated Data
20 Newsgroups: Top 10 significant topics
20 Newsgroups: Lowest 10 significant topics
NIPS: Top 10 Significant Topics
NIPS: Lowest 10 Significant Topics
Individual vs. Combined Score: Simulated Data
Individual vs. Combined Score: 20 Newsgroups
Conclusions and Future Work
- Unsupervised numerical quantification of the topics' semantic significance
- Novel post analysis in LDA modeling
- Three J/I topic distributions
- 4 levels of weighted combination approach
- Future directions:
  - Analysis of TSR sensitivity to the approach, K, and weight settings
  - More J/I definitions
  - Tool to visualize topic evolution in online setting