Joint Sentiment/Topic Model for Sentiment Analysis Chenghua Lin

  • Slides: 15
Joint Sentiment/Topic Model for Sentiment Analysis Chenghua Lin & Yulan He CIKM 09

Main Idea
This paper proposes a novel probabilistic modeling framework based on Latent Dirichlet Allocation (LDA), called the joint sentiment/topic (JST) model, which detects sentiment and topic simultaneously from text.

Related Works
• Sentiment classification based on machine learning (e.g. supervised methods) requires a large amount of human annotation.
• A sentiment classification model trained in one domain does not work well in another domain.
• Topic/feature detection and sentiment classification are often performed separately, which ignores their mutual dependence.
  – e.g. ‘unpredictable’ (as in ‘unpredictable steering’):
    • Negative in an automobile review
    • Positive in a movie review

JST model
• 1. Fully unsupervised: no need for human annotation.
• 2. Detects sentiment and topic simultaneously by considering their mutual relation.

LDA vs. JST
(D documents, S sentiment labels, T topics, W vocabulary words)
• LDA: two matrices
  – D × T distribution: θ
  – T × W distribution: φ
• JST: three matrices
  – D × S distribution: π
  – D × S × T distribution: θ
  – S × T × W distribution: φ
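
To make the three JST matrices concrete, the following is a minimal sketch of their shapes and of how one word token could be drawn from them. It assumes the usual JST generative story (draw a sentiment label from π, then a topic from θ for that label, then a word from φ for that label/topic pair), with φ shared across documents and indexed by sentiment label and topic. The function names, hyperparameter values, and toy sizes are illustrative assumptions, not taken from the slides.

```python
import random


def dirichlet(concentration, size, rng):
    """Draw one sample from a symmetric Dirichlet via normalized Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [x / total for x in draws]


def init_jst_params(D, S, T, W, alpha=1.0, beta=0.1, gamma=1.0, seed=0):
    """Build pi (D x S), theta (D x S x T) and phi (S x T x W) as nested lists.

    The hyperparameter values are arbitrary placeholders, not the paper's.
    """
    rng = random.Random(seed)
    pi = [dirichlet(gamma, S, rng) for _ in range(D)]
    theta = [[dirichlet(alpha, T, rng) for _ in range(S)] for _ in range(D)]
    phi = [[dirichlet(beta, W, rng) for _ in range(T)] for _ in range(S)]
    return pi, theta, phi, rng


def generate_word(d, pi, theta, phi, rng):
    """Sample (sentiment label, topic, word id) for one token of document d."""
    # Assumed order: sentiment label first, then topic, then word.
    label = rng.choices(range(len(pi[d])), weights=pi[d])[0]
    topic = rng.choices(range(len(theta[d][label])), weights=theta[d][label])[0]
    word = rng.choices(range(len(phi[label][topic])), weights=phi[label][topic])[0]
    return label, topic, word
```

In plain LDA the first step is absent: a topic is drawn directly from the document's θ row, which is why only two matrices are needed there.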

LDA vs. JST (cont.) (figure panels: LDA, JST)

Process of JST

Incorporating Model Priors
• One direction for improving sentiment detection accuracy is to incorporate prior information or a subjectivity lexicon (i.e., words bearing positive or negative polarity), which can be obtained in several ways:
  – Paradigm word list
  – Mutual information
  – Full subjectivity lexicon
  – Filtered subjectivity lexicon
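
The slide does not say how the lexicon enters the model. One plausible reading, sketched below purely as an assumption, is to use it when sentiment labels are first assigned to word tokens: a token found in the lexicon takes the lexicon's label, and every other token gets a random label. The function name `init_sentiment_labels`, the label encoding, and the toy lexicon are all hypothetical.

```python
import random

POSITIVE, NEGATIVE = 0, 1

# Toy lexicon for illustration only; a real one would come from a paradigm
# word list, mutual information, or a (filtered) subjectivity lexicon.
toy_lexicon = {
    "good": POSITIVE,
    "excellent": POSITIVE,
    "bad": NEGATIVE,
    "boring": NEGATIVE,
}


def init_sentiment_labels(tokens, lexicon, num_labels, seed=0):
    """Assign an initial sentiment label to each token of one document."""
    rng = random.Random(seed)
    labels = []
    for token in tokens:
        if token in lexicon:
            # Prior knowledge: keep the polarity given by the lexicon.
            labels.append(lexicon[token])
        else:
            # No prior knowledge: start from a random label.
            labels.append(rng.randrange(num_labels))
    return labels
```

For example, `init_sentiment_labels(["good", "plot", "boring"], toy_lexicon, 2)` fixes the first and last labels to positive and negative, while the label for "plot" is drawn at random.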

Experiment
• Sentiment Classification: only two sentiment labels are considered, i.e. positive or negative
• Topic Extraction
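
With only two labels, a document-level decision can be read off the document's sentiment distribution π. The sketch below assumes π is estimated from how many of the document's tokens were assigned to each label, smoothed by a symmetric prior γ, and that the document takes the label with the larger probability. The function names, the value of γ, and the counts in the usage note are illustrative assumptions.

```python
def estimate_pi(label_counts, gamma=0.3):
    """Smoothed estimate of one document's sentiment distribution.

    label_counts[l] is the number of tokens in the document assigned to
    sentiment label l; gamma is an assumed symmetric Dirichlet prior.
    """
    total = sum(label_counts)
    num_labels = len(label_counts)
    return [(count + gamma) / (total + num_labels * gamma) for count in label_counts]


def classify_document(label_counts, label_names=("positive", "negative"), gamma=0.3):
    """Return the most probable sentiment label and the estimated distribution."""
    pi_d = estimate_pi(label_counts, gamma)
    best = max(range(len(pi_d)), key=pi_d.__getitem__)
    return label_names[best], pi_d
```

A document with 30 tokens labelled positive and 12 labelled negative would be classified as positive by `classify_document([30, 12])`.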

Sentiment Classification

Sentiment Classification (cont.)

Summary 1
• 1. The classification performance of JST is very close to the best performance of ML approaches, but it saves a lot of annotation work.
• 2. Topic information indeed helps sentiment classification: the JST model with a mixture of topics consistently outperforms a simple LDA model that ignores the mixture of topics.

Topic Extraction

Summary 2:
• Manual examination of the data reveals that terms that do not seem to convey sentiment under these two topics in fact appear in contexts expressing positive sentiment. The above analysis illustrates the effectiveness of JST in extracting a mixture of topics from a corpus.

Conclusion
• 1. Presented a joint sentiment/topic (JST) model that detects document-level sentiment and extracts a mixture of topics from text simultaneously.
• 2. The model is fully unsupervised, which provides more flexibility and makes it easier to adapt to other domains.
• 3. It yields competitive performance in document-level sentiment classification compared with existing supervised approaches.
• 4. It discovers topics that correspond to positive/negative sentiment.