Joint Sentiment/Topic Model for Sentiment Analysis
Chenghua Lin & Yulan He
CIKM '09

Main Idea
This paper proposes a novel probabilistic modeling framework based on Latent Dirichlet Allocation (LDA), called the joint sentiment/topic (JST) model, which detects sentiment and topic simultaneously from text.

Related Works
• Sentiment classification based on machine learning (e.g. supervised learning) requires a large amount of human annotation.
• A sentiment classification model trained in one domain often does not work well in another domain.
• Topic/feature detection and sentiment classification are often performed separately, which ignores their mutual dependence.
  – e.g. 'unpredictable': negative in an automobile review ('unpredictable steering'), positive in a movie review ('unpredictable plot')

JST model
• 1. Fully unsupervised: no human annotation is needed.
• 2. Detects sentiment and topic simultaneously by considering their mutual relation.

LDA vs. JST
(D documents, S sentiment labels, T topics, W vocabulary words)
• LDA: two matrices
  – D × T distribution: θ
  – T × W distribution: φ
• JST: three matrices
  – D × S distribution: π
  – D × S × T distribution: θ
  – S × T × W distribution: φ
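To make the difference in parameter structure concrete, the following minimal sketch allocates the two LDA matrices and the three JST arrays as nested lists and reports their dimensions. The function names are hypothetical, the distributions are simply initialized to uniform, and φ in JST is assumed here to hold one word distribution per sentiment-topic pair (S × T × W), shared across documents.

```python
def uniform(size):
    """Return a uniform probability vector of the given length."""
    return [1.0 / size] * size

def lda_parameters(D, T, W):
    theta = [uniform(T) for _ in range(D)]   # D x T: topic mix per document
    phi = [uniform(W) for _ in range(T)]     # T x W: word distribution per topic
    return theta, phi

def jst_parameters(D, S, T, W):
    pi = [uniform(S) for _ in range(D)]                         # D x S: sentiment mix per document
    theta = [[uniform(T) for _ in range(S)] for _ in range(D)]  # D x S x T: topic mix per document and label
    phi = [[uniform(W) for _ in range(T)] for _ in range(S)]    # S x T x W (assumed): words per label and topic
    return pi, theta, phi

def shape(nested):
    """Dimensions of a rectangular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

if __name__ == "__main__":
    pi, theta, phi = jst_parameters(D=3, S=2, T=4, W=5)
    print(shape(pi), shape(theta), shape(phi))
```

The extra sentiment dimension S is what lets JST keep a separate topic mixture for each sentiment label within a document.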

LDA vs. JST (cont.) [Figure not recovered; panels: LDA, JST]

Process of JST
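The content of this slide did not survive, so the following is only a rough sketch of a JST-style generative process, not the authors' code. It assumes each word is produced in three steps: draw a sentiment label from the document's sentiment distribution π, draw a topic from the topic distribution θ for that label, then draw a word from the word distribution φ for that label and topic. The symmetric Dirichlet hyperparameters and all function names are illustrative assumptions.

```python
import random

def dirichlet(alpha, size, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) via normalized gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [x / total for x in draws]

def generate_document(n_words, S, T, vocab, phi, rng, gamma=1.0, alpha=1.0):
    """Generate one document; phi[l][z] is a word distribution over vocab."""
    pi_d = dirichlet(gamma, S, rng)                          # sentiment mix of this document
    theta_d = [dirichlet(alpha, T, rng) for _ in range(S)]   # one topic mix per sentiment label
    tokens = []
    for _ in range(n_words):
        l = rng.choices(range(S), weights=pi_d)[0]           # step 1: sentiment label
        z = rng.choices(range(T), weights=theta_d[l])[0]     # step 2: topic given the label
        w = rng.choices(vocab, weights=phi[l][z])[0]         # step 3: word given label and topic
        tokens.append((w, l, z))
    return pi_d, tokens

if __name__ == "__main__":
    rng = random.Random(0)
    vocab = ["plot", "acting", "good", "bad", "boring"]      # toy vocabulary (hypothetical)
    S, T = 2, 3
    phi = [[dirichlet(1.0, len(vocab), rng) for _ in range(T)] for _ in range(S)]
    pi_d, tokens = generate_document(10, S, T, vocab, phi, rng)
    print(pi_d)
    print(tokens)
```

Inference runs in the opposite direction: given only the words, the labels and topics are the latent variables to be estimated.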

Incorporating Model Priors
• One direction for improving sentiment detection accuracy is to incorporate prior information in the form of a subjectivity lexicon (i.e., words bearing positive or negative polarity), which can be obtained in several ways:
  – Paradigm word list
  – Mutual information
  – Full subjectivity lexicon
  – Filtered subjectivity lexicon
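One simple way to use such a lexicon, assumed here for illustration rather than taken from the slides, is to seed the initial sentiment label of each word token: tokens found in the lexicon get the lexicon's polarity, and all other tokens get a random label. The lexicon entries and function name below are hypothetical.

```python
import random

POSITIVE, NEGATIVE = 0, 1

# Tiny illustrative lexicon (hypothetical entries, not the word lists used in the paper).
LEXICON = {"good": POSITIVE, "excellent": POSITIVE, "bad": NEGATIVE, "boring": NEGATIVE}

def init_sentiment_labels(tokens, lexicon, n_labels, rng):
    """Assign an initial sentiment label to every token of a document."""
    labels = []
    for tok in tokens:
        if tok in lexicon:
            labels.append(lexicon[tok])            # prior knowledge fixes the starting label
        else:
            labels.append(rng.randrange(n_labels))  # no prior: start from a random label
    return labels

if __name__ == "__main__":
    rng = random.Random(1)
    doc = ["good", "plot", "but", "boring", "ending"]
    print(init_sentiment_labels(doc, LEXICON, n_labels=2, rng=rng))
```

A larger lexicon covers more tokens but may also inject more noisy polarity assignments, which is one plausible motivation for comparing a full lexicon against a filtered one.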

Experiment
• Sentiment Classification: only two sentiment labels are considered, i.e. positive or negative
• Topic Extraction
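With only two labels, document-level classification can be sketched as picking the more probable label under the document's estimated sentiment distribution π and scoring the predictions against gold labels. This is a minimal illustration under that assumption; the function names and the label order are hypothetical.

```python
def classify_document(pi_d, labels=("positive", "negative")):
    """Return the label with the highest probability in the document's sentiment distribution."""
    best = max(range(len(pi_d)), key=lambda i: pi_d[i])
    return labels[best]

def accuracy(predicted, gold):
    """Fraction of documents whose predicted label matches the gold label."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

if __name__ == "__main__":
    # Toy per-document sentiment distributions (made-up values for illustration).
    pis = [[0.7, 0.3], [0.2, 0.8], [0.55, 0.45]]
    gold = ["positive", "negative", "negative"]
    predicted = [classify_document(pi_d) for pi_d in pis]
    print(predicted, accuracy(predicted, gold))
```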

Sentiment Classification

Sentiment Classification (cont.)

Summary 1
• 1. The classification performance of JST is very close to the best performance of supervised machine learning (ML), while saving a lot of annotation work.
• 2. Topic information does help sentiment classification: the JST model with a mixture of topics consistently outperforms a simple LDA model that ignores the mixture of topics.

Topic Extraction

Summary 2
• Manual examination of the data reveals that terms which do not seem to convey sentiment under these two topics in fact appear in contexts expressing positive sentiment.
• This analysis illustrates the effectiveness of JST in extracting a mixture of topics from a corpus.

Conclusion
• 1. Presented a joint sentiment/topic (JST) model that detects document-level sentiment and extracts a mixture of topics from text simultaneously.
• 2. Fully unsupervised, and therefore more flexible and easier to adapt to other domains.
• 3. Yields competitive performance in document-level sentiment classification compared with existing supervised approaches.
• 4. Discovers topics that correspond to positive or negative sentiment.