Sentiment Analysis and Opinion Mining: An Introduction

Introduction
• Sentiment analysis (SA) or opinion mining
  ◦ computational study of opinion, sentiment, appraisal, evaluation, and emotion.
• Why is it important?
  ◦ Opinions are key influencers of our behaviors.
    ▪ Our beliefs and perceptions of reality are conditioned on how others see the world.
    ▪ Whenever we need to make a decision we often seek out the opinions of others.
  ◦ Rise of social media → opinion data
  ◦ Rise of AI and chatbots:
    ▪ Emotion and sentiment are key to human communication

Terms defined (Merriam-Webster)
• Sentiment: an attitude, thought, or judgment prompted by feeling.
  ◦ A sentiment is more of a feeling.
  ◦ “I am concerned about the current state of the economy.”
• Opinion: a view, judgment, or appraisal formed in the mind about a particular matter.
  ◦ A concrete view of a person about something.
  ◦ “I think the economy is not doing well.”

SA: A fascinating problem!
• Intellectually challenging & many applications.
  ◦ A popular research area in NLP and data mining
  ◦ Spread from CS to management and social sciences
  ◦ A large number of companies in the space globally
    ▪ > 300 in the US alone: every company that does text analysis
• It touches every aspect of NLP & is also confined.
  ◦ A “simple” semantic analysis problem.
• A major technology from NLP.
  ◦ But it is very hard.
  ◦ A very active research area

Roadmap
• Sentiment analysis problem
• Document and sentence level sentiment classification
• Aspect-based sentiment analysis
  ◦ Aspect extraction
  ◦ Aspect sentiment classification
• Mining comparative opinions
• Opinion lexicon generation
• Some interesting sentences
• Summary

Two main types of opinions (Jindal and Liu 2006; Liu, 2010)
• Regular opinions: Sentiment/opinion expressions on some target entities
  ◦ Direct opinions:
    ▪ “The touch screen is really cool.”
  ◦ Indirect opinions:
    ▪ “After taking the drug, my pain has gone.”
• Comparative opinions: Comparison of more than one entity.
  ◦ E.g., “iPhone is better than Blackberry.”
• We focus on regular opinions first, and just call them opinions.

(I): Definition of opinion
• Id: Abc123 on 5-1-2008: “I bought an iPhone yesterday. It is such a nice phone. The touch screen is really cool. The voice quality is great too. It is much better than my Blackberry. However, my mom was mad with me as I didn’t tell her before I bought the phone. She thought the phone was too expensive”
• Definition: An opinion is a quadruple (Liu, 2012), (target, sentiment, holder, time)
• This definition is concise, but not easy to use.
  ◦ Target can be complex, e.g., “I bought an iPhone. The voice quality is amazing.”
    ▪ Target = voice quality? (not quite)

A more practical definition (Hu and Liu 2004; Liu, 2010, 2012)
• An opinion is a quintuple (entity, aspect, sentiment, holder, time) where
  ◦ entity: target entity (or object).
  ◦ aspect: aspect (or feature) of the entity.
  ◦ sentiment: +, -, or neu, a rating, or an emotion.
  ◦ holder: opinion holder.
  ◦ time: time when the opinion was expressed.
• Aspect-based sentiment analysis
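The quintuple maps naturally onto a small record type. The sketch below is one possible encoding, not something the slides prescribe: the class name `Opinion`, the string encoding of the sentiment, and the "GENERAL" aspect value (borrowed from the iPhone review example in this deck) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """One opinion quintuple (entity, aspect, sentiment, holder, time)."""
    entity: str     # target entity (or object)
    aspect: str     # aspect of the entity; "GENERAL" stands for the entity as a whole
    sentiment: str  # "+", "-", or "neu" (a rating or an emotion label would also fit)
    holder: str     # opinion holder
    time: str       # when the opinion was expressed


# Values taken from the iPhone review example used in this deck.
op = Opinion("iPhone", "touch_screen", "+", "Abc123", "5-1-2008")
print(op)
```

Freezing the dataclass makes each opinion hashable, so quintuples can be stored in sets or used as dictionary keys when they are aggregated later.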

Our example blog in quintuples
• Id: Abc123 on 5-1-2008: “I bought an iPhone a few days ago. It is such a nice phone. The touch screen is really cool. The voice quality is great too. It is much better than my old Blackberry, which was a terrible phone and so difficult to type with its tiny keys. However, my mother was mad with me as I did not tell her before I bought the phone. She also thought the phone was too expensive, …”
• In quintuples
    (iPhone, GENERAL, +, Abc123, 5-1-2008)
    (iPhone, touch_screen, +, Abc123, 5-1-2008)
    …
  ◦ We will discuss comparative opinions later.

(II): Opinion summary (Hu and Liu 2004)
• With a lot of opinions, a summary is necessary.
  ◦ Not traditional text summary: from long to short.
  ◦ Text summarization: defined operationally based on algorithms that perform the task
• Opinion summary (OS) can be defined precisely,
  ◦ not dependent on how summary is generated.
• Opinion summary needs to be quantitative
  ◦ 60% positive is very different from 90% positive.
• Main form of OS: Aspect-based opinion summary

Opinion summary (Hu and Liu, 2004)
Aspect/feature-based summary of opinions about iPhone:
Aspect: Touch screen
Positive: 212
  • The touch screen was really cool.
  • The touch screen was so easy to use and can do amazing things.
  …
Negative: 6
  • The screen is easily scratched.
  • I have a lot of difficulty in removing finger marks from the touch screen.
  …
Aspect: voice quality
…
(Liu et al. 2005)
[Figure: opinion summary of 1 phone, positive (+) and negative (_) bars for voice, screen, battery, size, weight]
[Figure: opinion comparison of 2 phones, positive (+) and negative (_) bars]
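Once opinions are available as quintuples, the quantitative summary above is a matter of counting. The following is a minimal sketch, assuming the quintuples have already been extracted; the holders, dates, and the function name `summarize` are made up for illustration.

```python
from collections import Counter, defaultdict

# (entity, aspect, sentiment, holder, time) quintuples; holders and dates are invented.
opinions = [
    ("iPhone", "touch_screen", "+", "u1", "2008-05-01"),
    ("iPhone", "touch_screen", "+", "u2", "2008-05-02"),
    ("iPhone", "touch_screen", "-", "u3", "2008-05-03"),
    ("iPhone", "voice_quality", "+", "u1", "2008-05-01"),
]


def summarize(quintuples, entity):
    """Count sentiment labels per aspect for one entity."""
    summary = defaultdict(Counter)
    for ent, aspect, sentiment, _holder, _time in quintuples:
        if ent == entity:
            summary[aspect][sentiment] += 1
    return {aspect: dict(counts) for aspect, counts in summary.items()}


summary = summarize(opinions, "iPhone")
for aspect, counts in summary.items():
    total = sum(counts.values())
    print(f"{aspect}: {counts.get('+', 0)}/{total} positive")
```

Keeping raw counts rather than a single label is what makes the summary quantitative: the proportion of positive opinions per aspect can be compared across aspects or across products.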

Aspect-based opinion summary
[Figure]

Emotion
• No agreed set of basic emotions of people.
  ◦ Based on Parrott (2001), people have six basic emotions,
    ▪ love, joy, surprise, anger, sadness, and fear.
• Although related, emotions and opinions are not equivalent.
  ◦ Opinion: rational (+/-) view on something/target
    ▪ Cannot say “I like”
  ◦ Emotion: focusing on an inner feeling
    ▪ Can say “I am angry.” or “There is sadness in her eyes”

Definition of Emotion
• Definition (Emotion): It is a quintuple, (entity, aspect, emotion_type, feeler, time)
  ◦ E.g., “I am so mad with the hotel manager because he refused to refund my booking fee”
    ▪ entity: hotel
    ▪ aspect: manager
    ▪ emotion_type: anger
    ▪ feeler: I
    ▪ time: unknown
  ◦ The definition can also include the cause.

Emotion Expressions
• 1. use emotion or mood words or phrases
  ◦ E.g., love, disgust, angry, and upset
• 2. describe emotion-related behaviors,
  ◦ “He cried after he saw his mother” and
  ◦ “After he received the news, he jumped up and down for a few minutes like a small boy.”
• 3. use intensifiers: Common English intensifiers include
  ◦ very, so, extremely, dreadfully, really, awfully, etc.

Emotion Expressions (contd.)
• 4. use superlatives:
  ◦ many superlative expressions also express emotions, for example, “This car is simply the best”
• 5. use pejorative (e.g., “He is a fascist.”), laudatory (e.g., “He is a saint.”), and sarcastic expressions (e.g., “What a great car, it broke the second day”)
• 6. use swearing, cursing, insulting, blaming, accusing, and threatening expressions

Document sentiment classification
• Classify a whole opinion document (e.g., a review) based on the overall sentiment of the opinion holder (Pang et al 2002; Turney 2002)
  ◦ Classes: Positive, negative (possibly neutral)
• An example review:
  ◦ “I bought an iPhone a few days ago. It is such a nice phone, although a little large. The touch screen is cool. The voice quality is great too. I simply love it!”
  ◦ Classification: positive or negative?
• It is basically a text classification problem

Sentence sentiment analysis
• Classify the sentiment expressed in a sentence
  ◦ Classes: positive, negative, neutral
  ◦ Neutral means no sentiment expressed
    ▪ “I believe he went home yesterday.”
    ▪ “I bought an iPhone yesterday”
• But bear in mind
  ◦ Explicit opinion: “I like this car.”
  ◦ Fact-implied opinion: “I bought this car yesterday and it broke today.”
  ◦ Mixed opinion: “Apple is doing well in this poor economy”

Classification approaches
• Supervised learning methods to classify reviews into positive and negative.
  ◦ Early research
    ▪ Naïve Bayes, Maximum Entropy, Support Vector Machines (SVM), etc
  ◦ Recent research
    ▪ Deep learning
• Unsupervised methods
  ◦ Lexicon-based methods
    ▪ Using sentiment words and phrases: good, wonderful, awesome, troublesome, cost an arm and leg, …
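To make the supervised route concrete, here is a minimal multinomial Naïve Bayes sketch with add-one smoothing over bag-of-words tokens. It is a toy under stated assumptions, not the setup of any cited paper: the four training snippets are adapted from sentences in this deck, the labels are hand-assigned, and the function names `train_nb` and `predict_nb` are hypothetical.

```python
import math
from collections import Counter


def train_nb(docs):
    """docs: list of (tokens, label) pairs. Returns the counts predict_nb needs."""
    label_counts = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in label_counts}
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return label_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the label with the highest log posterior under add-one smoothing."""
    label_counts, word_counts, vocab = model
    n_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        total = sum(word_counts[label].values())
        score = math.log(label_counts[label] / n_docs)  # log prior
        for w in tokens:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


train = [
    ("such a nice phone".split(), "positive"),
    ("the voice quality is great".split(), "positive"),
    ("a terrible phone".split(), "negative"),
    ("the screen is easily scratched".split(), "negative"),
]
model = train_nb(train)
print(predict_nb(model, "nice great".split()))
```

A real system would train on thousands of labeled reviews and use richer features; the point here is only that document sentiment classification reduces to ordinary text classification.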

Features for supervised learning
• The problem has been studied by numerous researchers.
• Key: feature engineering. A large set of features has been tried by researchers. E.g.,
  ◦ Term frequency and different IR weighting schemes
  ◦ Part of speech (POS) tags
  ◦ Opinion words and phrases
  ◦ Negations
  ◦ Syntactic dependency

We need to go further
• Sentiment classification at both the document and sentence (or clause) levels is useful, but
  ◦ It does not find what people liked and disliked.
  ◦ I.e., it does not identify the targets of opinions, i.e.,
    ▪ Entities and their aspects
    ▪ Without knowing targets, opinions are of limited use.
• We need to go to the entity and aspect level.
  ◦ Aspect-based opinion mining and summarization.
  ◦ We thus need the full opinion definition.

Recall the opinion definition
• An opinion is a quintuple (entity, aspect, sentiment, holder, time) where
  ◦ entity: target entity (or object).
  ◦ aspect: aspect (or feature) of the entity.
  ◦ sentiment: +, -, or neu, a rating, or an emotion.
  ◦ holder: opinion holder.
  ◦ time: time when the opinion was expressed.
• Aspect-based sentiment analysis

Aspect extraction
• Goal: Given an opinion corpus, extract all aspects
• Four main approaches:
  ◦ (1) Finding frequent nouns and noun phrases
  ◦ (2) Exploiting opinion and target relations
  ◦ (3) Supervised learning
  ◦ (4) Topic modeling

(1) Frequent nouns and noun phrases (Hu and Liu 2004)
• Nouns (NN) that are frequently mentioned are likely to be true aspects (frequent aspects).
• Why?
  ◦ Most aspects are nouns or noun phrases
  ◦ When product aspects/features are discussed, the words people use often converge.
  ◦ Those frequent ones are usually the main aspects that people are interested in.
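The frequency idea can be sketched in a few lines. This is a simplification, not the procedure of Hu and Liu (2004): it assumes the sentences arrive already POS-tagged with Penn Treebank tags, counts single nouns only (no noun phrases), and uses a plain sentence-level support threshold. The function name `frequent_nouns`, the threshold, and the tagged sentences are illustrative.

```python
from collections import Counter


def frequent_nouns(tagged_sentences, min_support=0.5):
    """Return nouns that occur in at least `min_support` of the sentences."""
    sentence_count = Counter()
    for sentence in tagged_sentences:
        # Count each noun once per sentence (NN, NNS, NNP, NNPS all start with "NN").
        nouns = {word.lower() for word, tag in sentence if tag.startswith("NN")}
        sentence_count.update(nouns)
    threshold = min_support * len(tagged_sentences)
    return sorted(n for n, c in sentence_count.items() if c >= threshold)


tagged = [
    [("The", "DT"), ("screen", "NN"), ("is", "VBZ"), ("cool", "JJ")],
    [("The", "DT"), ("screen", "NN"), ("scratches", "VBZ"), ("easily", "RB")],
    [("My", "PRP$"), ("mom", "NN"), ("likes", "VBZ"), ("the", "DT"), ("screen", "NN")],
    [("Battery", "NN"), ("life", "NN"), ("is", "VBZ"), ("short", "JJ")],
]
candidates = frequent_nouns(tagged, min_support=0.5)
print(candidates)
```

On a realistic review corpus the support threshold would be far lower than 0.5; it is set high here only because the toy corpus has four sentences.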

Using part-of relationship and the Web (Popescu and Etzioni, 2005)
• Improved (Hu and Liu, 2004) by removing some frequent noun phrases that may not be aspects.
• It identifies part-of relationship
  ◦ Each noun phrase is given a pointwise mutual information (PMI) score between the phrase and part discriminators associated with the product class, e.g., a scanner class.
  ◦ E.g., “of scanner”, “scanner has”, etc, which are used to find parts of scanners by searching on the Web.
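The scoring formula itself did not survive in this transcript. The sketch below assumes the hit-count form in which this score is usually presented, PMI(a, d) = hits(a ∧ d) / (hits(a) · hits(d)), where a is a candidate noun phrase and d a part discriminator. Treat that form, the function name `pmi`, and the hit counts as assumptions; no real search API is called.

```python
def pmi(hits_both, hits_phrase, hits_discriminator):
    """PMI-style score from search hit counts: hits(a AND d) / (hits(a) * hits(d))."""
    if hits_phrase == 0 or hits_discriminator == 0:
        return 0.0  # no evidence either way
    return hits_both / (hits_phrase * hits_discriminator)


# Invented hit counts for two candidate phrases against the discriminator "of scanner".
score_cover = pmi(hits_both=50, hits_phrase=1000, hits_discriminator=2000)
score_time = pmi(hits_both=1, hits_phrase=1000, hits_discriminator=2000)
print(score_cover, score_time)
```

A candidate whose score falls below a chosen threshold would be dropped as unlikely to be a part of the product class.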

(2) Exploiting opinion & target relation
• Key idea: opinions have targets, i.e., opinion terms are used to modify aspects and entities.
  ◦ “The pictures are absolutely amazing.”
  ◦ “This is an amazing software.”
• The syntactic relation is approximated with the nearest noun phrases to the opinion word in (Hu and Liu 2004).
• The idea was generalized to
  ◦ syntactic dependency in (Zhuang et al 2006)
  ◦ double propagation in (Qiu et al 2009).
• A similar idea also in (Wang and Wang 2008)

Extract aspects using DP (Qiu et al. 2009; 2011)
• Double propagation (DP)
  ◦ Based on the definition earlier, an opinion should have a target, entity or aspect.
• Use dependency of opinions & aspects to extract both aspects & opinion words.
  ◦ Knowing one helps find the other.
  ◦ E.g., “The rooms are spacious”
• It extracts both aspects and opinion words.
  ◦ A domain independent method.

The DP method
• DP is a bootstrapping method
  ◦ Input: a set of seed opinion words,
  ◦ no aspect seeds needed
• Based on dependency grammar (Tesniere 1959).
  ◦ “This phone has good screen”
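The bootstrapping loop can be illustrated with a heavily simplified sketch. It is not the rule set of Qiu et al.: it assumes dependency edges are already available as (head, relation, dependent) triples using Universal Dependencies labels (`amod`, `nsubj`, `conj`), it assumes coarse POS tags are given, and it uses only three hand-picked propagation rules. All names and example edges are hypothetical.

```python
def double_propagation(edges, pos, seed_opinions):
    """edges: (head, relation, dependent) triples; pos: word -> coarse POS tag."""
    opinions, aspects = set(seed_opinions), set()
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for head, rel, dep in edges:
            new_opinions, new_aspects = set(), set()
            if rel in ("amod", "nsubj"):
                # amod: noun is head, adjective is dependent; nsubj: the reverse.
                noun, adj = (head, dep) if rel == "amod" else (dep, head)
                if pos.get(noun) == "NOUN" and pos.get(adj) == "ADJ":
                    if adj in opinions:   # known opinion word -> its target is an aspect
                        new_aspects.add(noun)
                    if noun in aspects:   # known aspect -> its modifier is an opinion word
                        new_opinions.add(adj)
            elif rel == "conj":
                pair = {head, dep}
                if pair & opinions and all(pos.get(w) == "ADJ" for w in pair):
                    new_opinions |= pair  # conjoined adjectives share opinion status
                if pair & aspects and all(pos.get(w) == "NOUN" for w in pair):
                    new_aspects |= pair   # conjoined nouns share aspect status
            if not new_opinions <= opinions or not new_aspects <= aspects:
                opinions |= new_opinions
                aspects |= new_aspects
                changed = True
    return aspects, opinions


edges = [
    ("camera", "amod", "sharp"),    # "a sharp camera" (invented)
    ("screen", "amod", "good"),     # "This phone has good screen"
    ("bright", "nsubj", "screen"),  # "The screen is bright and sharp" (invented)
    ("bright", "conj", "sharp"),
]
pos = {"camera": "NOUN", "screen": "NOUN", "good": "ADJ", "bright": "ADJ", "sharp": "ADJ"}
aspects, opinions = double_propagation(edges, pos, {"good"})
print(sorted(aspects), sorted(opinions))
```

Starting from the single seed "good", the first pass finds "screen", then "bright" and "sharp"; the second pass uses "sharp" to find "camera". That back-and-forth between the two sets is the "double" in double propagation.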

Rules from dependency grammar
[Figure]

Explicit and implicit aspects (Hu and Liu, 2004)
• Explicit aspects: Aspects explicitly mentioned as nouns or noun phrases in a sentence
  ◦ “The picture quality of this phone is great.”
• Implicit aspects: Aspects not explicitly mentioned in a sentence but implied
  ◦ “This car is so expensive.”
  ◦ “This phone will not easily fit in a pocket.”
  ◦ “Included 16MB is stingy.”
• Some work has been done (Su et al. 2009; Hai et al 2011)

(3) Using supervised learning
• Using sequence labeling methods such as
  ◦ Hidden Markov Models (HMM) (Jin and Ho, 2009)
  ◦ Conditional Random Fields (Jakob and Gurevych, 2010).
  ◦ Other supervised or partially supervised learning.
• Recently, there have been a large number of deep learning-based approaches.

(4) Topic Modeling
• Aspect extraction has two tasks:
  ◦ (1) extract aspect expressions
  ◦ (2) cluster them (same: “picture,” “photo,” “image”)
• Topic models such as pLSA (Hofmann 1999) and LDA (Blei et al 2003) perform both tasks at the same time. A topic is basically an aspect.
  ◦ A document is a distribution over topics
  ◦ A topic is a distribution over terms/words, e.g.,
    ▪ {price, cost, cheap, expensive, …}
  ◦ Ranked based on probabilities (not shown).

Many Related Models and Papers
• Use topic models to model aspects.
• Jointly model both aspects and sentiments
• Knowledge-based modeling: Unsupervised models are often insufficient
  ◦ Not producing coherent topics/aspects
  ◦ To tackle the problem, knowledge-based topic models have been proposed
    ▪ Guided by user-specified prior domain knowledge.
    ▪ Seed terms or constraints

Aspect sentiment classification
• For each aspect, identify the sentiment about it
• Work based on sentences, but also consider,
  ◦ A sentence can have multiple aspects with different opinions.
  ◦ E.g., The battery life and picture quality are great (+), but the viewfinder is small (-).
• Almost all approaches make use of opinion words and phrases. But notice:
  ◦ Some opinion words have context independent orientations, e.g., “good” and “bad” (almost)
  ◦ Some other words have context dependent orientations, e.g., “long,” “quiet,” and “sucks” (+ve for vacuum cleaner)

Aspect sentiment classification
“Apple is doing very well in this poor economy”
• Lexicon-based approach: Opinion words/phrases
  ◦ Parsing: simple sentences, compound sentences, conditional sentences, questions, modality, verb tenses, etc (Liu, 2015).
• Supervised learning is tricky:
  ◦ Feature weighting: consider distance between word and target entity/aspect (e.g., Boiy and Moens, 2009)
  ◦ Use a parse tree to generate a set of target dependent features (e.g., Jiang et al. 2011)
  ◦ New approaches are all based on deep learning.

Sentiment shifters (e.g., Polanyi and Zaenen 2004)
• Sentiment/opinion shifters (also called valence shifters) are words and phrases that can shift or change opinion orientations.
  ◦ Negation words like not, never, cannot, etc., are the most common type.
  ◦ Many other words and phrases can also alter opinion orientations. E.g., modal auxiliary verbs (e.g., would, should, could, etc)
    ▪ “The brake could be improved.”
    ▪ Very complicated, see (Liu, 2015)

Sentiment shifters (contd)
• Some presuppositional items can change opinions too, e.g., barely and hardly
  ◦ “It hardly works.” (comparing to “it works”)
  ◦ It presupposes that better was expected.
• Words like fail, omit, neglect behave similarly,
  ◦ “This camera fails to impress me.”
• Sarcasm changes orientation too
  ◦ “What a great car, it did not start the first day.”
• Jia, Yu and Meng (2009) designed some rules based on parsing to find the scope of negation.
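A lexicon-based scorer with shifter handling can be sketched as follows. The fixed look-back window is a crude stand-in for the parse-based scope rules mentioned above, and the tiny word lists, the window size, and the function name `sentence_polarity` are assumptions for illustration. Sarcasm and modal verbs are not handled.

```python
POSITIVE = {"good", "great", "cool", "nice", "works", "impress"}
NEGATIVE = {"bad", "poor", "terrible", "expensive"}
SHIFTERS = {"not", "never", "cannot", "hardly", "barely", "fails"}


def sentence_polarity(tokens, window=3):
    """Sum word polarities, flipping a word when a shifter occurs in the
    `window` tokens immediately before it."""
    score = 0
    for i, tok in enumerate(tokens):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)  # +1, -1, or 0
        if polarity and any(t in SHIFTERS for t in tokens[max(0, i - window):i]):
            polarity = -polarity
        score += polarity
    return score


for text in ("it works", "it hardly works", "this camera fails to impress me"):
    print(text, "->", sentence_polarity(text.split()))
```

The sarcastic example from the slide would still be scored as positive by this sketch, which is exactly why shifters and sarcasm are listed as hard cases.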

Basic rules of opinions (Liu, 2010; 2012)
• Opinions/sentiments are governed by many rules, e.g., (many such rules)
  ◦ Opinion word or phrase: “I love this car”
      P ::= a positive opinion word or phrase
      N ::= a negative opinion word or phrase
  ◦ Desirable or undesirable facts: “After my wife and I slept on it for two weeks, I noticed a mountain in the middle of the mattress”
      P ::= desirable fact
      N ::= undesirable fact

Basic rules of opinions
  ◦ High, low, increased and decreased quantity of a positive potential item (PPI) or negative potential item (NPI): “The battery life is long.”
      P   ::= no, low, less or decreased quantity of NPI
            | large, larger, or increased quantity of PPI
      N   ::= no, low, less, or decreased quantity of PPI
            | large, larger, or increased quantity of NPI
      NPI ::= a negative potential item
      PPI ::= a positive potential item
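The PPI/NPI rule amounts to a two-by-two lookup. The sketch below encodes only that rule; the function name `quantity_orientation` and the two-valued encoding of quantity ("low" for no/low/less/decreased, "high" for large/larger/increased) are assumptions made for illustration.

```python
def quantity_orientation(item, quantity):
    """item: "PPI" or "NPI"; quantity: "low" or "high". Returns "P" or "N"."""
    if item not in ("PPI", "NPI") or quantity not in ("low", "high"):
        raise ValueError("item must be PPI/NPI and quantity must be low/high")
    # P ::= low quantity of NPI | high quantity of PPI; everything else is N.
    positive = (item, quantity) in {("NPI", "low"), ("PPI", "high")}
    return "P" if positive else "N"


# "The battery life is long.": battery life is a PPI and "long" is a high quantity.
print(quantity_orientation("PPI", "high"))
```

Deciding whether a given aspect is a PPI or an NPI, and whether a word expresses a high or low quantity, is the hard part and is outside this sketch.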

Basic rules of opinions
  ◦ Decreased and increased quantity of an opinionated item: “This drug reduced my pain significantly.”
      P ::= less or decreased N
          | more or increased P
      N ::= less or decreased P
          | more or increased N
  ◦ Deviation from the desired value range: “This drug increased my blood pressure to 200.”
      P ::= within the desired value range
      N ::= above or below the desired value range

Basic rules of opinions
  ◦ Producing and consuming resources and wastes: “This washer uses a lot of water”
      P ::= produce a large quantity of or more resource
          | produce no, little or less waste
          | consume no, little or less resource
          | consume a large quantity of or more waste
      N ::= produce no, little or less resource
          | produce some or more waste
          | consume a large quantity of or more resource
          | consume no, little or less waste

Comparative Opinions (Jindal and Liu, 2006)
• Gradable
  ◦ Non-Equal Gradable: Relations of the type greater or less than
    ▪ “The sound of phone A is better than that of phone B”
  ◦ Equative: Relations of the type equal to
    ▪ “Camera A and camera B both come in 7MP”
  ◦ Superlative: Relations of the type greater or less than all others
    ▪ “Camera A is the cheapest in market”

Analyzing Comparative Opinions
• Objective: Given an opinionated document d, extract comparative opinions: (E1, E2, A, po, h, t), where
    E1 and E2: entity sets being compared
    A: their shared aspects, on which the comparison is based
    po: preferred entity set
    h: opinion holder
    t: time when the comparative opinion is posted.
• Note: not positive or negative opinions.

An example
• Consider the comparative sentence
  ◦ “Canon’s optics is better than those of Sony and Nikon.”
  ◦ Written by John in 2010.
• The extracted comparative opinion/relation:
  ◦ ({Canon}, {Sony, Nikon}, {optics}, preferred: {Canon}, John, 2010)

Common comparatives
• In English, comparatives are usually formed by adding -er and superlatives are formed by adding -est to their base adjectives and adverbs
• Adjectives and adverbs with two syllables or more and not ending in y do not form comparatives or superlatives by adding -er or -est.
  ◦ Instead, more, most, less, and least are used before such words, e.g., more beautiful.
• Irregular comparatives and superlatives, i.e., more, most, less, least, better, best, worse, worst, etc

Some techniques
• Identify comparative sentences
  ◦ Supervised learning
• Extraction of different items
  ◦ Conditional random fields (CRF)
  ◦ Deep learning based methods
• Determine preferred entities (opinions)
  ◦ Lexicon-based methods: Parsing and opinion lexicon
  ◦ Deep learning based approach

Analysis of comparative opinions
• Gradable comparative sentences can be dealt with almost as any normal opinion sentences.
  ◦ E.g., “optics of camera A is better than that of camera B”
  ◦ Positive: (camera A, optics)
  ◦ Negative: (camera B, optics)
• Difficulty: recognize non-standard comparatives
  ◦ E.g., “I am so happy because my new iPhone is nothing like my old slow ugly Droid.”

Identifying preferred entities
• The following rules can be applied
    Comparative Negative ::= increasing comparative N
                           | decreasing comparative P
    Comparative Positive ::= increasing comparative P
                           | decreasing comparative N
  ◦ E.g., “Coke tastes better than Pepsi”
  ◦ “Nokia phone’s battery life is longer than Moto phone”
• Context-dependent comparative opinion words
  ◦ Using context pair: (aspect, JJ/JJR)
  ◦ Deciding the polarity of (battery_life, longer) in a corpus
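The two rules above can be turned into a small lookup that picks the preferred entity. The sketch assumes a sentence of the form "E1 <comparative> than E2" with no negation, and assumes that a Comparative Positive favors E1 while a Comparative Negative favors E2. The function names and the drug example are hypothetical.

```python
def comparative_orientation(direction, base):
    """direction: "increasing" or "decreasing"; base: "P" or "N".
    Returns "P" (Comparative Positive) or "N" (Comparative Negative)."""
    if direction == "increasing":
        return base                      # more of P stays P, more of N stays N
    return "P" if base == "N" else "N"   # less of N becomes P, less of P becomes N


def preferred(entity1, entity2, direction, base):
    """Pick the preferred entity in "entity1 <comparative> than entity2"."""
    if comparative_orientation(direction, base) == "P":
        return entity1
    return entity2


# "Coke tastes better than Pepsi": "better" is an increasing comparative of a P word.
print(preferred("Coke", "Pepsi", "increasing", "P"))
# Invented example: "Drug A causes less pain than drug B" (decreasing comparative of N).
print(preferred("Drug A", "Drug B", "decreasing", "N"))
```

For context-dependent words such as "longer", the base orientation would come from the (aspect, JJ/JJR) context pair rather than from a fixed lexicon.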

Sentiment (or opinion) lexicon
• Sentiment lexicon: lists of words and phrases used to express people’s subjective feelings and sentiments/opinions.
  ◦ Positive: beautiful, wonderful, good, amazing, …
  ◦ Negative: bad, poor, terrible, cost an arm and leg, …
  ◦ They are instrumental for sentiment analysis.
• Three main ways to compile such lists:
  ◦ Manual approach: not a bad idea, only a one-time effort
  ◦ Corpus-based approach
  ◦ Dictionary-based approach

Corpus-based approaches
• Sentiment consistency: Given a set of seed sentiment words (e.g., good, beautiful, bad, terrible) and a domain corpus, use conventions on connectives to identify opinion words (Hatzivassiloglou and McKeown, 1997). E.g.,
  ◦ Conjunction: conjoined adjectives usually have the same orientation.
    ▪ E.g., “This car is beautiful and spacious.” (conjunction)
  ◦ AND, OR, BUT, EITHER-OR, and NEITHER-NOR have similar constraints.
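The conjunction constraint can be sketched as label propagation over conjoined adjective pairs. This is a toy, not the method of the cited paper (which learns from a large corpus and clusters a word graph): it assumes the pairs have already been extracted, treats "but" as signalling opposite orientation and every other connective as signalling the same orientation, and uses invented example pairs.

```python
def propagate(seeds, conjoined):
    """seeds: word -> +1/-1; conjoined: (adj1, connective, adj2) triples from a corpus."""
    lexicon = dict(seeds)
    changed = True
    while changed:  # keep passing over the pairs until nothing new is labeled
        changed = False
        for a, connective, b in conjoined:
            sign = -1 if connective == "but" else 1  # "but" flips the orientation
            for known, new in ((a, b), (b, a)):
                if known in lexicon and new not in lexicon:
                    lexicon[new] = sign * lexicon[known]
                    changed = True
    return lexicon


pairs = [
    ("spacious", "but", "noisy"),      # invented: "spacious but noisy"
    ("beautiful", "and", "spacious"),  # "This car is beautiful and spacious."
]
lexicon = propagate({"beautiful": 1}, pairs)
print(lexicon)
```

Words reached only through conflicting evidence would need a voting or clustering step, which this sketch leaves out.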

Context dependent opinion
• Finding domain opinion words is insufficient. A word may indicate different opinions in the same domain.
  ◦ “The battery life is long” (+) and “It takes a long time to focus” (-).
• Ding, Liu and Yu (2008) exploited sentiment consistency (both inter and intra sentence) based on contexts
  ◦ It finds context dependent opinions.
  ◦ Context: (adjective, aspect), e.g., (long, battery_life)
  ◦ It assigns an opinion orientation/polarity to the pair.

The Double Propagation method (Qiu et al 2009, 2011)
• The same DP method can also use dependency of opinions & aspects to extract new opinion words.
• Based on dependency relations
  ◦ Knowing an aspect can find the opinion word that modifies it
    ▪ E.g., “The rooms are spacious”
  ◦ Knowing some opinion words can find more opinion words
    ▪ E.g., “The rooms are spacious and beautiful”

Opinions implied by objective terms
• Most sentiment/opinion words are “subjective words,” e.g., good, bad, hate, and love.
• But objective nouns can imply opinions too.
  ◦ E.g., “After sleeping on the mattress for one month, a valley/body impression has formed in the middle.”
• Resource usage descriptions may also imply opinions (as mentioned in rules of opinions)
  ◦ E.g., “This washer uses a lot of water.”

Dictionary-based methods
• Typically use WordNet’s synsets and hierarchies to acquire opinion words
  ◦ Start with a small seed set of sentiment/opinion words.
  ◦ Bootstrap the set by searching for synonyms and antonyms in WordNet iteratively (Hu and Liu, 2004; Kim and Hovy, 2004; Kamps et al 2004).
• Use additional information (e.g., glosses) from WordNet (Andreevskaia and Bergler, 2006) and learning (Esuli and Sebastiani, 2005). Dragut et al (2010) use a set of rules to infer orientations.
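The bootstrapping step can be sketched with a toy thesaurus standing in for WordNet. The two dictionaries below are invented and do not reflect real WordNet entries; a real implementation would query WordNet synsets instead. Synonyms inherit the orientation of the word they were reached from and antonyms receive the opposite one.

```python
from collections import deque

# Toy synonym/antonym tables (assumptions, not WordNet data).
SYNONYMS = {"good": ["nice", "fine"], "nice": ["pleasant"], "bad": ["poor"]}
ANTONYMS = {"good": ["bad"], "pleasant": ["unpleasant"]}


def bootstrap(seeds):
    """seeds: word -> +1/-1. Expands the lexicon breadth-first via synonyms and antonyms."""
    lexicon = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        for table, sign in ((SYNONYMS, 1), (ANTONYMS, -1)):
            for other in table.get(word, []):
                if other not in lexicon:  # first label wins; no conflict resolution
                    lexicon[other] = sign * lexicon[word]
                    queue.append(other)
    return lexicon


lexicon = bootstrap({"good": 1})
print(lexicon)
```

Because errors propagate along synonym chains, lists produced this way are usually cleaned up by hand or by an additional learning step.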

Some interesting sentences
• “Trying out Chrome because Firefox keeps crashing.”
  ◦ Firefox: negative; no opinion about Chrome.
  ◦ We need to segment the sentence into clauses to decide that “crashing” only applies to Firefox(?).
• But how about these
  ◦ “I changed to Audi because BMW is so expensive.”
  ◦ “I did not buy BMW because of the high price.”
  ◦ “I am so happy that my iPhone is nothing like my old ugly Droid.”

Some interesting sentences (contd)
• These two sentences are from paint reviews.
  ◦ “For paintX, one coat can cover the wood color.”
  ◦ “For paintY, we need three coats to cover the wood color.”
  ◦ We know that paintX is good and paintY is not, but how would a system know?
• “My goal is to get a tv with good picture quality”
• “The top of the picture was brighter than the bottom.”
• “When I first got the airbed a couple of weeks ago it was wonderful as all new things are, however as the weeks progressed I liked it less and less.”

Some interesting sentences (contd)
• Conditional sentences are hard to deal with (Narayanan et al. 2009)
  ◦ “If I can find a good camera, I will buy it.”
  ◦ But conditional sentences can have opinions
    ▪ “If you are looking for a good phone, buy Nokia”
• Questions are also hard to handle
  ◦ “Are there any great perks for employees?”
  ◦ “Any idea how to fix this lousy Sony camera?”

Some interesting sentences (contd)
• Sarcastic sentences
  ◦ “What a great car, it stopped working on the second day.”
• Sarcastic sentences are common in political blogs, comments and discussions.
  ◦ They make political opinions difficult to handle

Opinion mining is hard!
• “This past Saturday, I bought a Nokia phone and my girlfriend bought a Motorola phone with Bluetooth. We called each other when we got home. The voice on my phone was not so clear, worse than my previous Samsung phone. The battery life was short too. My girlfriend was quite happy with her phone. I wanted a phone with good sound quality. So my purchase was a real disappointment. I returned the phone yesterday.”

Summary
• This chapter introduced the problem of sentiment analysis or opinion mining.
• It is a fascinating NLP or text mining problem.
  ◦ Every sub-problem is highly challenging.
  ◦ But it is also restricted (semantically).
  ◦ A very active research area in NLP.
• Despite the challenges, applications are flourishing!
  ◦ Useful to almost every organization and individual.