The US 2010 Elections on Twitter Avishay Livne

The US 2010 Elections on Twitter
Avishay Livne, Matt Simmons, Abe Gong, Eytan Adar, Lada Adamic
University of Michigan
Political Networks 2011

Agenda
  • Background & Data
  • Network analysis
  • Content analysis
  • Combined analysis

Anatomy of a Tweet
  • reply/mention
  • Broadcast to followers
  • 140 characters
  • (clickable) hashtags

Data
  • 687 manually filtered users, 4,429 edges.
    – 339 Democrats, 348 Republicans, 95 Tea Party
    – 50% of candidates (House, Senate, gubernatorial)
  • 460,038 tweets over 2 years + 233,296 URLs.

Usage Analysis
                Democrats   Republicans   Tea Party
tweets          632         961           1258
tweets/day      2.66        2.97          5.21
[label lost]    40          52.3          82.6
[label lost]    172.6       260.5         472.7
#hashtags       196         404           753
#/tweet         0.37        0.54          0.68
One of the two unlabeled rows is "replies"; the other row label did not survive extraction.

Graph Analysis
Edge(user1, user2) = user1 follows user2
[Figure: follower graph; legend: Democrats, Republicans, Tea Party]

Graph Analysis
[Figure: follower graph; legend: Democrats, Republicans, Tea Party]
             Democratic   Republican   Tea Party   Republican + TP
Density      0.007        0.032        0.020       0.025
In-degree    2.55         8.37         1.82        8.97
Supports previous studies [Adamic & Glance '05]
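
The density and in-degree figures above can be reproduced in spirit with a few lines of Python. This is a minimal sketch, assuming (the slide does not say so) that density is the number of within-group follow edges divided by n(n-1) and that in-degree is the mean number of within-group followers. Under that reading, density is roughly mean in-degree / (n-1); for the 339 Democrats, 2.55 / 338 ≈ 0.0075, which is close to the reported 0.007. The function name and the toy edges are illustrative only.

```python
def density_and_mean_in_degree(nodes, edges):
    """Return (density, mean in-degree) of the directed subgraph induced by `nodes`.

    `edges` is an iterable of (follower, followed) pairs. Only edges whose
    two endpoints are both in `nodes` are counted; self-loops are ignored.
    """
    members = set(nodes)
    n = len(members)
    internal = {(u, v) for u, v in edges
                if u in members and v in members and u != v}
    m = len(internal)
    density = m / (n * (n - 1)) if n > 1 else 0.0
    mean_in_degree = m / n if n else 0.0
    return density, mean_in_degree


# Toy data, not taken from the study: "x" is outside the group.
toy_edges = [("a", "b"), ("b", "a"), ("c", "a"), ("d", "a"), ("x", "a")]
density, in_degree = density_and_mean_in_degree(["a", "b", "c", "d"], toy_edges)
print(density, in_degree)
```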

KL Divergence
Extracting meaningful terms = the terms that contribute the most to distinguishing the user’s LM
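
One common way to rank terms by their contribution to a KL divergence is to look at each summand of KL(P || Q) = Σ P(w) log(P(w) / Q(w)). The sketch below does this, assuming P is a user's language model and Q is some reference model (for example the whole corpus). The slide does not specify the reference model or the log base, so both are assumptions here, and the toy distributions are invented.

```python
import math


def kl_contributions(p_user, q_ref):
    """Per-term summands p * log2(p / q) of KL(p_user || q_ref).

    Assumes q_ref assigns non-zero probability to every term with
    p_user > 0 (e.g. because the reference model is smoothed).
    """
    return {term: p * math.log2(p / q_ref[term])
            for term, p in p_user.items() if p > 0}


def top_divergent_terms(p_user, q_ref, k=3):
    """Terms with the largest positive contribution to the divergence."""
    contrib = kl_contributions(p_user, q_ref)
    return sorted(contrib, key=contrib.get, reverse=True)[:k]


# Toy distributions, not from the study.
p_user = {"jobs": 0.5, "health": 0.3, "the": 0.2}
q_ref = {"jobs": 0.1, "health": 0.3, "the": 0.6}
print(top_divergent_terms(p_user, q_ref, k=2))
```

Terms the user employs about as often as the reference model contribute nothing, while over-used terms rise to the top.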

Divergent terms
Democrat        Republican                  Tea Party
education       spending                    barney_frank
jobs            bills                       conservative
oil_spill       budget                      tea_party
clean_energy    wsj (wall street journal)   clinton
afghanistan     bush                        nancy_pelosi
reform          deficit                     obamacare

Party Cohesiveness
[Figure: pairs within party (directly linked or not)]

Sentiment Analysis
We determined the sentiment of each document by summing the sentiment value of each term.
  • “great event in san diego with @carlyforca - i hope all my fellow veterans will support her!”  +7
    – great = 3, hope = 2, support = 2
  • “@briandubie wrong again. your most recent claims are outrageous. please stop distorting the truth. #vtgov #vt”  -4
  • “#hcr will kill small businesses and the jobs they create, 60% of americans knew that but @repmaryjokilroy still doesn’t”  -3
Sentiment attached to a term = the average sentiment of all documents in which the term appears.
  – Nielsen, F. Å. 2011. AFINN. www2.imm.dtu.dk/pubdb/views/publication_details.php?id=6010
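
The two steps described above (score a document by summing term values, then score a term by averaging over the documents that contain it) can be sketched as follows. The lexicon here is a three-word stand-in containing only the scores quoted on the slide; the real AFINN list is much larger. The tokenizer is an assumption, since the slides do not describe one.

```python
import re
from collections import defaultdict

# Stand-in lexicon: only the three scores quoted on the slide.
LEXICON = {"great": 3, "hope": 2, "support": 2}


def tokenize(text):
    """Lowercase and split into word-like tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9_#@']+", text.lower())


def doc_sentiment(text, lexicon=LEXICON):
    """Sum the lexicon value of every token; unknown tokens count as 0."""
    return sum(lexicon.get(tok, 0) for tok in tokenize(text))


def term_sentiment(docs, lexicon=LEXICON):
    """Average document score over all documents in which each term appears."""
    totals, counts = defaultdict(int), defaultdict(int)
    for text in docs:
        score = doc_sentiment(text, lexicon)
        for tok in set(tokenize(text)):
            totals[tok] += score
            counts[tok] += 1
    return {tok: totals[tok] / counts[tok] for tok in totals}


tweet = ("great event in san diego with @carlyforca - i hope all my "
         "fellow veterans will support her!")
print(doc_sentiment(tweet))  # 7, matching the slide's first example
```

With the full word list, negative words would pull scores below zero in the same way, as in the -4 and -3 examples.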

Sentiment Analysis
Column groups: Positive (+), Negative (-); parties: Democratic, Republican, Tea-Party
Terms (in extracted order; the column assignment did not survive): november food counties trillion care_reform receive food laws tax_cuts food rating in_november tax_cuts stimulus bush followers his_campaign twitter the_gop tax_cuts department care_bill followers efforts politicians this_country control #followfriday friends goal bush budget fight_for discussion kids students republicans administration budget friends in_november followers social_security democrats democratic our_country family his_campaign budget los_angeles spending twitter women questions reduce recovery stimulus values goal tweets defense bush harry_reid labor our_country team administration #ocra the_government women fellow mention tea_party social_security spend this_election marco this_election commerce obama politicians veterans crist sarah truth harry_reid recovery

Agenda
  • Background & Data
  • Network analysis
  • Content analysis
  • Combined analysis

Graph and Language Space
[Figure: LM Distance vs. Graph Distance (pairs)]
  • Degrees of influence?
  • Micro-communities?

Predicting election results
                             Beta         Significance
Intercept                    -5.931       ***
Tweets                       -0.000827    ***
KL divergence from corpus    -0.252       **
Incumbent                    1.597        **
Republican                   1.374        **
Tea Party                    0.605
Closeness (all)              23.820       ***
Closeness (out)              -76.750      ***
Same-party incumbent         1.931        ***
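
To show how coefficients like these turn into a prediction, here is a minimal sketch that assumes the model is a logistic regression on a win/lose outcome. The slides do not name the model; the z-statistics and accuracy figures reported elsewhere in the deck merely suggest a binary-outcome model. The dictionary keys are hypothetical variable names, the scaling of each input is unknown, and the example feature values are invented.

```python
import math

# Coefficients as reported on the slide; key names are hypothetical.
COEF = {
    "tweets": -0.000827,
    "kl_corpus": -0.252,
    "incumbent": 1.597,
    "republican": 1.374,
    "tea_party": 0.605,
    "closeness_all": 23.820,
    "closeness_out": -76.750,
    "same_party_incumbent": 1.931,
}
INTERCEPT = -5.931


def win_probability(features):
    """Logistic link applied to the linear predictor (assumed model form)."""
    z = INTERCEPT + sum(COEF[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Invented feature values, for illustration only.
challenger = {"tweets": 500, "republican": 1, "closeness_all": 0.3}
incumbent = dict(challenger, incumbent=1)
print(win_probability(challenger), win_probability(incumbent))
```

Because the incumbency coefficient is positive, the second probability is always higher than the first when all other inputs are equal.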

Summary
  • Twitter is prevalent in campaigns
  • Tea Party more aggressive and Twitter-savvy
  • Republicans more aligned
  • Tea Party and Republicans more cohesive
  • Substantial language use differences (topics, sentiments)
  • Network and language are correlated
  • Network and language predict outcomes

Future Work
  • More extensive sentiment analysis
  • Ideal points?
  • Better prediction models
  • Integrating constituents
  • Discourse analysis
  • Primaries
  • …

Thanks
More information in the paper
Contact info: avishay@umich.edu, agong@umich.edu
Thank you!

Twitter is a leading microblogging service with almost 200M users. It has become an integral part of any (political) campaign.

Language Modeling 101
Probability distribution over terms (uni- & bi-grams)
  Document 1 = “health care transparency fight continue”
  Document 2 = “health care increase bill”
  P(health|user) = 2/9
  P(fight|user) = 1/9
  P(canada|user) = 0/9

Language Models 101
LM (language model): probability distribution over terms (N-grams)
  doc1 = “health care transparency fight continue”
  doc2 = “health care increase bill”
Unsmoothed:
  P(health|user) = 2/9
  P(fight|user) = 1/9
  P(canada|user) = 0/9
Smoothed:
  P(health|user) = 0.99 * 2/9 + 0.01 * P(health)
  P(fight|user) = 0.99 * 1/9 + 0.01 * P(fight)
  P(canada|user) = 0.99 * 0 + 0.01 * P(canada)
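
The estimates above can be checked with a short sketch. It builds a unigram model from the two example documents and mixes it with a background model using the 0.99 / 0.01 weights from the slide. The background probabilities P(term) are not given in the slides, so the `corpus` dictionary below is invented, and bigrams are left out for brevity.

```python
from collections import Counter


def user_lm(docs):
    """Maximum-likelihood unigram model over all of a user's documents."""
    counts = Counter(tok for doc in docs for tok in doc.split())
    total = sum(counts.values())
    return {term: c / total for term, c in counts.items()}


def smoothed_prob(term, user_model, corpus_model, lam=0.99):
    """Linear interpolation: lam * P(term|user) + (1 - lam) * P(term)."""
    return (lam * user_model.get(term, 0.0)
            + (1 - lam) * corpus_model.get(term, 0.0))


docs = ["health care transparency fight continue", "health care increase bill"]
lm = user_lm(docs)

# Hypothetical background probabilities; not from the study.
corpus = {"health": 0.02, "fight": 0.01, "canada": 0.005}

for term in ("health", "fight", "canada"):
    print(term, lm.get(term, 0.0), smoothed_prob(term, lm, corpus))
```

The point of the mixing step is that a term the user never wrote, such as "canada", still receives a small non-zero probability, which keeps quantities like KL divergence finite.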

Twitter in politics
  • Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment – Tumasjan et al. [ICWSM 2010]
    – Using % of tweets mentioning party P to predict % of votes.
    – German federal elections 2009.
  • On the other hand, Germany used an octopus (Paul the Octopus) to predict World Cup '10 results, correctly predicting 8/8 matches.

Hashtag Analysis
Top Hashtags (#hashtag, occurrences, unique users)
Democrat           Rep-TP              Tea Party
p2, 4564, 96       tcot, 13347, 169    tcot, 11482, 70
tcot, 3403, 38     gop, 3929, 125      teaparty, 4419, 52
nvsen, 2471, 3     fb, 3882, 45        ar02, 3762, 2
fb, 1232, 32       nrcc, 2091, 29      alaska, 2372, 1
hcr, 1176, 82      hcr, 1772, 110      gop, 2262, 60
  • p2 – “Progressives 2.0”
  • fb – “Facebook”
  • hcr – “Health Care Reform”
  • nvsen – “Nevada Senator”
  • ar02 – “Arkansas District #2”
  • tcot – “Top Conservatives on Twitter”
  • gop – “Grand Old Party”
  • nrcc – “National Republican Congressional Committee”
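
Triples of the form (hashtag, occurrences, unique users) can be produced with a simple counting pass. The sketch below is one way to do it, assuming tweets are available as (user, text) pairs; the input format, the regular expression and the toy tweets are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

HASHTAG = re.compile(r"#(\w+)")


def hashtag_stats(tweets):
    """Return [(hashtag, occurrences, unique_users)], most frequent first.

    `tweets` is an iterable of (user, text) pairs.
    """
    occurrences = Counter()
    users = defaultdict(set)
    for user, text in tweets:
        for tag in HASHTAG.findall(text.lower()):
            occurrences[tag] += 1
            users[tag].add(user)
    return [(tag, n, len(users[tag])) for tag, n in occurrences.most_common()]


# Toy tweets, not from the data set.
toy_tweets = [
    ("u1", "#tcot #gop rally"),
    ("u1", "more #TCOT"),
    ("u2", "#tcot tonight"),
    ("u2", "#hcr vote"),
]
print(hashtag_stats(toy_tweets))
```

Reporting unique users next to raw counts helps separate widely used tags from tags driven by one or two accounts, such as the slide's ar02 (3762 occurrences from 2 users).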

Topic Modeling (LDA)
  • We extracted latent topics using LDA [Blei, Ng, Jordan 2002], which outputs the affinity of each document with each topic.
  • By averaging over a candidate’s documents, we determined the affinity of the candidate with each topic.
  • By averaging over candidates, we determined the affinity of a party with each topic.
  • A “topic” is defined as a distribution over the terms in the corpus (how related each term is to the topic).
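
The two averaging steps above can be sketched without any topic-modeling library, assuming the per-document topic affinities have already been produced by an LDA implementation. The data layout (a dict from candidate to a list of per-document affinity vectors), the function names and the toy numbers are all assumptions for illustration.

```python
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise mean of equally long vectors."""
    vectors = list(vectors)
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def candidate_affinity(doc_topics):
    """Average each candidate's document-topic vectors."""
    return {cand: mean_vector(vecs) for cand, vecs in doc_topics.items()}


def party_affinity(cand_affinity, party_of):
    """Average candidate-level vectors within each party."""
    by_party = defaultdict(list)
    for cand, vec in cand_affinity.items():
        by_party[party_of[cand]].append(vec)
    return {party: mean_vector(vecs) for party, vecs in by_party.items()}


# Toy two-topic affinities, not from the study.
doc_topics = {
    "alice": [[0.8, 0.2], [0.6, 0.4]],
    "bob": [[0.2, 0.8]],
    "carol": [[0.5, 0.5]],
}
party_of = {"alice": "D", "bob": "R", "carol": "R"}

cands = candidate_affinity(doc_topics)
print(party_affinity(cands, party_of))
```

Averaging per candidate first means a prolific tweeter does not dominate the party-level figure.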

Topics
Topic terms                                         Affinity Difference
tax, jobs, spending, [O]bama, stimulus              -0.047618
health, care, bill, house, reform                   -0.032136
tcot, barney, teaparty, [Sean Bielat], twisters     -0.020878
live, show, interview, radio, fox                   -0.018375
posted, facebook, photos, video, check              -0.014608
ff, great, followfriday, twitter, followers         -0.012113
obama, people, dont, good, government               -0.010277
great, county, meeting, day, tonight                -0.007769
campaign, tcot, twitter, facebook, support          -0.007624
john, david, ad, [P]elosi, [Sharron A]ngle          -0.002737
vote, endorsement, [H]armer, ca10, candidate        -0.001998
change, view, changed, committee, energy            0.002625
great, day, parade, good, time                      0.002746
ar02, ar2, [T]im [Griffin], vote, join              0.003104
[O]bama, oil, president, hearing, bp                0.007417
day, happy, great, women, honor                     0.018366
vote, day, early, election, voting                  0.022653
bill, house, voted, senate, reform                  0.028481

Predicting Election Results - Individual Variables
Variable         Estimate
same_party       2.67
incumbent        3.163
indegree         0.252
closeness_all    486.7
kl-corpus        -0.281
pagerank         486.7
closeness_in     1017.2
authority        0.442
republican       0.976
teaparty         -0.277
retweets         -0.00113
hashtags         -0.00016
tweets           -0.00022
replies          -0.00026
closeness_out    -20.9682
outdegree        0.023
kl-party         -0.047
Prob(>|z|) (values as extracted; row alignment not recoverable): <0.0001, <0.0001, 0.38, 0.15, 0.11, 0.08, 0.1, <0.05
Accuracy (values as extracted; row alignment not recoverable): 78.9%, 76.9%, 74.6%, 73.5%, 66.7%, 66.4%, 64.7%, 63.8%, 61.0%, 58.4%, 58.1%, 57.8%, 57.5%, 55.9%

Mixing It
Models Comparison
Name (as extracted; row boundaries unclear): All but kl-corpus No content No graph & content
Variables:
  • tweets, kl-corpus, incumbent, party, closeness_all, closeness_out, same_party
  • tweets, corpus, incumbent, same_party, closeness_all, closeness_out
  • incumbent, party, same_party, closeness_all, closeness_out
  • tweets, kl-corpus, incumbent, party, same_party
Accuracy: 88.0%, 85.5%, 84.0%, 83.8%, 81.5%