Twitter Analysis of Presidential Candidates 2016

Posted on Aug 7, 2016


The 2016 Presidential Election is fast approaching, and President Barack Obama’s second term is about to end. The Republican and Democratic parties have nominated their candidates for president, setting up a contest between Donald Trump and Hillary Clinton. Public sentiment plays an important role in deciding who becomes the next leader of the United States. Both candidates have large followings on Twitter: Clinton has 9.21 million followers and Trump has 11.9 million. For this Shiny project, I developed an application that analyzes, in real time, tweets made by the presidential candidates themselves and tweets directed at the candidates by the general public. The application reports the sentiment of the tweets and the words used most frequently in them. Sentiment toward the candidates fluctuates quickly as interviews, debates, responses to global events, and other issues occur.

The application is divided into two sections. The first focuses on tweets made by the candidates, and the second focuses on tweets directed at the candidates. A user can search tweets by date and choose the number of tweets to view. The app analyzes the sentiment of the retrieved tweets using the NRC Emotion Lexicon, which is explained later. It also displays the words most commonly used by the candidates and the general public in a Word Cloud, and shows the number of tweets the candidates posted on each calendar date. Similarly, it lists the users who tweeted the most about the candidates. On September 26th, 2016, the first Presidential Debate was held at Hofstra University in New York, and this app was used to analyze the tweets and people’s reactions after the debate.


 Fig: Table displaying all the retrieved tweets

Section 1: Tweets made by Presidential Candidates (Trump vs Clinton)

Both candidates have a Twitter account: Trump’s handle is “realDonaldTrump” and Clinton’s is “HillaryClinton”. Due to the limits of Twitter’s API, the application could access at most 1084 recent tweets for Trump and only 158 for Clinton. Therefore, to keep the comparison fair, the 150 most recent tweets from each candidate were analyzed. They were gathered on September 28, 2016, two days after the first Presidential Debate.

1.1 Word Cloud

A Word Cloud is a text mining method that highlights the most frequently used keywords in a body of text; in our case, the most frequently used words in the tweets. The more often a word is used, the more it stands out in the word cloud. The tweet texts are loaded using the Corpus function, and they then need to be cleaned and transformed. Only the top 200 most frequent words are displayed, and each word must occur at least three times.
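The app itself is written in R, so the following is only a rough Python sketch of the cleaning and filtering steps described above, not the app's actual code. The function names, the tiny stop-word list, and the sample tweets are all hypothetical; only the two thresholds (at most 200 words, each occurring at least three times) come from the text.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); real text-mining
# packages ship much longer lists.
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "in", "is",
              "for", "on", "we", "will", "you", "rt"}


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, and non-letters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop @mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # drop digits and punctuation
    return text


def word_frequencies(tweets, max_words=200, min_freq=3):
    """Return up to max_words (word, count) pairs with count >= min_freq."""
    counts = Counter()
    for tweet in tweets:
        words = clean_tweet(tweet).split()
        counts.update(w for w in words if w not in STOP_WORDS)
    frequent = [(w, n) for w, n in counts.most_common() if n >= min_freq]
    return frequent[:max_words]


# Hypothetical sample tweets, made up for this example.
sample_tweets = [
    "We will make America great again! https://example.com",
    "Great rally in Ohio. America is ready.",
    "Thank you @supporter, America! Great crowd.",
    "Jobs, jobs, jobs.",
]
top_words = word_frequencies(sample_tweets)
print(top_words)
```

The resulting (word, count) pairs are what a word cloud renderer would consume, scaling each word by its count.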

Fig: Word Cloud for Donald Trump (left) and Hillary Clinton (right)

Comparing the two word clouds, we can see the words each candidate used most often. The higher the frequency of a word, the larger it appears in the word cloud.

1.2 Sentiment Analysis

For sentiment analysis, I used the “get_nrc_sentiment” function from the Syuzhet package in R. This function implements the NRC Emotion Lexicon, which was developed by Dr. Saif Mohammad and his team. The NRC Emotion Lexicon is very popular and has been widely used for sentiment analysis. It consists of a list of words, each associated with one or more of eight emotions (“anger”, “anticipation”, “disgust”, “fear”, “joy”, “sadness”, “surprise” and “trust”) and two sentiments (“negative” and “positive”).
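To make the idea of lexicon-based scoring concrete, here is a minimal Python sketch of how word-to-category counting can work. It is not the Syuzhet implementation, and the four-word lexicon below is a hypothetical toy, not actual NRC entries; only the ten category names come from the text.

```python
# The eight emotions and two sentiments named above.
CATEGORIES = ["anger", "anticipation", "disgust", "fear", "joy",
              "sadness", "surprise", "trust", "negative", "positive"]

# Hypothetical miniature lexicon for illustration only;
# these are NOT the real NRC Emotion Lexicon entries.
TOY_LEXICON = {
    "win": {"joy", "anticipation", "positive"},
    "disaster": {"fear", "sadness", "negative"},
    "honest": {"trust", "positive"},
    "crooked": {"anger", "disgust", "negative"},
}


def score_tweet(text, lexicon=TOY_LEXICON):
    """Count, per category, how many words in the tweet map to it."""
    scores = {category: 0 for category in CATEGORIES}
    for token in text.lower().split():
        word = token.strip(".,!?\"'#")  # trim surrounding punctuation
        for category in lexicon.get(word, ()):
            scores[category] += 1
    return scores


def total_scores(tweets, lexicon=TOY_LEXICON):
    """Sum the per-tweet category counts over a collection of tweets."""
    totals = {category: 0 for category in CATEGORIES}
    for tweet in tweets:
        for category, count in score_tweet(tweet, lexicon).items():
            totals[category] += count
    return totals


# Hypothetical sample tweets, made up for this example.
sample_tweets = ["We will win!", "What a disaster, so crooked."]
totals = total_scores(sample_tweets)
print(totals)
```

Summed totals of this kind are what a bar chart of sentiment frequencies would display, one bar per category.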


Fig: Sentiment Analysis of Tweets of Donald Trump


Fig: Sentiment Analysis of Tweets of Hillary Clinton

Analyzing the candidates’ tweets, we can see that a high frequency of positive sentiment is common to both. However, the frequency of negative sentiment is much higher for Trump than for Clinton. This suggests that tweets made by Donald Trump were more negative than those from Hillary Clinton.

1.3 Tweet Calendar

The tweet calendar shows how many of the retrieved tweets were posted on each day of the calendar. Shades of blue indicate the number of tweets posted on a given day, with darker shades depicting higher numbers. In the figure below, we can see that both candidates tweeted the most on the day after the first presidential debate, which was held on September 26th.


Fig: Tweet Calendar of Donald Trump


Fig: Tweet Calendar of Hillary Clinton
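The per-day counts behind a calendar view like this can be sketched in a few lines of Python. This is an illustration under assumptions, not the app's R code: the function name is hypothetical, timestamps are assumed to be ISO-8601 strings, and the sample values are made up rather than taken from the retrieved tweets.

```python
from collections import Counter
from datetime import datetime


def tweets_per_day(timestamps):
    """Count tweets per calendar date from ISO-8601 timestamp strings."""
    days = (datetime.fromisoformat(ts).date() for ts in timestamps)
    return Counter(days)


# Hypothetical timestamps, made up for this example.
sample_timestamps = [
    "2016-09-26T21:05:00",
    "2016-09-27T08:15:00",
    "2016-09-27T13:40:00",
    "2016-09-27T22:10:00",
    "2016-09-28T09:00:00",
]
daily_counts = tweets_per_day(sample_timestamps)
busiest_day, busiest_count = daily_counts.most_common(1)[0]
print(busiest_day, busiest_count)
```

A calendar heatmap would then map each day's count to a shade, with larger counts drawn darker.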

Section 2: People’s tweets about Presidential Candidates (Trump vs Clinton)

In this section, we analyze the tweets the general public directed at the presidential candidates. Donald Trump and Hillary Clinton faced off for the first time in the first Presidential Debate, clashing over policies and attacking each other on the issues. The tweets were analyzed after the debate to see how people reacted to each candidate. For the analysis, more than 3000 tweets mentioning Trump and Clinton were taken into consideration.

2.1 Word Cloud

Looking at both word clouds, we can see that “debate” was the most frequently used word, since this was the first debate between Trump and Clinton. Other commonly used words are also highlighted: they appear larger and in colors other than dark green.

Fig: Word Cloud for Donald Trump (left) and Hillary Clinton (right)

2.2 Sentiment Analysis

Looking at the sentiment analysis plots, it was surprising to see that people had more negative than positive reactions toward the candidates. Hillary Clinton and Donald Trump are both historically unpopular, but large numbers of Americans who can’t stand them will likely vote for one of them anyway, choosing what they consider to be the lesser of two evils. In this light, the sentiment analysis plots are simply a reflection of the candidates’ low popularity among left-leaning, right-leaning, and neutral groups.


Fig: Sentiment Analysis of Tweets about Donald Trump



Fig: Sentiment Analysis of Tweets about Hillary Clinton

2.3 Users who tweeted most about the candidates

The application also displays the top ten users who tweeted most often about the presidential candidates in the retrieved tweets.


Fig: Users who tweeted most about Donald Trump


Fig: Users who tweeted most about Hillary Clinton
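As a rough illustration of how such a top-ten ranking can be computed, here is a short Python sketch. It is not the app's R code: the function name, the `user` field, and the handles are hypothetical.

```python
from collections import Counter


def top_users(tweets, n=10):
    """Return the n users with the most tweets as (user, count) pairs."""
    return Counter(tweet["user"] for tweet in tweets).most_common(n)


# Hypothetical retrieved tweets; handles and text are made up.
sample_tweets = [
    {"user": "user_a", "text": "Watching the debate"},
    {"user": "user_b", "text": "Debate night"},
    {"user": "user_a", "text": "Strong answer on jobs"},
    {"user": "user_c", "text": "Who won?"},
    {"user": "user_a", "text": "Closing statements now"},
    {"user": "user_b", "text": "That was tense"},
]
ranking = top_users(sample_tweets)
print(ranking)
```

With fewer than ten distinct users, as in this sample, the ranking simply contains every user.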


Twitter has played an important role in this election campaign. It has given a platform not only to the presidential candidates but also to the people to express their views. It is a new form of personal marketing that has allowed the candidates to engage with followers and make emotional connections with voters, especially younger audiences. The app revealed that the sentiment of the tweets changed over time. Hillary Clinton consistently seemed to be more positive in her tweets than Donald Trump. After analyzing tweets from the general public following the first Presidential Debate, it was surprising to see that the sentiment toward both Hillary Clinton and Donald Trump was more negative than positive. However, sentiment tends to fluctuate quickly as the candidates move along the road to Election Day on November 8. This app is still under construction, and more features will be added in the future. Stay tuned.

Github Links:



About Author

Samriddhi Shakya

Samriddhi comes from a Remote Sensing and Geographic Information Systems (GIS) background. He has a Master’s degree in Geography from Auburn University and a Bachelor of Engineering degree in Geomatics from Kathmandu University. During his Master’s at Auburn University,...