Twitter Analysis of Presidential Candidates 2016
Introduction
The 2016 Presidential Election is fast approaching and President Barack Obama’s second term is about to come to an end. The Republican and Democratic parties have each nominated their candidate for president, which sees Donald Trump battling Hillary Clinton. Public sentiment plays an important role in influencing who becomes the future leader of the United States. Both candidates have large followings on Twitter, with Clinton having 9.21 million followers to Trump’s 11.9 million. For this Shiny project, I developed an application which analyzes, in real time, tweets made by the Presidential candidates themselves and tweets by the general public directed at the candidates, to gain some insight. The application provides information about the sentiment and the frequently used words in the tweets. Sentiment towards the candidates fluctuates quickly as interviews, debates, responses to global events, and other issues occur.
This application is divided into two sections. The first section focuses on the tweets made by the candidates, while the second focuses on tweets directed at the candidates. The app allows a user to search the tweets by date and to choose the number of tweets to view. It analyzes the sentiment of the retrieved tweets using the NRC Emotion Lexicon, which is explained later. It also displays the words most commonly used by the candidates and the general public in a Word Cloud, and shows the number of tweets posted by the candidates on each calendar date. Similarly, the app gives information about the users who tweeted the most about the candidates. On September 26th 2016, the first Presidential Debate was held at Hofstra University in New York. This app analyzed the tweets and the reaction of the people after the debate.
Fig: Table displaying all the retrieved tweets
Section 1: Tweets made by Presidential Candidates (Trump vs Clinton)
Both candidates have a Twitter account, with Trump’s handle being “realDonaldTrump” and Clinton’s being “HillaryClinton”. Because of the limits of Twitter’s API, the application could retrieve at most 1084 recent tweets for Trump and 158 for Clinton. Therefore, to keep the analysis fair, the 150 most recent tweets from each candidate were analyzed. They were gathered on September 28, 2016, two days after the first Presidential Debate was held.
1.1 Word Cloud
A Word Cloud is a powerful text mining method that highlights the most frequently used keywords in a body of text. In our case, it highlights the most frequently used words in the tweets, which stand out more clearly in a word cloud than in a table of counts. The tweet texts are loaded using the Corpus function, and they then need to be cleaned and transformed. Only the top 200 most frequent words are displayed, and each word must occur at least three times.
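The counting step behind such a word cloud can be sketched as follows. This is a minimal, standalone Python illustration, not the app's R code: the cleaning rules, the abbreviated stop-word list, the function names, and the sample tweets are all assumptions made for the example. Only the two thresholds (at most 200 words, each occurring at least three times) come from the description above.

```python
import re
from collections import Counter

# Hypothetical, abbreviated stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "in", "is", "for", "on", "we", "will"}


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, and non-letter characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop @mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # drop digits and punctuation
    return text


def word_frequencies(tweets, min_freq=3, max_words=200):
    """Return (word, count) pairs, most frequent first, after filtering."""
    counts = Counter()
    for tweet in tweets:
        words = clean_tweet(tweet).split()
        counts.update(w for w in words if w not in STOP_WORDS)
    # Keep only words that occur at least min_freq times, then cap the list.
    frequent = [(w, n) for w, n in counts.most_common() if n >= min_freq]
    return frequent[:max_words]


# Made-up sample tweets, not taken from either candidate.
sample_tweets = [
    "Great debate tonight! https://example.com",
    "The debate is about jobs, jobs, jobs.",
    "@someone We will bring jobs back",
    "Debate prep done",
]

for word, count in word_frequencies(sample_tweets):
    print(word, count)
```

The resulting pairs are what a word cloud renderer would consume, scaling each word's size by its count.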
Fig: Word Cloud for Donald Trump (left) and Hillary Clinton (right)
Comparing the two word clouds, we can see the words most commonly used by each candidate. The higher the frequency of a word, the larger it appears in the word cloud.
1.2 Sentiment Analysis
For sentiment analysis, I used the “get_nrc_sentiment” function from the Syuzhet package in R. This function implements the NRC Emotion Lexicon, which was developed by Dr. Saif Mohammad and his team. The NRC Emotion Lexicon is very popular and has been widely used for sentiment analysis. It consists of a list of words, each associated with one or more of eight emotions ("anger", "anticipation", "disgust", "fear", "joy", "sadness", "surprise" and "trust") and two additional sentiments (“negative” and “positive”).
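To illustrate how a word-emotion lexicon of this kind turns tweets into category counts, here is a minimal Python sketch. It is not the Syuzhet implementation and does not use the real lexicon: the four entries in TOY_LEXICON are invented for this example, as are the function name and the sample tweets. Only the ten category names come from the description above.

```python
import re

CATEGORIES = ["anger", "anticipation", "disgust", "fear", "joy",
              "sadness", "surprise", "trust", "negative", "positive"]

# Toy lexicon invented for this sketch; these are NOT actual NRC entries.
TOY_LEXICON = {
    "great": {"joy", "positive"},
    "win": {"anticipation", "joy", "positive"},
    "disaster": {"fear", "sadness", "negative"},
    "liar": {"anger", "disgust", "negative"},
}


def tally_sentiment(tweets, lexicon):
    """Count, per category, how many lexicon words appear across the tweets."""
    totals = {category: 0 for category in CATEGORIES}
    for tweet in tweets:
        for word in re.findall(r"[a-z]+", tweet.lower()):
            # A word may belong to several categories; words not in the
            # lexicon contribute nothing.
            for category in lexicon.get(word, ()):
                totals[category] += 1
    return totals


# Made-up sample tweets.
sample_tweets = ["We will win, it will be great", "What a disaster"]
totals = tally_sentiment(sample_tweets, TOY_LEXICON)
for category in CATEGORIES:
    print(category, totals[category])
```

Totals of this shape, one number per category, are what a bar chart of sentiment would plot.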
Fig: Sentiment Analysis of Tweets of Donald Trump
Fig: Sentiment Analysis of Tweets of Hillary Clinton
Analyzing the tweets of the candidates, we can see that positive sentiment has the highest frequency for both. However, the frequency of negative sentiment is much higher for Trump than for Clinton. This suggests that the tweets made by Donald Trump were more negative than those made by Hillary Clinton.
1.3 Tweet Calendar
The tweet calendar feature shows how many of the retrieved tweets were posted on each day of the calendar. The shades of blue show the number of tweets posted on a particular day, with darker shades depicting higher numbers of tweets. In the figure below, we can see that both candidates tweeted the most on the day after the first Presidential Debate, which was held on September 26th.
Fig: Tweet Calendar of Donald Trump
Fig: Tweet Calendar of Hillary Clinton
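The per-day counts behind a calendar like this can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the app's code: the timestamp format, the function names, the linear shading rule, and the sample timestamps are all made up for the example.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp format for this sketch.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def tweets_per_day(timestamps):
    """Count how many tweets fall on each calendar date."""
    days = [datetime.strptime(ts, TIMESTAMP_FORMAT).date() for ts in timestamps]
    return Counter(days)


def shade_intensity(counts):
    """Map each day's count to a 0..1 intensity; the busiest day gets 1.0."""
    busiest = max(counts.values())
    return {day: n / busiest for day, n in counts.items()}


# Made-up timestamps, not real tweet data.
sample_timestamps = [
    "2016-09-26 21:05:00",
    "2016-09-27 08:15:00",
    "2016-09-27 09:30:00",
    "2016-09-27 14:00:00",
    "2016-09-28 10:00:00",
]

counts = tweets_per_day(sample_timestamps)
shades = shade_intensity(counts)
for day in sorted(counts):
    print(day, counts[day], round(shades[day], 2))
```

A darker cell then simply corresponds to an intensity closer to 1.0.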
Section 2: People’s tweets about Presidential Candidates (Trump vs Clinton)
In this section, we analyze the tweets directed at the Presidential candidates by the general public. Donald Trump and Hillary Clinton faced off for the first time in the first Presidential Debate, clashing over policies and attacking each other on the issues. The tweets were analyzed after the debate to see how people reacted towards each candidate. For the analysis, more than 3000 tweets mentioning Trump and Clinton were taken into consideration.
2.1 Word Cloud
Looking at both word clouds, we can see that “debate” was the most frequently used word, since this was the first debate between Trump and Clinton. The other commonly used words are also highlighted: they appear larger and in colors other than dark green.
Fig: Word Cloud for Donald Trump (left) and Hillary Clinton (right)
2.2 Sentiment Analysis
Looking at the sentiment analysis plot, it was surprising to see that people had more negative than positive reactions towards the candidates. Hillary Clinton and Donald Trump are both historically unpopular, but large numbers of Americans who can't stand them will likely vote for one of them anyway by choosing what they consider to be the lesser of two evils. In this light, the sentiment analysis plot is simply a reflection of their low popularity numbers among the left, right and neutral groups.
Fig: Sentiment Analysis of Tweets about Donald Trump
Fig: Sentiment Analysis of Tweets about Hillary Clinton
2.3 Users who tweeted most about the candidates
The application also displays the top ten users who tweeted most often about the presidential candidates in the retrieved tweets.
Fig: Users who tweeted most about Donald Trump
Fig: Users who tweeted most about Hillary Clinton
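Ranking the most active users amounts to counting tweets per account and keeping the ten largest counts. The Python sketch below shows the idea; the "screen_name" field, the function name, and the sample records are hypothetical and do not reflect the app's actual data structures.

```python
from collections import Counter


def top_users(tweets, n=10):
    """Return the n accounts with the most tweets as (screen_name, count) pairs."""
    counts = Counter(tweet["screen_name"] for tweet in tweets)
    return counts.most_common(n)


# Made-up records; the field name "screen_name" is an assumption.
sample_tweets = [
    {"screen_name": "user_a", "text": "tweet 1"},
    {"screen_name": "user_b", "text": "tweet 2"},
    {"screen_name": "user_a", "text": "tweet 3"},
    {"screen_name": "user_c", "text": "tweet 4"},
    {"screen_name": "user_a", "text": "tweet 5"},
    {"screen_name": "user_b", "text": "tweet 6"},
]

for name, count in top_users(sample_tweets):
    print(name, count)
```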
Conclusion
Twitter has played an important role in this election campaign. It has given a platform, not only to the presidential candidates, but also to the people, to express their views. It is a new form of personal marketing which has allowed the candidates to engage with followers and make emotional connections with voters, especially younger audiences. The app revealed that the sentiment of the tweets changed with time. Hillary Clinton consistently seemed to be more positive in her tweets than Donald Trump. After analyzing tweets from the general public following the first Presidential Debate, it was surprising to see that the sentiment toward both Hillary Clinton and Donald Trump was more negative than positive. However, sentiment tends to fluctuate quickly as the candidates move along the road to Election Day on November 8th. This app is still under construction and more features will be added in the future. Stay tuned.
GitHub link: https://github.com/sam648/TwitterAnalysis