Using Data to Analyze Stock Sentiment
Today, a wealth of financial information is available to every investor at the click of a button. Understanding that information and acting on it is what sets a good investor apart. With the help of data science, we will try to gain an edge in a business where one is very hard to find.
For this project we will scrape finviz.com, a website dedicated to stock information and news. Finviz aggregates headlines from trusted financial news outlets, which tend to use consistent jargon. Consistent textual patterns matter because they improve our sentiment analysis. We will analyze TSLA, the ticker for the famous electric car company Tesla.
The goal is to generate investing insight by applying sentiment analysis to financial news headlines. With the help of this natural language processing (NLP) technique, we will try to understand the emotion behind the headlines and gauge how the market feels about a particular stock. This would make it possible to make educated guesses about how certain stocks will perform and to trade accordingly.
First, we will scrape the news data from Finviz with the help of the BeautifulSoup and requests modules. Beautiful Soup helps us pull particular content from a webpage, strip the HTML markup, and save the information. The code requests the page for each ticker in our list, parses the HTML table of news on that page, and gathers the recent headlines for that ticker.
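The parsing step can be sketched as follows. This is a minimal illustration, not the exact code behind this project: the sample markup, the `news-table` id, the timestamp layout, and the `parse_news_table` helper are all assumptions made for the example, and the live page may be structured differently. In practice the HTML string would come from the requests module rather than from a hard-coded sample.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded quote page.
# The table id and the date/time layout are assumptions for this sketch.
SAMPLE_HTML = """
<table id="news-table">
  <tr><td>Jan-19-21 09:30AM</td><td><a href="#">Tesla shares climb after upgrade</a></td></tr>
  <tr><td>10:15AM</td><td><a href="#">Analysts warn of Tesla valuation risk</a></td></tr>
  <tr><td>Jan-20-21 08:05AM</td><td><a href="#">Tesla expands deliveries in China</a></td></tr>
</table>
"""


def parse_news_table(html, ticker):
    """Return a list of (ticker, date, time, headline) tuples."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find(id="news-table")
    rows = []
    if table is None:
        return rows

    current_date = None
    for tr in table.find_all("tr"):
        cell = tr.find("td")
        link = tr.find("a")
        if cell is None or link is None:
            continue  # skip rows without a timestamp or a headline link
        stamp = cell.get_text(strip=True).split()
        if not stamp:
            continue
        if len(stamp) == 2:
            # First headline of a day carries both the date and the time.
            current_date, time = stamp
        else:
            # Later headlines on the same day show only the time.
            time = stamp[0]
        rows.append((ticker, current_date, time, link.get_text(strip=True)))
    return rows


parsed_news = parse_news_table(SAMPLE_HTML, "TSLA")
for row in parsed_news:
    print(row)
```

Carrying the last seen date forward keeps every headline tied to a full timestamp, which matters later when the scores are grouped by day.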
Sentiment analysis is extremely sensitive to context. The algorithm can misread a sentence because it cannot always tell which sense of a word is intended from the surrounding context. As mentioned above, consistent jargon is key, which is why scraping headlines written specifically by financial journalists is crucial.
VADER (Valence Aware Dictionary and sEntiment Reasoner) is a model within the NLTK (Natural Language Toolkit) module used for text sentiment analysis that is sensitive to both the polarity and the intensity of emotion. When we analyze the headlines, our focus is only on whether the opinion is positive, negative, or neutral. The sentiment score of a text is obtained by summing the intensity (valence) score of each word in the text and normalizing the sum to a compound score between -1 and 1.
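To make the scoring idea concrete, here is a toy sketch of lexicon-based scoring. It is not VADER itself: the tiny `TOY_LEXICON` and its values are invented for illustration, and the real model also applies rules for negation, intensifiers, punctuation, and capitalization. The normalization (dividing the sum by the square root of the squared sum plus a constant of 15) and the 0.05 cutoff for labeling follow the conventions documented for VADER. With NLTK, the real compound score comes from `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` once the VADER lexicon has been downloaded.

```python
import math

# Hypothetical mini-lexicon for illustration; real VADER ships
# thousands of human-rated entries with their own values.
TOY_LEXICON = {
    "surge": 2.0,
    "record": 1.5,
    "beat": 1.5,
    "recall": -1.5,
    "lawsuit": -2.0,
    "crash": -2.5,
}


def compound_score(headline, lexicon=TOY_LEXICON, alpha=15):
    """Sum per-word valence and squash the total into (-1, 1)."""
    words = (w.strip(".,!?:;") for w in headline.lower().split())
    total = sum(lexicon.get(w, 0.0) for w in words)
    return total / math.sqrt(total * total + alpha)


def label(score, threshold=0.05):
    """Map a compound score to positive, negative, or neutral."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


for text in [
    "Tesla shares surge to record",
    "Tesla faces lawsuit over recall",
    "Tesla opens new store",
]:
    score = compound_score(text)
    print(f"{score:+.3f}  {label(score):8s}  {text}")
```

The normalization keeps long headlines with many rated words from producing unbounded scores, so headlines of different lengths stay comparable.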
Once we have the scores, we can start plotting the results against the time series of the TSLA stock. Tesla is considered a volatile stock and, thanks to its eccentric CEO, is regularly in the news. We can see the week start with negative sentiment and reverse to positive towards the end. The scores coincide with the stock price, which opened at a low of $837 on 1/19/21 and rose as high as $854 on 1/21/21.
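Before plotting, the per-headline scores need to be collapsed to one value per day. A minimal sketch of that aggregation follows, using only the standard library. The scores below are made-up placeholder values, not results from this analysis, and `daily_mean_sentiment` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Made-up (date, compound score) pairs for illustration only;
# they are not the scores from the analysis described here.
scored_headlines = [
    ("2021-01-19", -0.40),
    ("2021-01-19", -0.10),
    ("2021-01-20", 0.05),
    ("2021-01-21", 0.30),
    ("2021-01-21", 0.50),
]


def daily_mean_sentiment(scored):
    """Average the compound scores of all headlines published each day."""
    by_day = defaultdict(list)
    for day, score in scored:
        by_day[day].append(score)
    # ISO-formatted dates sort chronologically as plain strings.
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


daily = daily_mean_sentiment(scored_headlines)
for day, avg in daily.items():
    print(day, f"{avg:+.2f}")
```

The resulting date-to-score mapping can be passed to any plotting library as a bar or line chart and set alongside the daily price series for comparison.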
Focusing on a single trading day allows us to see how the sentiment fluctuates throughout the day. For this specific trading day, the sentiment was fairly neutral.
Generating investing insight by applying sentiment analysis is one of many ways data science is being used in the financial world. As we saw above, gaining the slightest edge in information can help us make sounder investment decisions.
The skills I demonstrated here can be learned by taking the Data Science with Machine Learning bootcamp at NYC Data Science Academy.