Using Data to Analyze Stock Sentiment

Posted on Jan 24, 2021

Today, a wealth of financial information is available to every investor with a single click. Knowing how to act on that information is what sets a good investor apart. With the help of data science, we will try to gain an edge in a business where edges are very hard to find.

For this project we will be scraping Finviz, a website dedicated to stock information and news. Finviz aggregates headlines from trusted news sources that use consistent jargon. Consistency in textual patterns is important because it improves our sentiment analysis. We will be analyzing TSLA, the ticker for the famous electric car company Tesla.

The goal is to generate investing insight by applying sentiment analysis to financial news headlines. With the help of this natural language processing (NLP) technique, we will try to understand the emotion behind the headlines and predict how the market feels about a particular stock. This would make it possible to make educated guesses about how certain stocks will perform and to trade accordingly.


First, we will scrape the news data from Finviz with the help of the Beautiful Soup and requests modules. Beautiful Soup will help us pull particular content from a webpage, remove the HTML markup, and save the information. The code requests the page for each ticker in our list, parses the HTML table of news, and gathers the recent headlines for that ticker.
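The scraping step described above could look roughly like the sketch below. It is not the original code from this post. The helper names `parse_news_table` and `fetch_news_html`, the table id `news-table`, the URL pattern, and the timestamp layout (a date on the first row of each day, then time only) are all assumptions about Finviz's markup, which may have changed. The sample HTML contains placeholder headlines, not real ones.

```python
from bs4 import BeautifulSoup


def parse_news_table(html):
    """Return (date, time, headline) tuples from a Finviz-style news table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find(id="news-table")  # assumed id of the news table
    if table is None:
        return []

    rows = []
    current_date = None
    for tr in table.find_all("tr"):
        cell = tr.find("td")   # first cell: timestamp
        link = tr.find("a")    # headline link
        if cell is None or link is None:
            continue
        stamp = cell.get_text(strip=True).split()
        if len(stamp) == 2:      # "Jan-22-21 09:30PM": a new day starts
            current_date, time_text = stamp
        elif len(stamp) == 1:    # "08:15PM": same day as the row above
            time_text = stamp[0]
        else:
            continue
        rows.append((current_date, time_text, link.get_text(strip=True)))
    return rows


def fetch_news_html(ticker):
    """Download the quote page for one ticker (assumed URL pattern)."""
    import requests  # imported here so the parser above also works offline

    url = f"https://finviz.com/quote.ashx?t={ticker}"
    # Some sites reject requests that lack a browser-like User-Agent.
    response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    response.raise_for_status()
    return response.text


# Placeholder markup for illustration only; these are not real headlines.
SAMPLE_HTML = """
<table id="news-table">
  <tr><td>Jan-22-21 09:30PM</td><td><a href="#">Placeholder headline one</a></td></tr>
  <tr><td>08:15PM</td><td><a href="#">Placeholder headline two</a></td></tr>
  <tr><td>Jan-21-21 10:00AM</td><td><a href="#">Placeholder headline three</a></td></tr>
</table>
"""

sample_rows = parse_news_table(SAMPLE_HTML)
```

For a live run, `parse_news_table(fetch_news_html("TSLA"))` would be called once per ticker in the list.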


Sentiment analysis is extremely sensitive to context. The algorithm can misread a sentence because it cannot always tell apart the different contexts in which a word is used. As mentioned above, consistent jargon is a key component, which is why scraping headlines written specifically by financial journalists is crucial.

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a model within the NLTK (Natural Language Toolkit) module used for text sentiment analysis that is sensitive to both the polarity and the intensity of emotion. When we analyze the headlines, our focus is only on whether the opinion is positive, negative, or neutral. The sentiment score of a text is obtained by summing the intensity of each word in the text, and VADER normalizes that sum into a compound score between -1 and 1.
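As a minimal sketch of how headlines might be scored with VADER, the code below uses NLTK's `SentimentIntensityAnalyzer` and its `polarity_scores` method, which returns a dictionary containing a `compound` score. The helper names `label_sentiment`, `score_headlines`, and `build_vader_analyzer` are hypothetical. The cutoff of 0.05 is a commonly used convention for turning the compound score into a label, not something taken from this post.

```python
def label_sentiment(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a coarse label (threshold is an assumption)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_headlines(headlines, analyzer):
    """Return (headline, compound, label) for each headline.

    `analyzer` is any object with a VADER-style polarity_scores(text) method.
    """
    results = []
    for text in headlines:
        scores = analyzer.polarity_scores(text)  # keys: neg, neu, pos, compound
        compound = scores["compound"]
        results.append((text, compound, label_sentiment(compound)))
    return results


def build_vader_analyzer():
    """Create NLTK's VADER analyzer, downloading the lexicon if it is missing."""
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # needs network access the first time
    return SentimentIntensityAnalyzer()
```

A typical call would be `score_headlines(headlines, build_vader_analyzer())`, where `headlines` is the list of strings collected from the news table.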

Once we have the scores, we can start plotting the results with the time series of the TSLA stock. Tesla is considered a volatile stock and, thanks to its eccentric CEO, is regularly in the news. We can see the week start with negative sentiment but reverse to positive towards the end. The scores coincide with the stock price, which opened at a low of $837 on 1/19/21 and rose as high as $854 on 1/21/21.

TSLA Sentiment
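One way to prepare scores for a chart like this is to average the compound score per calendar day. The sketch below uses only the standard library. The function name `daily_mean_sentiment` and the date format `%b-%d-%y` (for example `Jan-19-21`) are assumptions, and the example values are placeholders rather than real TSLA scores.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_mean_sentiment(scored_rows, date_format="%b-%d-%y"):
    """Average compound scores per day and return them in chronological order.

    `scored_rows` is an iterable of (date_text, compound) pairs.
    """
    by_day = defaultdict(list)
    for date_text, compound in scored_rows:
        day = datetime.strptime(date_text, date_format).date()
        by_day[day].append(compound)
    return [(day, mean(scores)) for day, scores in sorted(by_day.items())]


# Placeholder values for illustration only, not real TSLA sentiment scores.
example_rows = [
    ("Jan-20-21", 0.5),
    ("Jan-19-21", -0.25),
    ("Jan-19-21", -0.75),
    ("Jan-20-21", 0.0),
]
daily = daily_mean_sentiment(example_rows)
```

The resulting list of (date, mean score) pairs can then be handed to whichever plotting library is in use, for example as the x and y values of a bar chart.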

Focusing on a single trading day allows us to see how the sentiment fluctuates throughout the day. For this specific trading day, the sentiment was fairly neutral.

TSLA Sentiment on January 19, 2021


Generating investing insight by applying sentiment analysis is one of many ways data science is being used in the financial world. As we saw above, gaining the slightest edge in information can help us make sounder investment decisions.

The skills I demonstrated here can be learned by taking the Data Science with Machine Learning bootcamp at NYC Data Science Academy.

