Visualizing Fortune 500 Cos.

Posted on Jul 29, 2019


The Fortune 500 is a well-known list published by Fortune magazine each year. It ranks U.S.-based companies by revenue for their respective fiscal years. Using revenue as the main criterion essentially shows us which of these companies are the “largest”. The companies on the list come from a variety of industries, such as insurance, retail, energy, and automobiles. Aside from revenue, the Fortune 500 list includes company assets, profits, market value, number of employees, and a few other categories. While the list may seem to be more “for show” than a source of useful data, there is still a lot of meaning to explore within these numbers.

Before I get to the analysis, it is important to discuss how I scraped the data from the website. For those who are not familiar with web scraping: I wrote a function that “crawls” through the parts of the website I want to extract and exports the information to a CSV file. In this case, I took the numbers from the table found at the following URL: https://fortune.com/fortune500/2019/search/. The function also loops through multiple URLs to collect the same set of data for 2019, 2018, and 2017. I had to use the Python package Selenium rather than the more efficient Scrapy because the site has a “next” button that is clickable but does not change the URL. With Selenium, I could click that button to reach the next 100 rows for each year. Selenium locates each number in the table by its XPath. Because the XPaths for each column follow a regular pattern, the function iterates through the rows and then through the separate chunks of numbers within each row. After the CSV files were created, I used pandas to convert some of the columns to integers and floats so that the graphs would work.
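The type conversion mentioned above can be sketched as follows. This is a minimal illustration and not the code from the repository: the column names, the `to_number` helper, and the sample rows are all made-up placeholders, and I am assuming the scraped cells arrive as strings with dollar signs, commas, and percent signs.

```python
import pandas as pd


def to_number(text):
    """Turn a scraped string such as '$1,200.5' or '-2.1%' into a float.

    Hypothetical helper: empty cells and a lone '-' become NaN.
    """
    if text is None:
        return float("nan")
    cleaned = str(text).replace("$", "").replace(",", "").replace("%", "").strip()
    if cleaned in ("", "-"):
        return float("nan")
    return float(cleaned)


# Placeholder rows in the shape a scraped table might have (not real figures).
raw = pd.DataFrame({
    "company": ["Company A", "Company B", "Company C"],
    "revenue": ["$1,200.5", "$980", "$415.2"],
    "revenue_pct_change": ["6.8%", "-2.1%", "-"],
    "employees": ["2,200", "71", "1,050"],
})

clean = raw.copy()

# Money and percentage columns become floats.
for col in ["revenue", "revenue_pct_change"]:
    clean[col] = clean[col].map(to_number)

# Head counts become integers (this assumes no missing values in the column).
clean["employees"] = clean["employees"].map(to_number).astype(int)

print(clean.dtypes)
```

Keeping the parsing in one small function makes it easy to apply the same rule to every numeric column and to every year's file.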

Here is the link to the code: https://github.com/jdsipala/seleniumProj

Since the main goal of this list is to compare the size of these companies, I decided to take a deeper look at growth and how it relates to other aspects of a business. Naturally, I compared revenue % change with profit to see whether there was a correlation between a business's expansion and its profit. Using Plotly also made the graph somewhat interactive: hovering over a point shows which company it represents. It was interesting to see that the majority of businesses were expanding and profiting simultaneously. The graph also helps identify which direction a company is heading in terms of expansion and profitability.
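Here is a rough sketch of how the growth versus profit comparison could be checked numerically and plotted. It is an illustration under assumptions, not the original analysis: the column names and the four sample rows are invented, and `quadrant` and `make_figure` are hypothetical helpers. The Plotly import sits inside `make_figure` so the numeric part runs even where Plotly is not installed.

```python
import pandas as pd

# Placeholder values, not real Fortune 500 figures.
df = pd.DataFrame({
    "company": ["Company A", "Company B", "Company C", "Company D"],
    "revenue_pct_change": [5.0, -3.0, 12.0, -1.0],
    "profit": [100.0, -20.0, 300.0, 40.0],
})

# Pearson correlation between revenue growth and profit.
r = df["revenue_pct_change"].corr(df["profit"])


def quadrant(row):
    """Label a company by the sign of its revenue growth and its profit."""
    growing = row["revenue_pct_change"] > 0
    profitable = row["profit"] > 0
    if growing and profitable:
        return "expanding and profitable"
    if growing:
        return "expanding, unprofitable"
    if profitable:
        return "shrinking, profitable"
    return "shrinking, unprofitable"


df["quadrant"] = df.apply(quadrant, axis=1)


def make_figure(data):
    """Build an interactive scatter plot; hovering a point shows the company."""
    import plotly.express as px  # lazy import: only needed for plotting

    return px.scatter(
        data,
        x="revenue_pct_change",
        y="profit",
        hover_name="company",
        color="quadrant",
    )


print(round(r, 3))
print(df["quadrant"].value_counts())
```

Calling `make_figure(df).show()` would open the interactive chart. The quadrant counts give a quick way to back up a statement such as "most companies are expanding and profiting at the same time".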

I also looked at the effect that the number of employees has on profit and revenue. Based on these graphs, the number of employees has a more direct relationship with revenue than with profit. These graphs can also show which companies are getting the most out of their workforce on a per-employee basis: look at the direction of a point relative to the origin, where the x and y axes intersect.
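The per-employee reading of those graphs can also be made explicit with a ratio. The direction of a point relative to the origin is set by the ratio of its two coordinates, which here is revenue (or profit) per employee. The sketch below uses made-up rows and assumed column names, with money values taken to be in millions of dollars.

```python
import pandas as pd

# Placeholder values, not real Fortune 500 figures; money in $ millions (assumed).
df = pd.DataFrame({
    "company": ["Company A", "Company B", "Company C"],
    "employees": [1000, 200, 50],
    "revenue": [500.0, 300.0, 10.0],
    "profit": [50.0, 60.0, -5.0],
})

# Ratios that correspond to the direction of each point from the origin.
df["revenue_per_employee"] = df["revenue"] / df["employees"]
df["profit_per_employee"] = df["profit"] / df["employees"]

# Companies that generate the most revenue per head come first.
ranked = df.sort_values("revenue_per_employee", ascending=False)

print(ranked[["company", "revenue_per_employee", "profit_per_employee"]])
```

Sorting by the ratio turns a visual judgment about angles into a ranking that is easy to compare across years.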

The next topic I covered is market value. For those who are interested in investing, this is a key figure to look at. Market value has a lot to do with total revenues and many other measures of a company's finances. There is also the important factor of perception. While profits and revenues might be similar for two companies, their market values can be drastically different based on each company's outlook or the outlook of its industry. The following graphs show the relationship between revenue and market value, as well as between assets and market value.

The final part of my analysis was to look at some of the categories to see whether there are any aggregate trends over time. Using bar graphs, I compared average revenues and average assets across the three years.
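The yearly averages behind such bar graphs could be computed along these lines. This is a sketch under assumptions: in practice each year's table would be read from its own CSV file, whereas here three tiny made-up frames stand in for them, and the column names are placeholders.

```python
import pandas as pd

# Stand-ins for the three yearly tables (placeholder numbers, not real data).
frames = {
    2017: pd.DataFrame({"revenue": [100.0, 200.0], "assets": [400.0, 600.0]}),
    2018: pd.DataFrame({"revenue": [120.0, 220.0], "assets": [450.0, 650.0]}),
    2019: pd.DataFrame({"revenue": [130.0, 250.0], "assets": [500.0, 700.0]}),
}

# Tag every row with its year, then stack the tables into one.
combined = pd.concat(
    [frame.assign(year=year) for year, frame in frames.items()],
    ignore_index=True,
)

# Average revenue and assets for each year.
yearly_means = combined.groupby("year")[["revenue", "assets"]].mean()

print(yearly_means)
```

With matplotlib installed, `yearly_means["revenue"].plot.bar()` would draw one bar per year. Plotting revenue and assets separately keeps each chart on its own scale.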

Figures: Assets by Year (left) and Revenues by Year (right)
