Tennis: Grand Slam Prizes Over the Years Data

Posted on Sep 9, 2019


According to the Tennis Industry Association, 17.9 million people play tennis in the US, and an additional 14 million Americans who do not currently play express an interest in the sport. This growing participation and interest is not limited to Americans; it is a worldwide phenomenon. PledgeSports, the number one crowdfunding site for sports, lists tennis as the sixth most popular sport in the world. Continued participation and interest naturally lead people to follow the professional game and, more specifically, its main events, the Grand Slams. At the time of writing, the top trending story in sports is:

US Open prize money revealed with biggest prize in Grand Slam tennis on offer

To that end, I built an interactive webapp that focuses on the prize money awarded in singles events across the Grand Slams.

Background Information

As an avid player of the sport, I was interested in knowing what the numbers are when athletes compete for the biggest prizes and how they have changed over the years. In addition, equal pay has been a recurring topic of discussion in sports in recent years, so it is of interest to see how the prizes have evolved across the men's and women's events.

This webapp was developed in R Shiny, an R package that makes it easy to visualize analytical findings and to build interactive dashboards. The code can be found in this GitHub repo. This dataset was acquired from and it consists of men's and women's singles winners' prize money for each Grand Slam from 1968 to 2015. For the Australian Open, data is available from 1971 onward, with two caveats: two tournaments were held in 1977, but no women's prize money data is available for that year, and no tournament was held in 1986. Prize money is recorded in local currencies: US Open (USD), Australian Open (AUD), Roland Garros (FRF through 2001, EUR since 2002), and Wimbledon (GBP).
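The currency rules described above can be written down as a small lookup. This is only a sketch in Python, not the R code behind the webapp; the function name `prize_currency` and the exact tournament spellings are assumptions made for illustration.

```python
def prize_currency(tournament: str, year: int) -> str:
    """Return the currency code the dataset uses for a tournament in a given year.

    The mapping follows the description in the post; the function name and
    the tournament spellings are hypothetical.
    """
    # Tournaments whose prize currency never changes in the dataset.
    fixed = {
        "US Open": "USD",
        "Australian Open": "AUD",
        "Wimbledon": "GBP",
    }
    if tournament in fixed:
        return fixed[tournament]
    if tournament == "Roland Garros":
        # French francs through 2001, euros from 2002 onward.
        return "FRF" if year <= 2001 else "EUR"
    raise ValueError(f"unknown tournament: {tournament!r}")
```

For example, `prize_currency("Roland Garros", 2001)` gives `"FRF"`, while the same call with 2002 gives `"EUR"`.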

The top row of the dashboard is a set of info boxes that show static summary statistics from the dataset, including the value of the biggest prize ($3.3 million).

The main visual in the dashboard is an interactive chart with controls for showing and contrasting the prize money across the Grand Slams and across the men's and women's events. Prize money has increased steadily over the years. However, the last 10 years have seen the greatest increase, with the latest numbers showing prize money of $2 million for Wimbledon and the French Open, $3.3 million for the US Open, and $2.13 million for the Australian Open. With regard to equal pay, all events beginning in 2007 have awarded the same prize money in the men's and women's events.
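The equal-pay observation is the kind of claim that can be checked directly against the data. Below is a minimal Python sketch (again, not the webapp's R code) that finds the first year from which every record shows equal men's and women's prizes. The record layout (`year`, `men`, `women` keys) and the function name `parity_since` are assumptions, and the sample values are placeholders rather than real prize figures.

```python
def parity_since(records):
    """Return the earliest year from which all records have equal prizes.

    `records` is a list of dicts with hypothetical keys 'year', 'men', 'women'.
    Returns None if there are no records or the most recent year is unequal.
    """
    years = sorted({r["year"] for r in records})
    if not years:
        return None
    unequal = [r["year"] for r in records if r["men"] != r["women"]]
    if not unequal:
        # Every record is equal, so parity holds from the first year on.
        return years[0]
    last_unequal = max(unequal)
    # Parity starts with the first recorded year after the last unequal one.
    later = [y for y in years if y > last_unequal]
    return later[0] if later else None


# Placeholder numbers for illustration only, not actual prize money.
sample = [
    {"year": 2005, "men": 100, "women": 90},
    {"year": 2006, "men": 110, "women": 105},
    {"year": 2007, "men": 120, "women": 120},
    {"year": 2008, "men": 130, "women": 130},
]
print(parity_since(sample))
```

Running the same check per tournament, rather than on the pooled records, would show when each Grand Slam individually reached parity.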

Future Work

To truly compare prize money across Grand Slams over the years, it would be useful to have historical currency exchange rates so that each award could be reported in a uniform currency. Because the awards are currently reported in local currencies, additional data is needed for a consistent analysis. Moreover, the prize data could be linked to the tournament winners to determine which athletes have earned the most in Grand Slam prize money.
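As a rough illustration of the conversion step, the Python sketch below converts an award into a common currency using a table of yearly exchange rates. It assumes a rate table keyed by `(currency, year)` that would have to come from an external historical source. The function name `to_common_currency`, the table layout, and the rate shown are hypothetical placeholders, not real exchange rates.

```python
def to_common_currency(amount, currency, year, rates, target="USD"):
    """Convert `amount` from `currency` to `target` using that year's rate.

    `rates` maps (currency, year) to the number of target-currency units
    per one unit of `currency`. The layout is an assumption for this sketch.
    """
    if currency == target:
        return float(amount)
    try:
        rate = rates[(currency, year)]
    except KeyError:
        # Fail loudly instead of guessing a rate for a missing year.
        raise KeyError(f"no {currency}->{target} rate for {year}") from None
    return amount * rate


# Placeholder rate for illustration only, not a real historical value.
example_rates = {("GBP", 2015): 1.5}
print(to_common_currency(100, "GBP", 2015, example_rates))
```

Raising an error on a missing rate, rather than silently skipping the record, makes gaps in the exchange-rate data visible before they distort a cross-tournament comparison.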

About Author

Muhammad Ihsanulhaq Sarfraz

Ihsan is an NYC Data Science Academy Fellow currently pursuing his PhD in Computer Engineering from Purdue University with a dissertation on analyzing patterns of learner behaviors in MOOCs. He has a passion for building dashboards and interfaces...