Tennis Player Performance Analysis

Posted on Feb 10, 2020

In every sport, people want to know who the best players are, usually to confirm that their favorite players and teams are among them. The question is how to determine that. My ShinyTennis application examines many statistics over decades of recorded tennis matches to identify the dominant players. The application uses the Tennis Match Charting Project data.

Application Explanation

The application has 5 sections, each examining a different aspect of player performance. They are ordered from short-term to long-term analysis.

Point Data

This section compares 2 players by displaying the average number of points per game and the average number of volleys in each point. Better players are expected to defeat their opponents more quickly, so lower values here are considered better. Andre Agassi had the lowest points per match and volley length per match.
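
As a rough illustration of the two averages described above, here is a minimal Python sketch. It is not the application's own code: the record layout (`match_id`, `player`, `rally_length`) and the toy values are assumptions made up for this example, and it counts points per match rather than per game.

```python
from statistics import mean

# Hypothetical toy records, one per point played (not real match data).
# "rally_length" stands in for what the text calls volleys per point.
points = [
    {"match_id": "m1", "player": "Player A", "rally_length": 4},
    {"match_id": "m1", "player": "Player A", "rally_length": 2},
    {"match_id": "m1", "player": "Player A", "rally_length": 6},
    {"match_id": "m2", "player": "Player A", "rally_length": 4},
]


def point_summary(points, player):
    """Return (average points per match, average rally length) for one player."""
    rows = [p for p in points if p["player"] == player]
    # Count how many points were recorded in each of the player's matches.
    points_per_match = {}
    for p in rows:
        points_per_match[p["match_id"]] = points_per_match.get(p["match_id"], 0) + 1
    avg_points = mean(points_per_match.values())
    avg_rally = mean(p["rally_length"] for p in rows)
    return avg_points, avg_rally


print(point_summary(points, "Player A"))
```

Under the assumption stated in this section, lower values on both measures would count in a player's favor.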


Match Data

This section compares 2 players in terms of the matches they win. It displays win ratios, tournament wins, matches played per year, and a histogram of tournament wins per year. The best players are those with higher win ratios, match participation, and tournament wins. The histogram makes the distribution of tournament wins easier to understand. Roger Federer and Novak Djokovic had the most tournament wins.
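
A per-year win ratio like the one described above could be computed along the following lines. This is only a sketch in Python, not the application's code. It assumes that "win ratio" means wins divided by matches played, and the field names (`year`, `winner`, `loser`) and toy records are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records, one per match (not real results).
matches = [
    {"year": 2018, "winner": "Player A", "loser": "Player B"},
    {"year": 2018, "winner": "Player B", "loser": "Player A"},
    {"year": 2019, "winner": "Player A", "loser": "Player B"},
    {"year": 2019, "winner": "Player A", "loser": "Player C"},
]


def yearly_win_ratio(matches, player):
    """Return {year: wins / matches played} for the given player."""
    wins = defaultdict(int)
    played = defaultdict(int)
    for m in matches:
        if player in (m["winner"], m["loser"]):
            played[m["year"]] += 1
            if m["winner"] == player:
                wins[m["year"]] += 1
    # Only years in which the player appeared are included, so no division by zero.
    return {year: wins[year] / played[year] for year in sorted(played)}


print(yearly_win_ratio(matches, "Player A"))
```

The `played` counts double as the matches-played-per-year figure mentioned in this section.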


Tournament Performance

This section shows which players perform best in the selected tournament. It displays the number of tournament wins and matches played for the top 10 players in that tournament, which lets the user determine which players dominate each individual tournament. Pete Sampras, Roger Federer, and Steffi Graf have all won the US Open 5 times, but Roger Federer has the most matches played.
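
The top-10 ranking for a selected tournament can be sketched with a simple counter. The Python below is an illustration, not the application's code, and the `tournament` and `winner` fields and the toy records are assumptions.

```python
from collections import Counter

# Hypothetical toy records, one per tournament final (not real results).
finals = [
    {"tournament": "Tournament X", "winner": "Player A"},
    {"tournament": "Tournament X", "winner": "Player A"},
    {"tournament": "Tournament X", "winner": "Player B"},
    {"tournament": "Tournament Y", "winner": "Player C"},
]


def top_winners(finals, tournament, n=10):
    """Return up to n (player, titles) pairs for one tournament, most titles first."""
    counts = Counter(f["winner"] for f in finals if f["tournament"] == tournament)
    return counts.most_common(n)


print(top_winners(finals, "Tournament X"))
```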


Newbs vs. Veterans Performance

This section is similar to Match Data, except that it expresses each year relative to the player's start year. This makes it possible to compare players at the same points in their careers. The best players are those who can maintain high win ratios and win rates throughout their careers. Serena Williams has the longest tennis career at 20 years, and her tournament wins have been at their highest over the last 10 years.
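
Normalizing years relative to a player's start year amounts to re-keying each yearly statistic by career year. Here is a minimal Python sketch, not the application's code. It assumes the first recorded year counts as career year 1, and the input values are made up.

```python
def to_career_years(yearly_values):
    """Re-key {calendar_year: value} as {career_year: value}.

    Assumption: the earliest year in the data is career year 1.
    """
    start = min(yearly_values)
    return {year - start + 1: value for year, value in sorted(yearly_values.items())}


# Two hypothetical players who started in different calendar years.
player_a = to_career_years({1995: 0.50, 1996: 0.75})
player_b = to_career_years({2005: 0.25, 2006: 0.50})

# Both series now share the keys 1 and 2, so they can be plotted on one axis.
print(player_a, player_b)
```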


Biggest Rise/Falls Over Time

This section computes the derivatives of win ratios and tournament wins to show at what point in the chosen players' careers their performance changed the most. This makes it easy to see when each player's performance rose or fell most sharply. In 2015, Serena Williams had the biggest drop in win ratio of all players.
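
For yearly data, a derivative can be approximated by the difference between consecutive years. The Python sketch below shows one way to find the biggest drop. It is not the application's code: the function names and input values are hypothetical, and it assumes one value per consecutive year (a gap in the data would be treated as a single step).

```python
def yearly_change(yearly_values):
    """Return {year: value minus the previous recorded year's value}."""
    years = sorted(yearly_values)
    return {
        curr: yearly_values[curr] - yearly_values[prev]
        for prev, curr in zip(years, years[1:])
    }


def biggest_drop(yearly_values):
    """Return (year, change) for the most negative year-over-year change."""
    changes = yearly_change(yearly_values)
    year = min(changes, key=changes.get)
    return year, changes[year]


# Hypothetical win ratios for one player.
win_ratio = {2001: 0.5, 2002: 0.75, 2003: 0.25}
print(yearly_change(win_ratio))
print(biggest_drop(win_ratio))
```

Swapping `min` for `max` in `biggest_drop` would give the biggest rise instead.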


Future Work

If I had more time, I would give the user more control over how the data is presented, for example by allowing them to sort the player combobox in different ways to make it clear which players have the highest values in each plot. This would make it much easier to see who the dominant players are. I could also bring in additional data frames to examine who is winning the most prize money, and the nationalities of tournaments and players, to see how player performance affects these variables.

About Author

Seth Jackson

Seth Jackson is an expert in logic, economics, and philosophy with over 10 years of experience writing software. After getting a BA in Computer Science, he completed NYCDSA's Data Science program in order to obtain insights from data...