NBA 2020 Statistics - Efficiency and Custom Fantasy Values

Posted on Jul 12, 2020

Motivation

I have been playing NBA fantasy basketball for almost two decades. Each year my friends, their friends, and I join one or more leagues and act as general managers of our own teams. At the start of each fantasy league we hold an online draft to pick the players for our teams. We play a format called Head to Head, where your team's accumulated statistics are compared against your opponent's in 9 different categories. You win the weekly matchup if you win at least 5 of the 9 categories. Because of this format, team owners deliberately build their teams to be stronger in some categories than others, depending on each owner's strategy. During a fantasy season, we constantly look at the average statistics of players around the league, both to find trades with opposing teams and to spot players who are available as free agents. Over the years I've built my own cheat sheet, with my own calculations, to see where my team stands compared to everyone else. I used to download my players' statistics every week to update the cheat sheet. Now, through web scraping, I can automate that process and load the values into my cheat sheet automatically.

 

Dataset

The dataset I used was scraped from the basketball-reference website. The site lists the statistics of all 514 NBA players who are active in the 2020 season. The data covers more categories than the ones I use in my fantasy leagues.

 

Calculations

Two extra fields were added to the dataset: Efficiency and A-Score. Efficiency is normally calculated as (PTS + REB + AST + STL + BLK − Missed FG − Missed FT − TO) / GP, but since the scraped data is already expressed as per-game averages, the division by games played is dropped and the formula becomes PTS + REB + AST + STL + BLK − Missed FG − Missed FT − TO. A-Score is calculated by putting a weight on each statistical category. The most important stats to me are steals, blocks, and 3-pointers made, and I also penalize turnovers, so all four of those categories have a weight of 2. Assists and rebounds are harder to get than points, so I put a weight of 1.5 on both. The rest have a weight of 1. This weighting system can be changed based on the team owner's preferences.
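The two calculations can be sketched in a few lines of Python. This is a minimal illustration, not the exact code behind the cheat sheet: the dictionary keys (pts, reb, fg3, and so on) are made-up names, the sample stat line is invented, turnovers are given a weight of -2 on the assumption that "penalize" means they subtract from the score, and "the rest" is assumed to mean points only.

```python
def efficiency(p):
    """Per-game efficiency: positives minus missed shots and turnovers."""
    missed_fg = p["fga"] - p["fg"]   # missed field goals per game
    missed_ft = p["fta"] - p["ft"]   # missed free throws per game
    return (p["pts"] + p["reb"] + p["ast"] + p["stl"] + p["blk"]
            - missed_fg - missed_ft - p["tov"])


# Assumed weights: 2 for steals, blocks and 3-pointers made, -2 for
# turnovers (a penalty), 1.5 for assists and rebounds, 1 for points.
A_SCORE_WEIGHTS = {
    "stl": 2.0, "blk": 2.0, "fg3": 2.0, "tov": -2.0,
    "ast": 1.5, "reb": 1.5, "pts": 1.0,
}


def a_score(p, weights=A_SCORE_WEIGHTS):
    """Weighted sum of per-game categories; pass other weights to re-tune."""
    return sum(weight * p[stat] for stat, weight in weights.items())


# Invented per-game stat line, for illustration only.
player = {
    "pts": 20.0, "reb": 5.0, "ast": 4.0, "stl": 1.0, "blk": 0.5,
    "fg3": 2.0, "fg": 7.0, "fga": 15.0, "ft": 4.0, "fta": 5.0, "tov": 2.0,
}

print(efficiency(player))  # 19.5
print(a_score(player))     # 36.5
```

Keeping the weights in a dictionary means another owner can score the same players with a different build simply by passing a different `weights` argument.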

Scraping Result and Cheatsheet

Here's a screenshot of the scraped data: 

From the main dataset, I kept only the statistical categories used in our fantasy league. Four additional columns are derived from those categories: fg_perc (field goal percentage), ft_perc (free throw percentage), efficiency, and a_score. The reduced dataset is exported to Excel in 3 tabs. The first one, "Full Ratings", has the alphabetical list of the dataset.
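As a rough sketch of the reduction step, the snippet below keeps a subset of columns and derives the two percentage fields from makes and attempts. The key names, the list of kept categories, and the sample row are assumptions for illustration. The efficiency and a_score columns are left out here for brevity.

```python
def pct(made, attempted):
    """Shooting percentage as a fraction, or None when there were no attempts."""
    return round(made / attempted, 3) if attempted else None


# Hypothetical set of per-game categories kept for the fantasy league.
KEPT = ("player", "pts", "reb", "ast", "stl", "blk", "fg3", "tov")


def reduce_row(row):
    """Keep only the fantasy categories and add the derived percentages."""
    reduced = {key: row[key] for key in KEPT}
    reduced["fg_perc"] = pct(row["fg"], row["fga"])
    reduced["ft_perc"] = pct(row["ft"], row["fta"])
    return reduced


# Invented scraped row; "mp" stands in for a column the league does not use.
sample = {
    "player": "Player A", "mp": 34.0,
    "pts": 20.0, "reb": 5.0, "ast": 4.0, "stl": 1.0, "blk": 0.5,
    "fg3": 2.0, "tov": 2.0, "fg": 7.0, "fga": 15.0, "ft": 4.0, "fta": 5.0,
}

reduced = reduce_row(sample)
print(reduced["fg_perc"], reduced["ft_perc"])  # 0.467 0.8
```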

There are two more tabs in the reduced file namely "Efficiency Top 25" and "A Score Top 25". These two tabs are sorted by the Efficiency and A_Score columns respectively. 
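The three tabs amount to one alphabetical sort and two descending sorts cut off at 25 rows. A minimal sketch, assuming each row is a dictionary that already carries efficiency and a_score values (the rows below are placeholders, not real ratings):

```python
def build_tabs(players, top_n=25):
    """Return the three views of the reduced dataset, keyed by tab name."""
    full = sorted(players, key=lambda p: p["player"])
    by_eff = sorted(players, key=lambda p: p["efficiency"], reverse=True)[:top_n]
    by_a = sorted(players, key=lambda p: p["a_score"], reverse=True)[:top_n]
    return {
        "Full Ratings": full,
        f"Efficiency Top {top_n}": by_eff,
        f"A Score Top {top_n}": by_a,
    }


# Placeholder rows with invented values.
rows = [
    {"player": "Player C", "efficiency": 25.0, "a_score": 35.0},
    {"player": "Player A", "efficiency": 30.0, "a_score": 40.0},
    {"player": "Player B", "efficiency": 28.0, "a_score": 45.0},
]

tabs = build_tabs(rows, top_n=2)
for name, tab in tabs.items():
    print(name, [p["player"] for p in tab])
```

Each list could then be written to its own worksheet, for example with pandas' `DataFrame.to_excel` and its `sheet_name` parameter.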

Efficiency Top 25

A Score Top 25

The "Full Ratings" tab gives a team owner a view of everyone's statistics. The goal of the "Efficiency" calculation is to see which players rate highest under the standard efficiency formula. This sheet may be more useful in "roto" leagues. The goal of the "A Score" is to give a team owner a weighted score based on their preferred team build. In this example, I weighted steals, blocks, and 3-pointers made the heaviest. I also penalized turnovers the most.

Analysis

Teams that find the best undervalued draft picks, trades, and free-agent pickups have a greater chance of winning in fantasy basketball. As you can see on both lists, the top 6 players are rated pretty much the same. After the top 6, the lists start to diverge. For my preference, I would rank Damian Lillard over Nikola Jokic and Hassan Whiteside. Nikola Vucevic is another good example: I rank him higher than his efficiency ranking. The further down the two lists I go, the more the rankings differ. This helps me value players according to my own preference. I can use the list for the draft, for picking up free agents during the season, and for offering trades to other teams. By looking at other teams' players, I can adjust the A Score weights for those teams and see which players they might value more. By doing so, I can offer trades that benefit both teams.
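One way to surface the players the two lists disagree on is to compare each player's position in both rankings. The sketch below is an illustration with placeholder names and invented values, not output from the actual cheat sheet. A positive gap means the player ranks higher by A Score than by Efficiency.

```python
def rank_by(players, key):
    """Map each player name to a 1-based rank, highest value first."""
    ordered = sorted(players, key=lambda p: p[key], reverse=True)
    return {p["player"]: rank for rank, p in enumerate(ordered, start=1)}


def rank_gaps(players):
    """List (name, efficiency rank, A Score rank, gap), largest gap first."""
    eff = rank_by(players, "efficiency")
    a = rank_by(players, "a_score")
    gaps = [(name, eff[name], a[name], eff[name] - a[name]) for name in eff]
    return sorted(gaps, key=lambda g: g[3], reverse=True)


# Placeholder rows with invented values.
ratings = [
    {"player": "Player A", "efficiency": 30.0, "a_score": 40.0},
    {"player": "Player B", "efficiency": 28.0, "a_score": 45.0},
    {"player": "Player C", "efficiency": 25.0, "a_score": 35.0},
]

gaps = rank_gaps(ratings)
print(gaps[0])  # ('Player B', 2, 1, 1)
```

Re-running the same comparison with a different set of A Score weights would give a rough picture of how another owner might value the same players.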


Future Improvements

  1. Get the list of players of each team owner. I can mark which players are already taken so the list can be run for available pickups through the course of the season.
  2. Automate the process and schedule it to run daily. 
  3. Create different datasets based on the past week, past two weeks, and one month to see patterns of players who are heating up and should be considered.
  4. Create an algorithm to suggest possible trades and pickups to benefit my team.
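As a starting point for the first item, filtering out rostered players could look like the sketch below. It is hypothetical: the roster structure, owner names, and player names are placeholders, and the league's roster data would still need to be collected somehow.

```python
def available_players(ratings, rosters):
    """Return the rated players who are not on any owner's roster."""
    taken = {name for roster in rosters.values() for name in roster}
    return [p for p in ratings if p["player"] not in taken]


# Placeholder ratings and rosters with invented values.
ratings = [
    {"player": "Player A", "a_score": 40.0},
    {"player": "Player B", "a_score": 45.0},
    {"player": "Player C", "a_score": 35.0},
]
rosters = {"Owner 1": ["Player A"], "Owner 2": ["Player C"]}

free_agents = available_players(ratings, rosters)
print([p["player"] for p in free_agents])  # ['Player B']
```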

 
