NBA 2020 Data Statistics - Custom Fantasy Values
Motivation
I have been playing NBA fantasy basketball for almost two decades. Each year my friends, friends of friends, and I join one or more leagues and act as the general managers of our own teams. Each league starts with an online draft in which we pick the players for our teams. We play a format called Head to Head, in which your team's accumulated statistics are compared with your opponent's in 9 different categories. You win the weekly matchup if you win at least 5 of the 9 categories.
Because of this format, team owners build their teams to be stronger in some categories than in others, depending on each owner's strategy. During a fantasy season, we constantly check the average statistics of players around the league, and we keep looking for trades with opposing teams or for players who are available as free agents.
Over the years, I created my own cheat sheet with my own calculations to see where my team stands compared to everyone else. I used to download my players' statistics every week to update it. Now, through web scraping, I can automate that process and load the values into my cheat sheet automatically.
Dataset
The dataset I used was scraped from the Basketball-Reference website. The site lists the statistics of all 514 NBA players who are active in the 2020 season. The data covers more categories than the ones I use in my fantasy leagues.
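As a rough illustration of the scraping step, here is a minimal sketch that turns an HTML stats table into a list of per-player records using only Python's standard library. It is not necessarily how my scraper works: the class and function names are made up for this example, the sample HTML is hand-written with invented values, and the page is assumed to have been downloaded already (for example with `urllib.request`).

```python
from html.parser import HTMLParser


class StatsTableParser(HTMLParser):
    """Collect the cells of every <tr> in an HTML document as lists of strings."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # row currently being read, or None
        self._cell = None  # text pieces of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def parse_stats_table(html):
    """Return one dict per data row, keyed by the header row's column names."""
    parser = StatsTableParser()
    parser.feed(html)
    header, *body = parser.rows
    # Drop any copies of the header row that are repeated inside the table body.
    return [dict(zip(header, row)) for row in body if row != header]


# Hand-written sample standing in for a downloaded page (values are made up).
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>PTS</th><th>AST</th></tr>
  <tr><td>Player A</td><td>20.0</td><td>5.0</td></tr>
  <tr><td>Player B</td><td>10.5</td><td>2.5</td></tr>
</table>
"""

players = parse_stats_table(SAMPLE_HTML)
print(players[0])  # {'Player': 'Player A', 'PTS': '20.0', 'AST': '5.0'}
```

The values come back as strings, so they would still need to be converted to numbers before any calculation.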
Data Calculations
Two extra fields were added to the dataset: an A-Score and Efficiency. Efficiency is normally calculated as (PTS + REB + AST + STL + BLK - Missed FG - Missed FT - TO) / GP, but since the data is already in the form of per-game averages, the division by games played is dropped and it becomes (PTS + REB + AST + STL + BLK - Missed FG - Missed FT - TO).
A-Score is calculated by putting a weight on each statistical category. The most important stats to me are steals, blocks, and 3-pointers made, and I also penalize turnovers, so all 4 of these categories have a weight of 2 (turnovers count against the score). Assists and rebounds are harder to get than points, so I put a weight of 1.5 on both. The rest have a weight of 1. This weighting system can be changed based on the team owner's preferences.
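The per-game efficiency formula can be sketched in a few lines of Python. The function and parameter names are my own for this example, and I am assuming that missed field goals and missed free throws are computed as attempts minus makes. The stat line is made up.

```python
def efficiency(pts, reb, ast, stl, blk, fgm, fga, ftm, fta, tov):
    """Per-game efficiency computed from per-game averages."""
    missed_fg = fga - fgm  # assumption: misses = attempts - makes
    missed_ft = fta - ftm
    return pts + reb + ast + stl + blk - missed_fg - missed_ft - tov


# Made-up stat line, not a real player.
value = efficiency(pts=25.0, reb=8.0, ast=6.0, stl=1.5, blk=0.5,
                   fgm=9.0, fga=18.0, ftm=5.0, fta=6.0, tov=3.0)
print(value)  # 28.0
```

Here the positive stats add up to 41, and the 9 missed field goals, 1 missed free throw, and 3 turnovers bring the result down to 28.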
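Here is a minimal sketch of the A-Score weighting in Python. The dictionary keys and function name are hypothetical, and the sketch covers only the counting stats: how field goal and free throw percentage would be folded into the score is not spelled out here, so they are left out. The stat line is made up.

```python
# Weights described above; the turnover weight is negative because it is a penalty.
A_SCORE_WEIGHTS = {
    "stl": 2.0,
    "blk": 2.0,
    "fg3m": 2.0,  # 3-pointers made
    "tov": -2.0,
    "ast": 1.5,
    "reb": 1.5,
    "pts": 1.0,
}


def a_score(stats, weights=A_SCORE_WEIGHTS):
    """Weighted sum of a player's per-game stats."""
    return sum(weight * stats[cat] for cat, weight in weights.items())


# Made-up stat line, not a real player.
sample = {"pts": 20.0, "reb": 6.0, "ast": 4.0, "stl": 1.0,
          "blk": 1.0, "fg3m": 2.0, "tov": 2.0}

print(a_score(sample))  # 39.0

# A different owner could ignore assists entirely by overriding one weight.
no_assists = dict(A_SCORE_WEIGHTS, ast=0.0)
print(a_score(sample, no_assists))  # 33.0
```

Keeping the weights in one dictionary makes it easy to swap in another owner's preferences without touching the calculation.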
Scraping Data Result and Cheatsheet
Here's a screenshot of the scraped data:
Categories
I reduced the main dataset to only the statistical categories used in our fantasy league. Four additional columns are derived from the reduced categories: fg_perc (field goal percentage), ft_perc (free throw percentage), efficiency, and a_score. The reduced dataset is exported to an Excel file with 3 tabs. The first tab, "Full Ratings", has the full list in alphabetical order.
There are two more tabs in the reduced file namely "Efficiency Top 25" and "A Score Top 25". These two tabs are sorted by the Efficiency and A_Score columns respectively.
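The three views can be sketched as plain sorts over the player records. This is only an illustration: the function name and record keys are hypothetical, the "Full Ratings" view is assumed to be sorted by player name, and the actual writing to Excel is left out. The sample records are made up and use `top_n=2` to keep the output short.

```python
def build_sheets(players, top_n=25):
    """Return a mapping of sheet name to the rows that belong on that sheet."""
    return {
        "Full Ratings": sorted(players, key=lambda p: p["player"]),
        f"Efficiency Top {top_n}": sorted(
            players, key=lambda p: p["efficiency"], reverse=True
        )[:top_n],
        f"A Score Top {top_n}": sorted(
            players, key=lambda p: p["a_score"], reverse=True
        )[:top_n],
    }


# Made-up records, not real players.
records = [
    {"player": "Player C", "efficiency": 30.0, "a_score": 20.0},
    {"player": "Player A", "efficiency": 10.0, "a_score": 40.0},
    {"player": "Player B", "efficiency": 20.0, "a_score": 30.0},
]

sheets = build_sheets(records, top_n=2)
for name, rows in sheets.items():
    print(name, [row["player"] for row in rows])
```

Each entry in the returned mapping corresponds to one tab, so a spreadsheet library could write them out one sheet at a time.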
Efficiency Top 25
A Score Top 25

The "Full Ratings" view gives a team owner a view of everyone's statistics. The goal of the "Efficiency" calculation is to see which players rate highest under the standard efficiency formula. This sheet may be more useful in "roto" leagues. The goal of the "A Score" is to give a team owner a weighted score based on his or her preferred team build. In this example, I weighted steals, blocks, and 3-pointers made the heaviest. I also penalized turnovers the most.
Analysis
Teams that find the best undervalued draft picks, trades, and free agent pickups have a greater chance of winning in fantasy basketball. As you can see on both lists, the top 6 players are rated pretty much the same. After the top 6, the lists start to diverge. Based on my preferences, I would rank Damian Lillard over Nikola Jokic and Hassan Whiteside.
Another good example is Nikola Vucevic, whom I rank higher than his efficiency ranking. Further down the two lists, the rankings differ more and more. This helps me value players according to my own preferences. I can use the list for the draft, for picking up free agents during the season, and for offering trades to other teams. By looking at other teams' players, I can adjust the A Score weights to match their builds and see which players they might value more. By doing so, I can offer trades that benefit both teams.
Future Improvements
- Get the list of players of each team owner. I can mark which players are already taken so the list can be run for available pickups through the course of the season.
- Automate the process and schedule it to run daily.
- Create different datasets based on the past week, past two weeks, and past month to spot players who are heating up and should be considered.
- Create an algorithm to suggest possible trades and pickups to benefit my team.