NBA 2020 Data Statistics - Custom Fantasy Values

Posted on Jul 12, 2020
The skills I demonstrate here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.

Motivation

I have been playing NBA fantasy basketball for almost two decades. Each year my friends, their friends, and I join one or more leagues and act as general managers of our own teams. At the start of each fantasy season we hold an online draft to pick the players for our teams. We play a format called Head to Head, in which your team's accumulated statistics are compared with your opponent's in 9 different categories. You win the weekly matchup if you win at least 5 of the 9 categories.

Based on this format, team owners strategically build their teams to be stronger in some categories; which ones depends on the owner's strategy. During a fantasy season, we constantly look at the average statistics of players around the league, look for trades with opposing teams, and look for players who are available as free agents.

Over the years, I've built my own cheat sheet with my own calculations to see where my team stands compared to everyone else. I used to download my players' statistics weekly to update it. Now, through web scraping, I can automate that process and load the values into my cheat sheet automatically.

Dataset

The dataset was scraped from the basketball-reference website. The site lists the statistics of all 514 NBA players active in the 2020 season. The data includes more categories than the ones I use in my fantasy leagues.
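The post does not show the scraping code, so the following is only a minimal sketch of the parsing step, assuming the statistics arrive as an HTML table. It uses just the Python standard library and an inline sample table; the class name, function name, column names, and sample rows are hypothetical, and the real page markup and the fetching step are not shown here.

```python
from html.parser import HTMLParser


class StatsTableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows
        self._row = None    # cells of the row being read
        self._cell = None   # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def parse_stats_table(html):
    """Return one dict per data row, keyed by the header row's cell text."""
    parser = StatsTableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample markup, not the real page.
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>PTS</th><th>AST</th></tr>
  <tr><td>Player A</td><td>20.1</td><td>4.0</td></tr>
  <tr><td>Player B</td><td>11.5</td><td>7.2</td></tr>
</table>
"""

records = parse_stats_table(SAMPLE_HTML)
print(records[0]["Player"], records[0]["PTS"])
```

The values come back as strings, so they would still need to be converted to numbers before any calculation.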

 

Data Calculations

Two extra fields were added to the dataset: an A-Score and Efficiency. Efficiency is normally calculated as (PTS + REB + AST + STL + BLK − Missed FG − Missed FT − TO) / GP, but since the scraped data is already in per-game averages, the division by games played is dropped, leaving (PTS + REB + AST + STL + BLK − Missed FG − Missed FT − TO).
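As a rough illustration of the per-game formula, here is a minimal sketch. The dictionary keys and the sample numbers are hypothetical, and missed shots are assumed to be attempts minus makes.

```python
def efficiency(stats):
    """Per-game efficiency from per-game averages.

    Keys are hypothetical; missed shots are taken as attempts minus makes.
    """
    missed_fg = stats["fga"] - stats["fg"]
    missed_ft = stats["fta"] - stats["ft"]
    return (
        stats["pts"] + stats["trb"] + stats["ast"] + stats["stl"] + stats["blk"]
        - missed_fg - missed_ft - stats["tov"]
    )


# Made-up per-game averages for one player.
sample = {
    "pts": 20.0, "trb": 5.0, "ast": 4.0, "stl": 1.0, "blk": 1.0,
    "fg": 8.0, "fga": 16.0, "ft": 3.0, "fta": 4.0, "tov": 2.0,
}
print(efficiency(sample))  # 31 - 8 missed FG - 1 missed FT - 2 TO = 20.0
```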

A-Score is calculated by putting a weight on each statistical category. The most important stats to me are steals, blocks, and 3-pointers made, and I also penalize turnovers in my calculation, so all 4 of these categories have a weight of 2. Assists and rebounds are harder to get than points, so I give both a weight of 1.5. The rest have a weight of 1. This weighting system can be changed based on the team owner's preferences.
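The weighting could be sketched as a simple weighted sum. This is an illustration under stated assumptions, not the exact calculation used for the cheat sheet: turnovers are assumed to enter with a negative weight, only points are included among "the rest" (the post does not say how the percentage categories are scaled, so they are left out), and the key names and sample numbers are hypothetical.

```python
# Weights taken from the description above; the sign on turnovers is an assumption.
A_SCORE_WEIGHTS = {
    "stl": 2.0, "blk": 2.0, "fg3": 2.0,  # steals, blocks, 3-pointers made
    "tov": -2.0,                         # turnovers are penalized
    "ast": 1.5, "trb": 1.5,              # assists, rebounds
    "pts": 1.0,                          # points
}


def a_score(stats, weights=A_SCORE_WEIGHTS):
    """Weighted sum of per-game averages; keys are hypothetical."""
    return sum(weight * stats[key] for key, weight in weights.items())


# Made-up per-game averages for one player.
sample = {"pts": 20.0, "trb": 6.0, "ast": 4.0, "stl": 1.0,
          "blk": 1.0, "fg3": 2.0, "tov": 3.0}
print(a_score(sample))

# An owner who values assists more can pass different weights.
assist_heavy = dict(A_SCORE_WEIGHTS, ast=3.0)
print(a_score(sample, assist_heavy))
```

Keeping the weights in one dictionary makes it easy to rerun the ranking for a different team build.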

Scraping Data Result and Cheatsheet

Here's a screenshot of the scraped data: 

Categories

I reduced the main dataset to only the statistical categories used in our fantasy league. Four additional columns derived from those categories were added: fg_perc (field goal percentage), ft_perc (free throw percentage), efficiency, and a_score. The reduced dataset is exported to Excel in 3 tabs. The first, "Full Ratings", has the full list in alphabetical order.

There are two more tabs in the reduced file namely "Efficiency Top 25" and "A Score Top 25". These two tabs are sorted by the Efficiency and A_Score columns respectively. 
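A minimal sketch of how the two Top 25 lists could be produced from the full ratings. The function name, key names, and sample players are hypothetical, and the Excel export itself is not shown.

```python
def top_n(players, key, n=25):
    """Return the n players with the highest value for `key`, best first."""
    return sorted(players, key=lambda player: player[key], reverse=True)[:n]


# Made-up ratings for three players.
players = [
    {"player": "Player A", "efficiency": 20.0, "a_score": 37.0},
    {"player": "Player B", "efficiency": 25.5, "a_score": 30.0},
    {"player": "Player C", "efficiency": 18.0, "a_score": 41.5},
]

by_efficiency = top_n(players, "efficiency", n=2)
by_a_score = top_n(players, "a_score", n=2)
print([p["player"] for p in by_efficiency])
print([p["player"] for p in by_a_score])
```

Sorting the same list by two different columns is what makes the rankings diverge further down, since each column rewards a different mix of categories.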

Efficiency Top 25

A Score Top 25

The Full Ratings view gives a team owner a view of everyone's statistics. The goal of the "Efficiency" calculation is to see which players rate highest under the standard efficiency calculation. This sheet may be more useful in "roto" leagues. The goal of the "A Score" is to give a team owner a weighted score based on their preferred team build. In this example, I weighted steals, blocks, and 3-pointers made the heaviest. I also penalized turnovers the most.

Analysis

Teams that find the best undervalued draft picks, trades, and free agent pickups have a greater chance of winning in fantasy basketball. On both lists, the top 6 players are rated pretty much the same. After the top 6, the lists start to differ. Under my preferences, I would rank Damian Lillard over Nikola Jokic and Hassan Whiteside.

Another good example is Nikola Vucevic, whom I rank higher than his efficiency ranking. Further down the two lists, the rankings differ more and more. This helps me value players according to my preferences. I could use the list for the draft, for picking up free agents during the season, and for offering trades to other teams. By looking at other teams' players, I can adjust the A Score for their teams and see whom they might value more. By doing so, I can offer trades that would benefit both teams.


Future Improvements

  1. Get the list of players of each team owner. I can mark which players are already taken so the list can be run for available pickups through the course of the season.
  2. Automate the process and schedule it to run daily. 
  3. Create different datasets based on the past week, past two weeks, and one month to see patterns of players who are heating up and should be considered.
  4. Create an algorithm to suggest possible trades and pickups to benefit my team.

 
