NBA Data Exploratory – Is 3-Point Shooting Just a Hype?

Posted on Mar 13, 2017

Introduction

Over the past three years or so, because of one team, the Golden State Warriors, and one man, Stephen Curry, 3-point shooting has become a buzzword. Headlines everywhere cover how great 3-point shooting is, the media hype the all-star Stephen Curry, sports analysts talk about how teams should build around threes, and the league is shooting more threes than at any time in NBA history.

Google_trend_three_point_shot

As data scientists, we would like to discover the truth behind rumors and hype. This leads to some interesting questions:

 

  • Do winning teams take more 3-point shots?
  • Do winning teams take fewer short-range shots?
  • Have all teams shifted to a long-range shooting strategy?
  • Are there teams getting left behind because they didn’t adapt?
  • Is 3-point shooting as effective as the hype says?
  • How effective are 3-point shots compared with 2-point shots?

 

In this exploratory data project, I will answer these questions using facts and insights from the data, and I will support my points with statistics and mathematics.

 

Data

Important notes about the data:

  • Data was obtained from two sources:
    1. The official NBA website, using the advanced stats function (http://stats.nba.com/teams/advanced/)
    2. NBA Miner website, for the shot distance usage data (http://www.nbaminer.com/shot-distances/)
  • The two datasets were joined by matching on “Season” and “Team Name”
  • Seasons covered: 2004-2005 through 2015-2016
  • Team data are season averages
  • Only regular-season data was used; no playoff data was included
  • The data table looks like the following:

| Season | Team | GP | W | Less than 8 ft. usage % | 8-16 ft. usage % | 16-24 ft. usage % | 24+ ft. usage % | Avg. shot dist. (ft.) | Offensive rating |
|---|---|---|---|---|---|---|---|---|---|
| 2015-2016 | Atlanta Hawks | 82 | 48 | 41.75 | 10.71 | 13.84 | 33.45 | 12.76 | 103 |
| 2015-2016 | Boston Celtics | 82 | 48 | 42.73 | 12.51 | 15.55 | 29 | 12.43 | 103.9 |
| 2015-2016 | Brooklyn Nets | 82 | 21 | 43.68 | 16.32 | 18.1 | 21.68 | 11.66 | 100.9 |
| 2015-2016 | Charlotte Hornets | 82 | 48 | 36.4 | 13.29 | 15.52 | 34.64 | 13.73 | 105.1 |
| 2015-2016 | Chicago Bulls | 82 | 42 | 41.5 | 15.43 | 18.51 | 24.11 | 12.32 | 102.1 |
| 2015-2016 | Cleveland Cavaliers | 82 | 57 | 39.39 | 12.83 | 12.52 | 35.09 | 13.33 | 108.1 |
| 2015-2016 | Dallas Mavericks | 82 | 42 | 35.79 | 14.47 | 15.87 | 33.6 | 13.71 | 104.8 |
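
The join on “Season” and “Team Name” described in the notes above can be sketched as follows. This is a minimal illustration in plain Python, not the code used for the project: the two record lists stand in for the NBA.com and NBA Miner exports, and the field names (`season`, `team`, `usage_24_plus`, and so on) are assumed for the example. The values are copied from the sample rows.

```python
# Hypothetical stand-ins for the two sources; the field names are assumptions.
team_stats = [
    {"season": "2015-2016", "team": "Atlanta Hawks", "gp": 82, "w": 48, "off_rating": 103.0},
    {"season": "2015-2016", "team": "Brooklyn Nets", "gp": 82, "w": 21, "off_rating": 100.9},
]
shot_distances = [
    {"season": "2015-2016", "team": "Atlanta Hawks", "usage_24_plus": 33.45, "avg_shot_dist": 12.76},
    {"season": "2015-2016", "team": "Brooklyn Nets", "usage_24_plus": 21.68, "avg_shot_dist": 11.66},
]


def join_on_season_and_team(left, right):
    """Inner-join two lists of records on the (season, team) pair."""
    right_by_key = {(r["season"], r["team"]): r for r in right}
    joined = []
    for row in left:
        match = right_by_key.get((row["season"], row["team"]))
        if match is not None:
            # Combine the columns of both records into one row.
            joined.append({**row, **match})
    return joined


merged = join_on_season_and_team(team_stats, shot_distances)
for row in merged:
    print(row["season"], row["team"], row["w"], row["usage_24_plus"], row["off_rating"])
```

Keying on both fields matters because the same team name appears once per season across the twelve seasons covered.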

Data Exploration

The teams are divided into:

  • Teams that made the playoffs
  • Teams that did not make the playoffs

The shot usage % for each distance range is plotted for each season. Note that the shot usage % across all distances should sum to 100%:

equation_1
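
As a quick sanity check of that constraint, the four usage columns can be summed per team. The sketch below uses the 2015-2016 sample rows from the Data section; the totals come out slightly under 100 (99.75 for Atlanta, for example), so the check allows a tolerance. The tolerance of one percentage point is an assumption chosen for this example.

```python
# Usage % for (< 8 ft, 8-16 ft, 16-24 ft, 24+ ft), copied from the sample rows.
usage_by_team = {
    "Atlanta Hawks": (41.75, 10.71, 13.84, 33.45),
    "Boston Celtics": (42.73, 12.51, 15.55, 29.0),
    "Brooklyn Nets": (43.68, 16.32, 18.1, 21.68),
    "Charlotte Hornets": (36.4, 13.29, 15.52, 34.64),
    "Chicago Bulls": (41.5, 15.43, 18.51, 24.11),
    "Cleveland Cavaliers": (39.39, 12.83, 12.52, 35.09),
    "Dallas Mavericks": (35.79, 14.47, 15.87, 33.6),
}

TOLERANCE = 1.0  # assumed tolerance, in percentage points

# Total usage per team, rounded to two decimals to hide float noise.
totals = {team: round(sum(usages), 2) for team, usages in usage_by_team.items()}

# Teams whose four buckets do not add up to roughly 100%.
off_by_too_much = [
    team for team, total in totals.items() if abs(total - 100.0) > TOLERANCE
]

for team, total in totals.items():
    print(f"{team}: {total:.2f}")
print("rows outside tolerance:", off_by_too_much)
```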

seasonal shot usage < 8ft

seasonal shot usage > 24ft

seasonal shot usage 8-16ft

seasonal shot usage 16-24ft

Important features of these 4 graphs are:

  • For the “< 8ft.” and “16-24ft.” distances, playoff teams tend to shoot less than non-playoff teams, while for “> 24ft.”, playoff teams shoot more.
  • The whole league has been taking more 3-point shots since 2012, as can be seen from the upward trend in the “> 24ft.” plot. Across the league, teams are sacrificing shots from “16-24 ft.” for 3-point shots.
  • There is barely any difference for the “8-16 ft.” distance.

An additional plot below confirms that teams that do well take longer-range shots:

average shot distance

Going back to our questions:

  • Do winning teams take more 3-point shots?
  • Do winning teams take fewer short-range shots?
  • Have all teams shifted to a long-range shooting strategy?
  • Are there teams getting left behind because they didn’t adapt?

We can answer loud and clear: teams that made the playoffs take more 3-point shots and fewer short-range shots. Surprisingly, however, the whole league is adopting the 3-point shooting strategy, not just the Golden State Warriors. Now we are down to the last questions to validate:

  • Is 3-point shooting as effective as the hype says?
  • How effective are 3-point shots compared with 2-point shots?

Before we can validate, we need to look at one advanced metric, offensive rating:

equation_2

Offensive rating was developed by the statistician Dean Oliver. The full equation is rather complex, so we will look at a simpler, more understandable version:

equation_3
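
A minimal sketch of the simplified metric, assuming it is the commonly used form of offensive rating: points scored per 100 possessions. The function name and the numbers in the usage line are hypothetical and serve only as illustration.

```python
def offensive_rating(points, possessions):
    """Simplified offensive rating: points scored per 100 possessions."""
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return 100.0 * points / possessions


# Hypothetical single-game numbers, for illustration only.
print(round(offensive_rating(points=105, possessions=98), 1))
```

Normalizing by possessions rather than by games removes the effect of pace, so fast and slow teams can be compared on the same scale.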

With offensive rating, we can measure how effective 3-point shooting is compared with other shot types. Offensive rating is therefore plotted against the usage of each shot distance:

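Offensive rating vs. < 8ft. shot usage

Offensive rating vs. 8-16ft. shot usage

Offensive rating vs. 16-24ft. shot usage

Offensive rating vs. > 24ft. shot usage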

As we can observe from these 4 plots, every shot type except “> 24ft.” has a negative correlation with offensive rating, i.e. the slope of the fitted line is negative. Only “> 24ft.” has a positive slope, so the more 3-point shots a team takes, the higher its offensive rating tends to be.
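
The slope of such a fitted line can be computed with ordinary least squares. The sketch below is a plain-Python illustration using only the seven 2015-2016 sample rows from the Data section, so its result will differ from the slopes in the plots, which are based on all seasons and teams. The variable and function names are chosen for this example.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# 24+ ft usage % and offensive rating for the seven sample rows.
usage_24_plus = [33.45, 29.0, 21.68, 34.64, 24.11, 35.09, 33.6]
off_rating = [103.0, 103.9, 100.9, 105.1, 102.1, 108.1, 104.8]

slope = least_squares_slope(usage_24_plus, off_rating)
print(f"slope of offensive rating vs. 24+ ft usage: {slope:.2f}")
```

A positive slope means offensive rating rises with usage of that distance; a negative slope means it falls.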

However, the slope for “Offensive Rating vs. > 24ft.” is quite small, 0.23, meaning that the “> 24ft.” variable alone may not be a good indicator of how well a team will do offensively. At this point, we need a new indicator of how well a team will do offensively in terms of shot distance. A useful metric would be the ratio of shot types. Coaches often tune a team by mixing shooters and drivers (guards who can penetrate the defense and get the ball close to the basket), so the ratio of 3-point shot usage to the usage of each other shooting distance may be a good indicator of offensive rating. We came up with 3 new variables:

1. (3-point shot) / (16-24ft. shot)

2. (3-point shot) / (8-16ft. shot)

3. (3-point shot) / (< 8ft. shot)
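
As a sketch, the three ratios can be derived directly from the usage columns. Here the “24+ feet” usage is treated as the 3-point shot usage, as in the rest of this post; the function name and dictionary keys are assumptions made for the example, and the input values are the Atlanta Hawks row from the sample table.

```python
def shot_ratios(usage_lt_8, usage_8_16, usage_16_24, usage_24_plus):
    """Ratios of 3-point (24+ ft) usage to each of the other distance buckets."""
    return {
        "3pt_to_16_24": usage_24_plus / usage_16_24,
        "3pt_to_8_16": usage_24_plus / usage_8_16,
        "3pt_to_lt_8": usage_24_plus / usage_lt_8,
    }


# Atlanta Hawks, 2015-2016, from the sample table.
hawks = shot_ratios(usage_lt_8=41.75, usage_8_16=10.71, usage_16_24=13.84, usage_24_plus=33.45)
for name, value in hawks.items():
    print(f"{name}: {value:.2f}")
```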

Offensive rating vs. these 3 new variables are plotted below:

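Offensive rating vs. (3-point shot) / (16-24ft. shot)

Offensive rating vs. (3-point shot) / (8-16ft. shot)

Offensive rating vs. (3-point shot) / (< 8ft. shot)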

The “Offensive Rating vs. (3-point shot / <8ft. shot)” plot has the largest slope, 8.8, meaning that an increase of 1 in the (3-point shot / <8ft. shot) ratio corresponds to a gain of about 8.8 points in offensive rating. If we further fit a linear model using the shot distance variables to predict offensive rating, we get the following summary:

p_values_table

Looking at the last column (p-values) of the row marked with a double asterisk, the value is 0.00718. This tells us:

Because 0.00718 is much smaller than 0.05 (5%), the coefficient is statistically significant at the 5% level: if the (3-point shot / <8ft. shot) ratio had no linear relationship with offensive rating, an estimate this extreme would be very unlikely. The ratio is therefore a useful indicator of how well a team will do offensively.
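
To make the idea behind a p-value concrete, here is a sketch that uses a different technique from the linear model summary: an exact permutation test on the slope, written in plain Python. It uses only the seven 2015-2016 sample rows from the Data section, so the resulting p-value is illustrative and is not expected to match 0.00718. All names in the code are chosen for this example.

```python
from itertools import permutations

# (< 8 ft usage %, 24+ ft usage %, offensive rating) from the sample rows.
rows = [
    (41.75, 33.45, 103.0),
    (42.73, 29.0, 103.9),
    (43.68, 21.68, 100.9),
    (36.4, 34.64, 105.1),
    (41.5, 24.11, 102.1),
    (39.39, 35.09, 108.1),
    (35.79, 33.6, 104.8),
]
ratios = [three / short for short, three, _ in rows]
ratings = [rating for _, _, rating in rows]


def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


observed = slope(ratios, ratings)

# Null hypothesis: the ratio is unrelated to offensive rating, so any
# assignment of ratings to teams is equally likely. Enumerate all 7! orders
# and count how often the slope is at least as extreme as the observed one.
extreme = 0
total = 0
for shuffled in permutations(ratings):
    total += 1
    if abs(slope(ratios, shuffled)) >= abs(observed) - 1e-12:
        extreme += 1

p_value = extreme / total
print(f"observed slope: {observed:.2f}, permutation p-value: {p_value:.4f}")
```

A small p-value here says the same thing as in the model summary: a slope this large would rarely arise if the ratio carried no information about offensive rating.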

Recall from our seasonal shot distance plots that teams are sacrificing “16-24ft.” shots for “> 24ft.” shots:

seasonal shot usage 16-24ft

seasonal shot usage < 8ft

Our analysis suggests that trading < 8ft. shots for 3-point shots might be a better choice for the offense: an increase of 1 in this ratio corresponds to a gain of 8.8 points in offensive rating, while the other trade-offs gain only 2 to 3 points. Teams may want to rethink which shot types to sacrifice for 3-point shots. Now to answer our remaining questions:

  • Is 3-point shooting as effective as the hype says?
  • How effective are 3-point shots compared with 2-point shots?

Our answer is that the 3-point shot is the most effective shot, and the league had underestimated its power. It is also worth revisiting the shot combination strategy to see which shot types a team should sacrifice for 3-point shots.

Conclusion

The 3-point shooting hype is real: all teams are shooting more threes than ever before. Playoff teams do take more long-range shots, and the power of the 3-point shot was clearly underestimated by the league in the past. We also found that shot combination is an important variable for predicting how well a team will do offensively, and teams may want to revisit their shot combination strategy. In this project, we have barely scratched the surface of exploring NBA statistics. More variables, such as rebounds, turnovers, defensive ratings, and game log data, could be examined to predict how well a team will do during the season.

The Shiny dashboard can be accessed here.

About Author

Werner Chao

Werner has been the lead data analyst for KaJin Health (www.kajinonline.com), an online mental health company in Shanghai, and data analyst at SNC-Lavalin, a 7.8 billion dollar public company. He helped KaJin Health analyze web traffic, consumer insights,...