Data Analysis on NBA & The 3-Point Shot
(Project: https://archiemxx.shinyapps.io/shinyproject)
Introduction
The game of basketball is a fascinating one, a game I have loved deeply and whose data I have studied religiously since childhood. Its beauty lies in its fast pace and the balance between teamwork and individual heroism. Basketball is a five-man game, but undeniable beauty is also found:
- when The Black Mamba pump fakes twice, shoots, and ices the game in the last second, sending the opponents home disappointed;
- when Steph Curry jacks up a three from 28 feet away with 20 seconds on the clock; the entire Oracle Arena inhales deeply as the ball travels along its majestic parabolic arc for what seems an eternity, and then absolutely erupts (of course Steph would make that...);
- when King James dominates the game and takes every last shred of hope out of the opposing team, be it the Celtics, Raptors, Wizards, Nets, or so many more...
I use this Shiny project as an opportunity to analyze the impact of the three-point line on the strategies and the players in the NBA. The three-point shot, as many fans know, was not always part of the game. When Wilt and Russell played, each field goal counted as two points and each free throw as one. In the 1979-1980 season, the NBA adopted the three-point line, beyond which a made shot counts as three points. This change has undeniably transformed the game from head to toe.
Data & Methods
In this analysis, team and individual data were collected from the first NBA season (1949-1950) to the last completed NBA season (2017-2018). The data arrived as separate data sets and were later combined into two aggregate data sets: team-wise and player-wise. The data tracks all "first-level" basketball data. "First-level" data denotes data that can be objectively recorded and has a direct impact on the game, such as points, type of points, assists, rebounds, steals, etc.
In more recent years, more advanced statistics have been collected on player impact. For example, how many feet away was a defender when his opponent shot? Was the field goal a mid-range jumper or a dunk? This level of data does not change the result on the scoreboard, but it gives insight into how the result came about. It is not publicly available, as it is mostly collected by teams for internal use, such as scouting and coaching. As a result, the analysis only uses data that is free and publicly available.
The data is imported into R. The project uses several packages, including dplyr, ggplot2, scales, tidyr, plotly, readxl, and of course, shiny.
The project is hosted on https://archiemxx.shinyapps.io/shinyproject/
Allow me to thank you for taking time to view my project.
Data Analysis
Contribution by Different Sources of Points
In the first graph below, the percentages of contribution by different sources of points (three-pointers, two-pointers, and free throws) are graphed by season.
Prior to 1980, three-pointers represented 0% of the scoring in the league because the shot had not yet been introduced. After 1980, the three-pointer took on an increasingly important role, with a road bump in the mid-90s. In 2008, three-pointers surpassed free throws as the second most important scoring method in the league, and their prevalence does not look like it will decline any time soon. In fact, if the trend persists, by 2030 the three-pointer would become the largest single source of points in a basketball game.
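The percentage breakdown behind this graph is simple arithmetic: each source's points divided by total points. Below is a minimal Python sketch of that calculation. It is not the project's actual R code; the function name `points_share` and the season totals are made up for illustration.

```python
def points_share(three_made, two_made, ft_made):
    """Return the fraction of total points from threes, twos, and free throws."""
    points = {
        "three_pointers": 3 * three_made,  # each made three is worth 3 points
        "two_pointers": 2 * two_made,      # each made two is worth 2 points
        "free_throws": 1 * ft_made,        # each made free throw is worth 1 point
    }
    total = sum(points.values())
    if total == 0:
        raise ValueError("no points scored")
    return {source: pts / total for source, pts in points.items()}


# Made-up totals for illustration only (not taken from the data set).
shares = points_share(three_made=10, two_made=30, ft_made=10)
for source, share in shares.items():
    print(f"{source}: {share:.0%}")
```

Running the same function once per season and stacking the results would give the kind of by-season breakdown described above.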
Observations
Both two-pointers and free throws have declined, though for different reasons:
- Two-pointers have been directly impacted by the long bomb. Teams redesigned their offenses to spend more possessions on shooting triples, so two-pointers contribute a smaller share.
- Free throws, on the other hand, have been impacted indirectly. As players shoot more threes, they drive into the paint (the area closest to the basket) much less often. The paint tends to be the most crowded space on the floor because shots there pose the greatest threat to the defense, so it is where most fouls are drawn. With fewer drives, players are fouled less and shoot fewer free throws.
On the project page, users can use a slider bar to select a year and view the points breakdown for that season. I highly encourage you to check it out, as it gives an animation-like effect that shows the rise in the three-pointer's popularity. The 1950 season and the most recent 2018 season are shown below:
Three-pointers count as 3 points, more than either a two-pointer or a free throw. Since three-pointers are more popular now, an obvious question follows: are NBA teams scoring more points per game today than they used to?
Average Points Per Team
The following graph adds a gold-colored line to the first graph. The gold line represents the average total points per team per game in the league, by season. It is clear that total points have not changed significantly. If anything, the average has declined over some years. The highest average points per team per game occurred in the 1970s, a time when the three-pointer did not exist.
When three-pointers were first introduced, teams actually scored less, which could be a result of trying to shoot more threes but not being good at it. However, as teams became better at shooting threes starting in the early 2000s, points per game started to trend upwards again.
The project then examines, for a given NBA season, the number of threes made vs. the number of threes attempted by each NBA team. The x-axis represents the number of threes attempted, while the y-axis represents the number of threes made in a season. A team, realistically, hopes to find itself in the top-right corner of the graph, which represents a high level of skill at shooting threes and full use of that skill.
The top-left corner represents high accuracy (that is, high skill) that the team has under-utilized. The bottom-left corner represents a team that is aware of its low three-point accuracy and chooses not to shoot threes. The bottom-right corner is the worst-case scenario: a team that is very bad at shooting threes but somehow decides to rely heavily on them.
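One way to make the four corners concrete is to label each team by whether it sits above or below the league median in attempts (usage) and in accuracy (skill). The Python sketch below does that. Note the assumptions: the project itself plots raw makes against raw attempts and is written in R, so the median cutoffs, the function name `classify_teams`, and the four teams are all hypothetical.

```python
from statistics import median


def classify_teams(teams):
    """Label each team with a corner of the skill-vs-usage chart.

    teams maps a team name to (threes_made, threes_attempted);
    every team is assumed to have attempted at least one three.
    """
    accuracy = {name: made / att for name, (made, att) in teams.items()}
    attempts_cutoff = median(att for _, att in teams.values())
    accuracy_cutoff = median(accuracy.values())

    labels = {}
    for name, (_, att) in teams.items():
        high_usage = att > attempts_cutoff
        high_skill = accuracy[name] > accuracy_cutoff
        vertical = "top" if high_skill else "bottom"
        horizontal = "right" if high_usage else "left"
        labels[name] = f"{vertical}-{horizontal}"
    return labels


# Hypothetical teams, made up for illustration only.
example_teams = {
    "Team A": (800, 2000),  # accurate and high volume
    "Team B": (390, 1000),  # accurate but low volume
    "Team C": (330, 1100),  # inaccurate and low volume
    "Team D": (570, 1900),  # inaccurate but high volume
}
corners = classify_teams(example_teams)
print(corners)
```

Splitting on medians guarantees that roughly half the league lands on each side of each cutoff; fixed thresholds would be an equally reasonable choice.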
Skills vs Usage
Data Findings
The first graph represents 1980, the first year NBA teams could shoot threes in a game. The second graph represents the last completed NBA season (2018), when Daryl Morey's Houston Rockets notoriously attempted a whopping 3470 three-pointers! The team that shot the fewest threes in the 2018 season was the Timberwolves, whose offense was designed by Tom Thibodeau, an anti-three coach, and revolved around Karl-Anthony Towns, a versatile big. Even they attempted 1845 threes, 3.4 times as many as the 543 attempted by the 1980 leader, the Clippers!
If you are wondering (as I surely did) which dot in 2018 represents the Golden State Warriors (you should go to my project :)), they are the top pink dot. Yes, 16 teams attempted more threes than the Warriors in the 2017-2018 season! This is UNBELIEVABLE. Without the project, my instinct would have told me that they ranked somewhere in the top 5 in the league in attempts. They shot a historic 40% for the season on only 2370 attempts.
Should they shoot more, then? Not necessarily. The Warriors generate their offense primarily through Curry/Durant, Curry/Draymond, and Durant/Draymond pick-and-rolls, which get them easy buckets (points) near the basket. If the defense decides to tighten around the rim, Curry, Durant, and Klay will then severely punish it by shooting threes with astonishing accuracy. This just shows that the Warriors play on a whole other level, with tremendous skill and discipline.
Different Positions
Another notable analysis in the project is the impact of the three-pointer on the positions of basketball. There are five NBA positions: Point Guard (PG), Shooting Guard (SG), Small Forward (SF), Power Forward (PF), and Center (C).
Typically, Point Guards and Shooting Guards shoot from outside and organize the offense. These players tend to be better passers and have better ball control. Kobe is a Shooting Guard and Steph is a Point Guard. Power Forwards and Centers, by contrast, fight for rebounds or shoot from inside. They are big, tall, and very strong. Shaq was a Center, and possibly the best one at that. The project allows users to select one of three metrics (number of threes attempted, number of threes made, and three-point percentage), then graphs it by position from 1980 to 2018.
Findings
Unsurprisingly, the number-attempted graph looks extremely similar to the number-made graph. In the NBA, a player simply won't be allowed to keep jacking up missed shots over the long term.
Another expected result is that PG, SG, and SF are the three positions that have traditionally shot more threes, while PF and C have lagged behind. This is reasonable. However, as we can tell from the graph, PFs are catching up quickly. In fact, in the most recent NBA season, Power Forwards attempted and made almost as many threes as Point Guards and Small Forwards.
They also proved that they deserved this many attempts, as their percentages were on par with those of other positions. Centers have been shooting considerably fewer threes, but when they do, their percentages have looked very good recently. Players like KAT, Jokic, and Embiid all spend much of their off-season practicing shooting from beyond the arc.
Players
The project also examines the most accurate and trigger-happy shooters in a given year. Users can select a year from 1980 to 2018, and a graph will appear showing the top 20 shooters by number of attempts OR number of makes, along with their percentages. For example, the 20 players who attempted the most threes in 2016 are shown as follows:
The top-right corner once again represents where a player wishes to find himself. The x-axis shows the percentage (accuracy) of the player's three-pointers, and the y-axis shows the player's number of attempts per game. Steph Curry attempted 11.2 threes per game, making 45.4% of them. Klay Thompson shot the second-best percentage, even though he attempted relatively fewer threes. Kobe finds himself alone at the bottom. The Lakers were very tolerant in his farewell year, allowing him to shoot very inefficiently from beyond the arc at high volume.
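The selection behind such a chart can be sketched in a few lines: compute attempts per game and accuracy for each player, sort by attempts per game, and keep the top N. The Python below is an illustration under assumed field names (`games`, `threes_made`, `threes_attempted`) and made-up players. It is not the project's R code.

```python
def top_shooters(players, n=20):
    """Return the n players with the most three-point attempts per game.

    Each returned row carries the player's attempts per game (y-axis)
    and three-point percentage (x-axis).
    """
    rows = []
    for p in players:
        # Skip players who did not play or never attempted a three.
        if p["games"] == 0 or p["threes_attempted"] == 0:
            continue
        rows.append({
            "name": p["name"],
            "attempts_per_game": p["threes_attempted"] / p["games"],
            "pct": p["threes_made"] / p["threes_attempted"],
        })
    rows.sort(key=lambda r: r["attempts_per_game"], reverse=True)
    return rows[:n]


# Made-up players for illustration only.
example_players = [
    {"name": "Player A", "games": 80, "threes_made": 320, "threes_attempted": 800},
    {"name": "Player B", "games": 50, "threes_made": 90, "threes_attempted": 300},
    {"name": "Player C", "games": 70, "threes_made": 252, "threes_attempted": 560},
]
top_two = top_shooters(example_players, n=2)
for row in top_two:
    print(row["name"], round(row["attempts_per_game"], 1), f"{row['pct']:.1%}")
```

Sorting by made threes instead of attempts would only require changing the sort key.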
The project also examines how the dominant strategy of the game has changed over the years. The following graph shows the shots taken by players who started more than 41 games, that is, more than half of the games, in a given season. They can be regarded as the more important players in the league at the time, so the strategies they adopt likely represent an overall shift in how basketball is played. The graph plots the ratio of three-point attempts to two-point attempts over the years.
Three Point to Two Point Ratio
It is exceedingly clear that the ratio of three-point attempts to two-point attempts per game has shifted upwards, at an increasing pace. In the late 80s and early 90s, the best players in the league shot over 10 two-pointers for every three they shot. Today, the best players shoot close to 6 threes for every 10 two-point shots.
The attempts line is closely correlated with the made line, as we would expect. However, as the two lines trend up, they begin to diverge. This is an important observation. The implication is that in the old days, only players who were very good at shooting threes would shoot them. Today, shooting threes is more of a dominant strategy, and many mediocre three-point shooters are also shooting a large volume of threes.
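The attempt ratio described above can be sketched as follows in Python. Two assumptions are worth flagging: the write-up does not say whether the ratio pools all qualifying players' shots or averages per-player ratios, so this sketch pools them; and the field names and sample rows are made up for illustration. The project's own implementation is in R.

```python
from collections import defaultdict


def three_to_two_ratio(rows, min_starts=41):
    """Return season -> (three-point attempts / two-point attempts).

    Only players who started more than min_starts games are counted,
    and their attempts are pooled within each season.
    """
    totals = defaultdict(lambda: [0, 0])  # season -> [3PA, 2PA]
    for r in rows:
        if r["games_started"] > min_starts:
            totals[r["season"]][0] += r["three_att"]
            totals[r["season"]][1] += r["two_att"]
    return {
        season: three_att / two_att
        for season, (three_att, two_att) in sorted(totals.items())
        if two_att > 0
    }


# Made-up player seasons for illustration only.
example_rows = [
    {"season": 1990, "games_started": 82, "three_att": 100, "two_att": 1000},
    {"season": 1990, "games_started": 10, "three_att": 500, "two_att": 100},  # bench player, excluded
    {"season": 2018, "games_started": 70, "three_att": 600, "two_att": 1000},
]
ratios = three_to_two_ratio(example_rows)
print(ratios)
```

Applying the same function to made shots instead of attempts would give the second line on the graph.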
Since we have established that the game has changed in terms of strategy, and we have also learned that different positions have different roles and playing styles, let's investigate whether the popularity of the three-point shot has benefited some positions more than others. It is easy to hypothesize that guards and small forwards benefited the most from the shift in playing style.
There are two ways that a player could benefit. The obvious one is more points. We filter the top 20% of scorers in each season since the NBA adopted the three-point shot. The graph shows each of the five positions' share of that group: how many top scorers are guards, and how many are Centers?
Another way to benefit is through playing time. If a player logs heavy minutes, he likely makes more money. With this in mind, we filter the top 20% of players by minutes played in each season since 1980 and see whether the positional make-up has changed dramatically.
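Both filters follow the same recipe: within each season, rank players by a metric, keep the top 20%, and compute each position's share of that group. The Python sketch below illustrates the recipe. The field names, the rounding-down of the cutoff, and the ten sample players are assumptions made for illustration; this is not the project's R code.

```python
from collections import Counter, defaultdict


def position_share_of_top(rows, metric, top_fraction=0.2):
    """Return season -> {position: share of the top players by metric}."""
    by_season = defaultdict(list)
    for r in rows:
        by_season[r["season"]].append(r)

    result = {}
    for season, players in by_season.items():
        ranked = sorted(players, key=lambda r: r[metric], reverse=True)
        # Size of the top group, rounded down but never smaller than one player.
        k = max(1, int(len(ranked) * top_fraction))
        counts = Counter(p["position"] for p in ranked[:k])
        result[season] = {pos: n / k for pos, n in counts.items()}
    return result


# Ten made-up players in one season, for illustration only.
positions = ["PG", "SG", "SF", "PF", "C", "PG", "SG", "SF", "PF", "C"]
points = [30, 28, 20, 19, 18, 15, 14, 12, 10, 8]
example_rows = [
    {"season": 2018, "position": pos, "points": pts}
    for pos, pts in zip(positions, points)
]
top_scorer_shares = position_share_of_top(example_rows, metric="points")
print(top_scorer_shares)
```

Passing a minutes column as `metric` gives the playing-time version of the same breakdown.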
Minutes Per Game
Analysis
Firstly, the two graphs look very similar. This is expected: a player who plays a lot of minutes in the most competitive league tends to score more points.
Secondly, the trend that jumps out is the popularity of Point Guards. This is closely tied to the use of the three-pointer. Basketball has traditionally required height and strength, two weaknesses of stereotypical PGs, but shooting threes requires neither. All it requires is to use the pick-and-roll correctly and bomb from 24 feet or further. In this environment, Point Guards have thrived.
Thirdly, Power Forwards have held up consistently well in the league. Centers all but disappeared for a while in the mid-2000s but have recently made a resurgence. True basketball fans know, however, that the definition of a Center has changed. Centers today play more like the Power Forwards of the old days. This has much to do with the three-point line as well. (This is illustrated on the Position page.)
Conclusion
Funnily enough, in both representations, we see a "death" of the Small Forward. In terms of points scored, the one-time favorite position of the 1990s has declined dramatically and now ranks dead last among all five positions. In terms of minutes per game, Small Forwards' representation also shrank, but to a much lesser extent. Fully understanding the reason for this may require another analysis.
However, as someone who has watched the game religiously since the age of 6, I would like to offer an explanation. In the old days of the NBA, wings (also known as Small Forwards), representing a balance between mobility and strength, played a necessary role on the defensive end of the floor. And since the three-point shot was not the prevalent strategy, Small Forwards scored more points.
Now that the game has moved further from the rim and, as we have seen, Point Guards make up more of the best scorers, Small Forwards have fewer chances to score. In terms of minutes played, however, teams still need and value Small Forwards, because their mobility and length allow them to defend the likes of Steph Curry, Harden, or Durant.
Wishful Improvements
I believe the project tells a very in-depth and complete story of the three-point shot.
However, I would love to improve the understanding and analysis with more advanced statistics.
For example:
- Did the three-point shot change the types of two-point shots taken, e.g., dunks and mid-range jump shots?
- Did the three-point shot change the salary structure of a team? Who makes more money: three-point shooters or two-point shooters?
- Did the three-point shot change how defense is played in the NBA?