NFL Dashboard: Play-by-Play Data into Actionable Insights
The NFL-Dashboard Shiny App || The Author's GitHub
THE NFL's "GAME WITHIN THE GAME"
On the heels of the 2019 NFL Draft, the popularity of the National Football League seems to know no bounds. Audiences continue to consume NFL content in absurd numbers, and many fans find tremendous enjoyment in predicting their favorite team's scores and stats. For decades, sports speculators have been placing bets in Las Vegas or Atlantic City, and a 2018 Supreme Court ruling will likely pave the way for nationwide sports gambling legalization.
Currently, however, the most popular method of football prediction is fantasy football, the NFL's "game within the game." Here, participants compete to set the best lineup of NFL stars, and each lineup is scored on its chosen players' on-field productivity by counting stats such as catches, yards, and touchdowns.
INFORMATION OVERLOAD
With the advent of "Daily Fantasy Sports" in the 2010s, the ability to correctly predict NFL statistics became even more lucrative. As online venues began featuring tournaments with thousands of participants competing for millions of dollars in cash prizes, bettors found themselves entrenched in an information "war": even when predicting a game as chaotic and random as football, the competitors with the most, and best, data often won.
As the number of competitors in these tournaments increased, so did the number of businesses catering to them. Dozens of premium recommendation services and Fantasy Sports Gurus flooded the market, adding a layer of noise that participants needed to sift through before finding actionable information.
But through that noise also emerged a litany of legitimate research. Savvy analysts began employing statistically rigorous methods to derive metrics that helped predict NFL success on a week-to-week scale. Fantasy analytics sites such as RotoViz.com and PlayerProfiler.com gained a cult following of "NFL nerds" who found solace in understanding the underlying principles of standout NFL performance.
Each miniature revelation flew in the face of the tape-grinders' (as film-focused football scouts lovingly refer to themselves) longstanding notion that the best way to determine future NFL production was to study vast amounts of player film for nuanced, domain and situation-specific proficiencies, called 'talent,' that couldn't possibly be explained or measured.
A SIMPLE SOLUTION
In a reactionary move, a class of "metrics heads" has emerged on platforms like Twitter, where users fiend for more and more information without questioning the data's accuracy or efficacy. In just the last 36 months, a slew of new, data-driven, interactive football content has re-complicated the NFL analysis sphere.
But splashy has replaced simple. Superfluous statistics abound. I decided that what was most needed was a way to help competitors cut through the noise, so I built a tool of my own: one focused solely on the most predictive metrics for teams and players, with the goal of helping myself and others improve our decision making in NFL speculation games and fantasy football.
THE DATA
I used data from the nflscrapR-Data repository on GitHub. This is a reformatted version of the play-by-play stats publicly available on NFL.com from 2009 through 2018. It includes over 250 unique variables per play from the past 2,560 regular season games, amounting to over 50 million observations related to the last decade of NFL regular season play.
While the dataset predictably includes myriad game-state and play-result variables, it also, most interestingly, includes the yards the ball travelled in the air (referred to as Air Yards), as well as each team's Expected Points and Win Probability[1] at that particular moment in the game. Related to these final two metrics, the data includes Win Probability Added and Expected Points Added, which denote the change in each team's respective Win Probability and Expected Points resulting from the play.
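To make the "Added" metrics concrete, here is a minimal Python sketch (the app itself is written in R, and this is not its code). It assumes the simplest reading of the definitions above: each metric is the value after the play minus the value before it, from the offense's point of view. The `PlayState` class, the function name, and the numbers are hypothetical, and the real dataset handles details such as scoring plays and changes of possession that this sketch ignores.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PlayState:
    """Hypothetical snapshot of the offense's outlook at one moment."""
    expected_points: float   # Expected Points for the team with the ball
    win_probability: float   # Win Probability for the same team, 0.0 to 1.0


def value_added(before: PlayState, after: PlayState) -> Tuple[float, float]:
    """Return (EPA, WPA) as simple after-minus-before differences."""
    epa = after.expected_points - before.expected_points
    wpa = after.win_probability - before.win_probability
    return epa, wpa


# Made-up example: a long completion improves both outlooks.
before = PlayState(expected_points=1.2, win_probability=0.55)
after = PlayState(expected_points=3.0, win_probability=0.61)
epa, wpa = value_added(before, after)
print(f"EPA: {epa:+.2f}, WPA: {wpa:+.2f}")
```

A positive EPA or WPA means the play left the offense better off than before the snap; a negative value means it left them worse off.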
Finally, the dataset includes a roster dataset, limited to teams' quarterbacks (QB) and the three football "skill positions": running backs (RB), wide receivers (WR), and tight ends (TE). These four positions are often collectively referred to as the fantasy-relevant positions, as they are the only positions used in fantasy football. The limitation of the roster data indicated that fantasy football might be a great first use case for this information.
THE APP
NFL-Dashboard
Using R and the Shiny package, I built an app with the goal of a) distilling play-by-play data into actionable player-level and team-level insights, and b) serving as a contextual companion when re-watching or studying a game. For the included player-level research, I relied heavily on previous research and domain knowledge[2], so that the player-level aggregations would focus only on advanced metrics that have been shown to have more predictive power than raw productivity stats (such as Yards Gained, Touchdowns, Catches, etc.).
Additionally, because there are often situations where NFL players are either not playing or not available due to competition-specific restrictions, I recognized the necessity of filtering players from all comparative analysis pages. These filters are constantly available in the sidebar.
Game Rewind
The Game Rewind page is most helpful when used alongside footage of an NFL game. Users can choose any team's game dating back to Week 1 of the 2009 regular season, and easily visualize the most meaningful plays in the game's outcome by observing each team's change in Win Probability as the game moves towards its conclusion. By hovering the mouse over the graph at any moment on the timeline, users can also read a description of the play's result.
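The same idea can be expressed without a chart. The Python sketch below (not the app's R code) ranks a game's plays by the absolute size of their Win Probability Added, which is one reasonable way to surface the "most meaningful" plays. The field names `desc` and `wpa` and every value in the sample are assumptions made for illustration.

```python
def biggest_swings(plays, top_n=3):
    """Return the top_n plays ranked by absolute Win Probability Added."""
    return sorted(plays, key=lambda play: abs(play["wpa"]), reverse=True)[:top_n]


# Made-up plays; wpa is from the perspective of the team with the ball.
plays = [
    {"desc": "Kickoff returned to the 25", "wpa": 0.01},
    {"desc": "Interception returned for a touchdown", "wpa": -0.22},
    {"desc": "45-yard completion", "wpa": 0.12},
    {"desc": "Run for no gain", "wpa": -0.02},
]

for play in biggest_swings(plays, top_n=2):
    print(f'{play["wpa"]:+.2f}  {play["desc"]}')
```

Sorting on the absolute value keeps disastrous plays for the offense in the list alongside the big gains, since both move the outcome.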
Leaguewide Trend Explorer
In the Leaguewide Trend Explorer, the user can perform basic league-level analysis, including visualizing the impact that different play types have on a team's Win Probability and Expected Points. It includes additional graphs showing how effective the league as a whole has been at running vs. passing over the user-defined timeframe.

Observing the distribution of Expected Points Added for Runs versus Passes over the last 8 weeks of the 2018 season, the average passing play has seen more volatile results, but carried more upside than the typical running play. If you want exactly 0 Expected Points, though, running the ball is a fantastic choice.
Team Efficiency Analysis
On the Team Efficiency page, users observe team-level per-play efficiency over a chosen number of past weeks. The graphs default to the most recent half-season, and include multiple metrics that illustrate how each team has fared in both efficiency accumulated (by their offense) and efficiency allowed (by their defense).
The metrics include the aforementioned Expected Points Added (EPA) and Win Probability Added (WPA), a variant of Yards per Attempt (AYA), and a pair of Air Yards-based efficiency metrics popularized by FiveThirtyEight's Josh Hermsmeyer: Passing Air Conversion Ratio (PACR) and its variant, aPACR. PACR measures how efficiently a yard thrown in the air is converted into yards gained; aPACR additionally applies specific multipliers to especially positive or negative outcomes.[3]

Users can view the movement of a team's efficiency from week to week, or play an animation of all the weeks in a given timeframe.
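As a rough illustration of the team-level aggregation, the Python sketch below (again, not the app's R code) computes PACR per offense over a window of weeks as total passing yards gained divided by total air yards. The field names mirror ones I believe the play-by-play data uses (`posteam`, `play_type`, `yards_gained`, `air_yards`), but treat them, the `week` field, and all sample values as assumptions. aPACR is left out because its multipliers are not spelled out here.

```python
from collections import defaultdict


def team_pacr(plays, first_week, last_week):
    """PACR per offense: passing yards gained / air yards, within a week window."""
    totals = defaultdict(lambda: [0.0, 0.0])  # team -> [yards gained, air yards]
    for play in plays:
        in_window = first_week <= play["week"] <= last_week
        if play["play_type"] != "pass" or not in_window:
            continue
        totals[play["posteam"]][0] += play["yards_gained"]
        totals[play["posteam"]][1] += play["air_yards"]
    # Skip teams with no positive air yards to avoid dividing by zero.
    return {team: yards / air for team, (yards, air) in totals.items() if air > 0}


# Made-up sample: one run play and one out-of-window pass are ignored.
plays = [
    {"posteam": "KC", "week": 10, "play_type": "pass", "yards_gained": 20, "air_yards": 15},
    {"posteam": "KC", "week": 11, "play_type": "pass", "yards_gained": 0, "air_yards": 25},
    {"posteam": "KC", "week": 12, "play_type": "pass", "yards_gained": 30, "air_yards": 10},
    {"posteam": "KC", "week": 12, "play_type": "run", "yards_gained": 7, "air_yards": 0},
    {"posteam": "NYJ", "week": 10, "play_type": "pass", "yards_gained": 8, "air_yards": 10},
    {"posteam": "NYJ", "week": 11, "play_type": "pass", "yards_gained": 0, "air_yards": 30},
    {"posteam": "NYJ", "week": 3, "play_type": "pass", "yards_gained": 60, "air_yards": 40},
]

pacr_by_team = team_pacr(plays, first_week=10, last_week=12)
print(pacr_by_team)
```

Swapping `posteam` for the defending team's field would give the "efficiency allowed" view of the same metric.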
Quarterback Analysis
The first of two player-level analysis pages focuses strictly on quarterbacks (QB) and purposefully ignores raw opportunity.[4] With the exception of the "Total Yards" tab, the entire focus of the Quarterback Analysis page is on per-play efficiency, rather than opportunity or raw production. The same five metrics as the Team Efficiency page (EPA, WPA, PACR, aPACR, and AYA) are once again available to the user, as is the ability to change the number of past weeks over which the data is aggregated.

Unsurprisingly, the player who led all QBs in EPA-based efficiency was also voted the NFL's Most Valuable Player, receiving 41 of 50 first-place votes. The other nine... went to Drew Brees, second on this list.
Skill Position Analysis
Unlike the Quarterback Analysis page, the Skill Position Analysis page provides an additional tab: Opportunity. Only after determining the value of a skill player's opportunity can their efficiency be properly contextualized when predicting productivity.[5] The most important overall statistic for skill players is the percentage of team plays in which they are chosen to receive the ball:

Viewing the players who've received the highest percent of team opportunities over a user-defined timeframe. The league's top running backs often dominate this category.
As with the previous efficiency tabs, the opportunity metrics are those that have been proven more predictive than raw counting stats (such as rushes, pass targets, or catches), and are presented as percentages of the team's overall opportunities:
Percentage of Team Total Opportunities, of Team Rushes, of Team Passes, and of Team Air Yards, plus a variant (also developed by Mr. Hermsmeyer), Weighted Opportunity Rating (WOPR), an ML-derived, weighted combination of a player's percentage of team targets and percentage of team air yards. In the Efficiency tab, the now-familiar metrics are once again available.[6]

Users can see the players with the most opportunity, and those who've been most efficient with their opportunity, using a variety of metrics.
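For readers who want to see the opportunity metrics as arithmetic, here is a small Python sketch (not the app's R code). The share calculations follow directly from the descriptions above. The WOPR weights of 1.5 and 0.7 are the commonly published values for the metric as I understand them, not something stated in this post, so treat them as an assumption; all player and team totals are made up.

```python
def share(player_total, team_total):
    """Fraction of a team total that belongs to one player."""
    return player_total / team_total if team_total else 0.0


def wopr(target_share, air_yards_share, w_targets=1.5, w_air=0.7):
    """Weighted Opportunity Rating; the default weights are assumed, see above."""
    return w_targets * target_share + w_air * air_yards_share


# Made-up totals for one receiver and his team over some window of weeks.
player = {"rushes": 0, "targets": 10, "air_yards": 160}
team = {"rushes": 25, "targets": 40, "air_yards": 400}

opportunity_share = share(player["rushes"] + player["targets"],
                          team["rushes"] + team["targets"])
target_share = share(player["targets"], team["targets"])
air_yards_share = share(player["air_yards"], team["air_yards"])
player_wopr = wopr(target_share, air_yards_share)

print(f"Opportunity share: {opportunity_share:.1%}")
print(f"Target share: {target_share:.1%}, air yards share: {air_yards_share:.1%}")
print(f"WOPR: {player_wopr:.3f}")
```

Because every input is a share of the team's total, players on high-volume and low-volume offenses can be compared on the same scale.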
The Individual Tab
Within both the Quarterback Analysis and Skill Position Analysis pages are additional tabs labelled "Individual." On this tab, users can observe a player's efficiency on a per-play basis, as well as a trendline of that player's efficiency compared to league average. Hovering the mouse over each plot provides game information and the description for that particular play. This can be extremely helpful in determining whether there are certain areas of the field, farther from or closer to the play's origin (called the line of scrimmage), where that player is particularly successful.

Users can observe a quarterback or skill player's play-level efficiency vs. league average in a variety of metrics.
GOING FORWARD
I'm thrilled to be releasing version 1.0 of NFL-Dashboard to the public, but that's exactly what this tool is: a first pass at aggregating this data effectively. In the future I'd love to add much more data and functionality, while maintaining an interface simple enough to consistently yield actionable insight.
- The most valuable information we could add to this dataset would be real-time relative athleticism details. While players are individually, publicly, evaluated for athletic ability prior to entering the league, the data gathered in real-time from accelerometers inside the ball and player pads would massively boost the viability of this dataset.
- That information (which includes player positioning, speed, acceleration, and directional detail on a second-by-second basis) is proprietary, and only available to NFL teams. The addition of data of this size would require a dedicated server for the app's information.
- Each team's play speed and decision-making vary drastically depending on how many times they believe they must score to avoid losing the game. Looking at teams in terms of possession differential could help determine clutch players or teams, or others that are opportunistic only when the outcome of the game has been determined.
- More realistic than the proprietary chip-based data is the inclusion of coach-level and "scheme"-level details related to each team, perhaps from a site like ProFootballReference.com. Despite current NFL schemes carrying somewhat vague names like "West Coast" or "Air Raid," each team adheres to a certain set of underlying principles on both offense and defense that they believe optimize their chances of success.[7]
- Delineating the differences in these core strategies could help determine which teams and coaches are more or less "predictable" in their play calls, and whether that predictability has an effect on play and game outcomes.
- Being able to separate offensive line efficiency or deficiency from a player's ability could drastically improve insights once again.[8] A well-respected site, FootballOutsiders.com, posts a slew of game-level Offensive Line metrics that could be helpful, even in the aggregate. ESPN Analytics is making strides in play-level blocking efficiency, creating a metric called Block Win Rate (BWR), which could be a great addition.
- At its core, this version of NFL-Dashboard is an intelligent graphing tool, when ultimately I'd like the tool to do more recommending than graphing. The goal of future versions will be to create competition-specific optimizers that allow users to make direct decisions based on the insights they feel are most valuable.
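The possession differential idea from the list above could be prototyped along these lines. This Python sketch is only one hypothetical way to define it, not a planned implementation: it assumes a single possession can be worth at most 8 points (a touchdown plus a two-point conversion) and counts how many such possessions a trailing team needs to at least tie the game. The function name and the 8-point cap are my assumptions.

```python
import math


def possessions_to_tie(team_score, opponent_score, max_points_per_possession=8):
    """Scoring possessions a team needs to at least tie; 0 if tied or ahead."""
    deficit = opponent_score - team_score
    if deficit <= 0:
        return 0
    return math.ceil(deficit / max_points_per_possession)


# Made-up scorelines showing where the possession count changes.
for team_score, opponent_score in [(21, 14), (14, 21), (13, 21), (12, 21), (3, 20)]:
    needed = possessions_to_tie(team_score, opponent_score)
    print(f"{team_score}-{opponent_score}: {needed} possession(s) needed")
```

Grouping plays by this number, rather than by the raw score margin, would let efficiency be compared between one-score and multi-score situations.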
THANKS!
I can't thank you enough for taking the time to check out this project. It was an incredibly rewarding experience to try to wrangle this much information into a useful, valuable tool. If there is anything you want more information on, or if you see a big error (hopefully none of those!), don't hesitate to reach out! You can also check out the source code for the application at my GitHub page.
Footnotes:
1. Expected Points and Win Probability, along with EPA and WPA, have been calculated in myriad ways for the NFL over the years. The versions of the two metrics that are used in this dataset are further explained in nflWAR: A Reproducible Method for Offensive Player Evaluation in Football.
2. It's impossible to list every website that provided invaluable, reproducible research related to predicting football production, but an incomplete list would certainly include: RotoViz.com, ProFootballFocus.com, FootballOutsiders.com, PlayerProfiler.com, FantasyFootballAnalytics.net, PredictiveFootball.com, and the sadly defunct (since ESPN hired its author) AdvancedFootballAnalytics.com.
3. In 2017, Mr. Hermsmeyer wrote a phenomenal article explaining PACR/RACR and WOPR in detail, but unfortunately Rotoworld.com, the popular fantasy football site that previously hosted the article, tragically destroyed all archived articles in a site overhaul. Mr. Hermsmeyer's personal football information site, AirYards.com, provides additional detail relating to these metrics, though it is less comprehensive than the aforementioned Rotoworld article; RIP.
4. Because of the nature of the position, quarterbacks, along with a team's coach and play caller, have the luxury (or added challenge, depending on how you view it) of choosing the appropriate means of distributing the football on each play. They determine whether to hand the ball off to a runner, tuck it away and run themselves, or, should they pass, who the most open receiver is and when to release the ball. As such, efficiency remains the best measure of a quarterback's underlying decision-making. The linked post's author, Ben Baldwin, is a frequent contributor to The Athletic, and has done excellent research with this same play-by-play dataset.
5. Skill players need to be measured first by their opportunity, then by efficiency. It takes a certain level of ability to a) be chosen by the coaches to be an active, playing member of the team for that play, and then b) earn *additional* trust, from both coach and quarterback, to be selected as the optimal means of distributing the ball. In short, skill position players *do not* choose their own opportunity, so opportunity in itself is, at some level, a measure of skill and talent.
6. Passing Air Conversion Ratio is renamed Receiver Air Conversion Ratio (RACR, and its variant aRACR) throughout the Skill Position Analysis page.
7. Further complicating the issue, all NFL schemes are hybrids of multiple schemes from the annals of NFL, college, and high school football history. On each play, offensive coaches determine whether to run or pass, how many of the five skill position players will be running backs (RB) vs. wide receivers (WR) vs. tight ends (TE), and what combination of routes they should run. Defensive coaches, for their part, make situation-specific personnel and strategic decisions. They decide the proper balance of strength vs. speed, and determine how many players rush the quarterback vs. how many stay back and cover a receiver. They decide whether to employ a zone defense or man-to-man coverage against eligible receivers, and finally, choose whether one or more defensive backs will leave the deep middle of the field open or covered from the snap.
8. The offensive line's goal is vital: allow the quarterback ample time to optimally distribute the ball, and then, if the ball is distributed to a rusher rather than a receiver, continue to maintain the block for several seconds, or block a new player downfield. Its performance is notoriously difficult to measure on a per-play basis, since the 5-7 blockers are attempting to operate as a unit to clear space for a quarterback or ball-carrier.