NYC Data Science Academy| Blog

NFL Dashboard: Play-by-Play Data into Actionable Insights

Matt Savoca
Posted on Apr 28, 2019

The NFL-Dashboard Shiny App || The Author's GitHub

THE NFL’s “GAME WITHIN THE GAME”

On the heels of the 2019 NFL Draft, the popularity of the National Football League seems to know no bounds. Audiences continue to consume NFL content in absurd numbers, and many fans find tremendous enjoyment in predicting their favorite team's scores and stats. For decades, sports speculators have placed bets in Las Vegas or Atlantic City, and a 2018 Supreme Court ruling will likely pave the way for nationwide legalization of sports gambling.

Currently, however, the most popular method of football prediction is fantasy football, the NFL’s “game within the game.” Here, participants compete to set the best lineup of NFL stars, and each lineup is scored on its players’ on-field productivity using counting stats such as catches, yards, and touchdowns.

INFORMATION OVERLOAD

With the advent of “Daily Fantasy Sports” in the 2010s, the ability to correctly predict NFL statistics became even more lucrative. As online venues began featuring tournaments with thousands of participants competing for millions of dollars in cash prizes, bettors found themselves entrenched in an information "war": even when predicting a game as chaotic and random as football, the competitors with the most, and best, data often won.

As the number of competitors in these tournaments increased, so did the number of businesses catering to them. Dozens of premium recommendation services and fantasy sports gurus flooded the market, adding a layer of noise that participants needed to sift through before finding actionable information.

But through that noise also emerged a litany of legitimate research. Savvy analysts began employing statistically rigorous methods to derive metrics that helped predict NFL success on a week-to-week scale. Fantasy analytics sites such as RotoViz.com and PlayerProfiler.com gained a cult following of “NFL nerds” who found solace in understanding the underlying principles of standout NFL performance.

Each miniature revelation flew in the face of the tape-grinders' (as film-focused football scouts lovingly refer to themselves) longstanding notion that the best way to determine future NFL production was to study vast amounts of player film for nuanced, domain- and situation-specific proficiencies, called 'talent,' that couldn't possibly be explained or measured.

 

A SIMPLE SOLUTION

In a reactionary move, a class of "metrics heads" has emerged on platforms like Twitter, where users fiend for more and more information without questioning the data's accuracy or efficacy. In just the last 36 months, a slew of new, data-driven, interactive football content has once again complicated the NFL analysis sphere.

But splashy has replaced simple. Superfluous statistics abound. I decided that what was most needed was a way to help competitors cut through the noise, so I built a tool of my own, one focused solely on the most predictive metrics for teams and players, with the goal of helping myself and others make better decisions in NFL speculation games and fantasy football.

THE DATA

I used data from the nflscrapR-Data repository on GitHub, a reformatted version of the play-by-play stats publicly available on NFL.com from 2009 through 2018. It includes over 250 unique variables per play from the past 2,560 regular season games, amounting to over 50 million observations from the last decade of NFL regulation play.

While the dataset expectedly includes myriad game-state and play-result variables, most interestingly it also includes the yards the ball travelled in the air (referred to as Air Yards), as well as each team’s Expected Points and Win Probability[1] at that particular moment in the game. Related to these final two metrics, the data includes Win Probability Added and Expected Points Added, which denote the change in each team’s respective Win Probability and Expected Points produced by the play’s result.
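To make the two "Added" metrics concrete, here is a minimal Python sketch (the app itself is written in R). The field names `ep_before`, `ep_after`, `wp_before`, and `wp_after` are hypothetical, not nflscrapR column names, and the numbers are invented. It also assumes the same team has the ball before and after the play; a real calculation has to account for changes of possession.

```python
from dataclasses import dataclass


@dataclass
class Play:
    desc: str
    ep_before: float  # offense's Expected Points before the snap (hypothetical field)
    ep_after: float   # offense's Expected Points after the play (hypothetical field)
    wp_before: float  # offense's Win Probability before the snap (hypothetical field)
    wp_after: float   # offense's Win Probability after the play (hypothetical field)


def epa(play: Play) -> float:
    """Expected Points Added: change in Expected Points caused by the play."""
    return play.ep_after - play.ep_before


def wpa(play: Play) -> float:
    """Win Probability Added: change in Win Probability caused by the play."""
    return play.wp_after - play.wp_before


# Invented example values, for illustration only.
play = Play("Pass complete for 20 yards", 1.5, 3.0, 0.50, 0.56)
print(round(epa(play), 2), round(wpa(play), 2))
```

A positive value means the play improved the offense's position; a negative value means it hurt.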

Finally, the dataset includes a roster dataset, limited to teams’ quarterbacks (QB) and the three football “skill positions”: running backs (RB), wide receivers (WR), and tight ends (TE). These four positions are often collectively referred to as the fantasy-relevant positions, as they are the only positions used in fantasy football. The limitation of the roster data indicated that fantasy football might be a great first use case for this information.

THE APP

NFL-Dashboard

Using R and the Shiny package, I built an app with the goal of a) distilling play-by-play data into actionable player-level and team-level insights, and b) serving as a contextual companion when re-watching or studying a game. For the player-level analysis, I relied heavily on previous research and domain knowledge[2], so that the player-level aggregations would focus only on advanced metrics that have been shown to have more predictive power than raw productivity stats (such as Yards Gained, Touchdowns, Catches, etc.).

Additionally, because there are often situations where NFL players are either not playing or not available due to competition-specific restrictions, I recognized the necessity of filtering players from all comparative analysis pages. These filters are constantly available in the sidebar.

Game Rewind

The Game Rewind page displays each team's Win Probability at each moment of the game.

The Game Rewind page is most helpful when used alongside footage of an NFL game. Users can choose any team’s game dating back to Week 1 of the 2009 regular season, and easily visualize the most meaningful plays in the game’s outcome by observing each team’s change in Win Probability as the game moves towards its conclusion. By hovering the mouse over the graph at any moment on the timeline, users can also read a description of the play’s result.
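One way to think about "most meaningful plays" is as the plays with the largest swing in Win Probability. The Python sketch below ranks plays by the absolute size of that swing. It illustrates the idea rather than reproducing the app's R code, and the `desc` and `wpa` keys and the sample plays are invented for the example.

```python
def biggest_swings(plays, n=3):
    """Return the n plays with the largest absolute change in win probability."""
    return sorted(plays, key=lambda p: abs(p["wpa"]), reverse=True)[:n]


# Invented plays; "wpa" is the change in the home team's win probability.
game = [
    {"desc": "Run for 3 yards", "wpa": 0.01},
    {"desc": "Interception returned for touchdown", "wpa": -0.22},
    {"desc": "Pass complete for 45 yards", "wpa": 0.12},
    {"desc": "Punt", "wpa": -0.02},
]

for play in biggest_swings(game, n=2):
    print(f'{play["wpa"]:+.2f}  {play["desc"]}')
```

Sorting on the absolute value matters here: a play that costs a team 22 points of win probability is as important to the story of the game as one that gains it.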

Leaguewide Trend Explorer

 

In the Leaguewide Trend Explorer, the user can perform basic league-level analysis, including visualizing the impact that different play types have on a team’s Win Probability and Expected Points. It includes additional graphs showing how effective the league as a whole has been at running vs. passing over the user-defined timeframe.

 

Observing the distribution of Expected Points Added for Runs versus Passes over the last 8 weeks of the 2018 season, the average passing play has seen more volatile results, but carried more upside than the typical running play. If you want exactly 0 Expected Points, though, running the ball is a fantastic choice.

Punting is... not a great decision if you like scoring points.
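The run-versus-pass comparison above boils down to grouping plays by type and comparing the center and spread of their Expected Points Added. A minimal Python sketch of that grouping follows. The numbers are invented purely to show the mechanics and are not taken from the dataset, and the app's own charts are built in R.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_epa(plays):
    """Group (play_type, epa) pairs and report count, mean, and spread per type."""
    by_type = defaultdict(list)
    for play_type, epa in plays:
        by_type[play_type].append(epa)
    return {
        play_type: {"n": len(values), "mean": mean(values), "sd": pstdev(values)}
        for play_type, values in by_type.items()
    }


# Invented values, for illustration only.
toy_plays = [
    ("run", 0.0), ("run", 0.5), ("run", -0.5),
    ("pass", 2.0), ("pass", -1.5), ("pass", 1.0),
]

summary = summarize_epa(toy_plays)
for play_type, stats in summary.items():
    print(play_type, stats["n"], round(stats["mean"], 2), round(stats["sd"], 2))
```

A larger standard deviation corresponds to the "more volatile" results described for passing plays, while a higher mean corresponds to more upside on average.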

 

Team Efficiency Analysis

On the Team Efficiency page, users observe team-level per-play efficiency over a chosen number of past weeks. The graphs default to the most recent half-season, and include multiple metrics that illustrate how each team has fared in both efficiency accumulated (by its offense) and efficiency allowed (by its defense).

The metrics include the aforementioned Expected Points Added (EPA) and Win Probability Added (WPA), a variant of Yards per Attempt (AYA), and a duo of Air Yards-based efficiency metrics popularized by FiveThirtyEight’s Josh Hermsmeyer: Passing Air Conversion Ratio (PACR), which measures how often a yard thrown in the air is converted into yards gained, and its variant, aPACR, which applies specific multipliers to especially positive or negative outcomes.[3]
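Read literally, an air conversion ratio is yards gained divided by air yards thrown. The Python sketch below assumes that reading. It does not attempt aPACR, because the multipliers are not given in this post, and the function name and inputs are hypothetical rather than taken from the app.

```python
from typing import Optional


def air_conversion_ratio(yards_gained: float, air_yards: float) -> Optional[float]:
    """Yards gained per air yard thrown; None when no air yards were thrown."""
    if air_yards == 0:
        return None  # ratio is undefined without any air yards
    return yards_gained / air_yards


# Invented totals: 300 passing yards on 250 air yards.
print(air_conversion_ratio(300, 250))
```

Under this reading, a value above 1 means the passing game gained more yards than it threw through the air (for example, through yards after the catch), and a value below 1 means air yards were left unconverted.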

 

Users can view the movement of a team's efficiency from week to week, or play an animation of all the weeks in a given timeframe.

 

Quarterback Analysis

The first of two player-level analysis pages focuses strictly on quarterbacks (QB) and purposefully ignores raw opportunity.[4] With the exception of the “Total Yards” tab, the Quarterback Analysis page focuses entirely on per-play efficiency rather than opportunity or raw production. The same five metrics as on the Team Efficiency page (EPA, WPA, PACR, aPACR, and AYA) are once again available to the user, as is the ability to change the number of past weeks over which the data is aggregated.

Unsurprisingly, the player who led all QBs in EPA-based efficiency was also voted the NFL's Most Valuable Player, receiving 41/50 1st place votes. The other nine... went to Drew Brees, second on this list.

 

Skill Position Analysis

Unlike the Quarterback Analysis page, the Skill Position Analysis page provides an additional tab: Opportunity. Only after determining the value of a skill player’s opportunity can their efficiency be properly contextualized when predicting productivity.[5] The most important overall statistic for skill players is the percentage of team plays in which they are chosen to receive the ball:

Viewing the players who've received the highest percent of team opportunities over a user-defined timeframe. The league's top running backs often dominate this category.
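The opportunity percentage described above can be sketched as a simple share: the number of a team's plays on which a player was handed or thrown the ball, divided by the team's total plays. The Python below is an illustration under that assumption, not the app's R code. The `team` and `player` keys, the team code, and the player labels are all hypothetical.

```python
from collections import Counter


def opportunity_share(plays, team):
    """Share of a team's plays on which each player was given the ball."""
    team_plays = [p for p in plays if p["team"] == team]
    if not team_plays:
        return {}
    # Plays with no intended ball recipient (e.g. a sack) count toward the
    # team total but toward no individual player.
    counts = Counter(p["player"] for p in team_plays if p["player"] is not None)
    return {player: n / len(team_plays) for player, n in counts.items()}


# Invented plays for a hypothetical team "AAA".
toy_plays = [
    {"team": "AAA", "player": "RB1"},
    {"team": "AAA", "player": "RB1"},
    {"team": "AAA", "player": "WR1"},
    {"team": "AAA", "player": None},
    {"team": "BBB", "player": "TE9"},
]

print(opportunity_share(toy_plays, "AAA"))
```

Expressing opportunity as a share of team plays, rather than as a raw count, makes players on fast-paced and slow-paced offenses comparable.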

 

As with previous efficiency tabs, the opportunity metrics are those that have been proven more predictive than raw counting stats (such as rushes, pass targets, or catches), and are presented as percentages of the team’s overall opportunities:

Percentage of Team Total Opportunities, of Team Rushes, of Team Passes, and of Team Air Yards, along with a variant (also developed by Mr. Hermsmeyer), Weighted Opportunity Rating (WOPR), an ML-derived, weighted combination of a player's percentage of team targets and percentage of team air yards. In the Efficiency tab, the now-familiar metrics are once again available.[6]
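As a rough illustration of a "weighted combination," the Python sketch below combines a player's target share and air yards share with two weights. The post does not state the weights; the defaults used here (1.5 and 0.7) are the values commonly cited for WOPR and should be treated as an assumption, as should the function and parameter names.

```python
def wopr(target_share, air_yards_share, w_targets=1.5, w_air_yards=0.7):
    """Weighted combination of target share and air yards share.

    The default weights are an assumption (commonly cited values),
    not something stated in this post.
    """
    return w_targets * target_share + w_air_yards * air_yards_share


# Invented player: 25% of team targets, 30% of team air yards.
print(round(wopr(0.25, 0.30), 3))
```

Because both inputs are shares of team totals, the result rewards players who command a large slice of the passing game regardless of how often their team throws.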

Users can see the players with the most opportunity, and those who've been most efficient with their opportunity, using a variety of metrics.

The Individual Tab

Within both the Quarterback Analysis and Skill Position Analysis pages are additional tabs labelled “Individual.” On this tab, users can observe a player’s efficiency on a per-play basis, as well as a trendline of that player’s efficiency compared to league average. Hovering the mouse over each plot provides game information and the description for that particular play. This can be extremely helpful in determining whether there are certain areas of the field, farther from or closer to the play’s origin (the line of scrimmage), where that player is particularly successful.

Users can observe a quarterback or skill player's play-level efficiency vs. league average in a variety of metrics.
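The post does not say how the trendline is fitted, so the Python sketch below uses a simple rolling mean as a stand-in and expresses it relative to a league-average value. The function names, the window size, and the numbers are assumptions made for illustration; the app itself is written in R.

```python
def rolling_mean(values, window):
    """Mean of the current value and up to window - 1 values before it."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def trend_vs_league(player_epa, league_avg, window=3):
    """Smoothed per-play efficiency, expressed as a gap to league average."""
    return [m - league_avg for m in rolling_mean(player_epa, window)]


# Invented per-play EPA values and an invented league average.
print(rolling_mean([1, 2, 3, 4], window=2))
print(trend_vs_league([0.5, 0.5], league_avg=0.25, window=2))
```

Positive values in the second output mean the player is running ahead of league average over the recent window.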

GOING FORWARD

I’m thrilled to be releasing version 1.0 of NFL-Dashboard to the public, but that’s exactly what this tool is: a first pass at aggregating this data effectively. In the future I’d love to add much more data and functionality, while maintaining an interface simple enough to consistently yield actionable insight.

  • The most valuable information we could add to this dataset would be real-time relative athleticism details. While players are individually and publicly evaluated for athletic ability prior to entering the league, the data gathered in real time from accelerometers inside the ball and player pads would massively boost the viability of this dataset.
  • That information (which includes player positioning, speed, acceleration, and directional detail on a second-by-second basis) is proprietary, and only available to NFL teams. The addition of data of this size would require a dedicated server for the app’s information.
  • Each team’s play speed and decision-making varies drastically depending on how many times they believe they must score to avoid losing the game. Looking at teams in terms of possession differential could help determine clutch players or teams, or others that are opportunistic only when the outcome of the game has been determined.
  • More realistic than the proprietary chip-based data is the inclusion of coach-level and “scheme”-level details related to each team, perhaps from a site like ProFootballReference.com. Despite current NFL schemes carrying somewhat vague names like “West Coast” or “Air Raid,” each team adheres to a certain set of underlying principles on both offense and defense that they believe optimize their chances of success.[7]
  • Delineating the differences in these core strategies could help determine which teams and coaches are more or less “predictive” in their play calls, and whether that predictability has an effect on play and game outcomes.
  • Being able to separate offensive line efficiency or deficiency from a player’s ability could drastically improve insights once again.[8] A well-respected site, FootballOutsiders.com, posts a slew of game-level Offensive Line metrics that could be helpful, even in the aggregate. ESPN Analytics is making strides toward measuring play-level blocking efficiency, creating a metric called Block Win Rate (BWR), which could be a great addition.
  • At its core, this version of NFL-Dashboard is an intelligent graphing tool, when ultimately I'd like the tool to do more recommending than graphing. The goal of future versions will be to create competition-specific optimizers that allow users to make direct decisions based on the insights they feel are most valuable.

THANKS!

I can’t thank you enough for taking the time to check out this project. It was an incredibly rewarding experience to try to wrangle this much information into a useful, valuable tool. If you have anything you want more information on, or if you see a big error (hopefully none of those!), don’t hesitate to reach out! You can also check out the source code for the application at my GitHub page.

The NFL-Dashboard Shiny App || The Author's GitHub

Footnotes:
  1. Expected Points and Win Probability, along with EPA and WPA, have been calculated in myriad ways for the NFL over the years. The versions of the two metrics that are used in this dataset are further explained in nflWAR: A Reproducible Method for Offensive Player Evaluation in Football.
  2. It's impossible to list every website that provided invaluable, reproducible research related to predicting football production, but an incomplete list would certainly include: RotoViz.com, ProFootballFocus.com, FootballOutsiders.com, PlayerProfiler.com, FantasyFootballAnalytics.net, PredictiveFootball.com, and the sadly defunct (since ESPN hired its founder) AdvancedFootballAnalytics.com.
  3. In 2017, Mr. Hermsmeyer wrote a phenomenal article explaining PACR/RACR and WOPR in detail, but unfortunately RotoWorld.com, the popular fantasy football site that previously hosted the article, destroyed all archived articles in a site overhaul. Mr. Hermsmeyer's personal football information site, AirYards.com, provides additional detail relating to these metrics, though less comprehensive than the aforementioned Rotoworld article; RIP.
  4. Because of the nature of the position, quarterbacks, along with a team’s coach and play caller, have the luxury (or added challenge, depending on how you view it) of choosing the appropriate means of distributing the football on each play. They determine whether to hand the ball off to a runner, tuck it away and run themselves, or, should they pass, who the most open receiver is, and when to release the ball. As such, efficiency remains the best measure of a quarterback's underlying decision-making. The linked post's author, Ben Baldwin, is a frequent contributor to The Athletic, and has done excellent research with this same play-by-play dataset.
  5. Skill players need to be measured first by their opportunity, then by efficiency. It takes a certain level of ability to a) be chosen by the coaches to be an active, playing member of the team for that play, and then b) it requires *additional* trust in the player’s ability, from both coach and quarterback, to determine that player as the optimal means of distributing the ball. In short, skill position players *do not* choose their own opportunity, so opportunity in itself is, at some level, a measure of skill and talent.
  6. Passer Air Conversion Ratio is renamed Receiver Air Conversion Ratio (RACR, and its variant aRACR) throughout the Skill Player Analysis page.
  7. Further complicating the issue, all NFL schemes are hybrids of multiple schemes from the annals of NFL, college, and high school football history. On each play, offensive coaches determine whether to run or pass, how many of the five skill position players will be running backs (RB) vs. wide receivers (WR) vs. tight ends (TE), and what combination of routes they should run. Defensive coaches, in turn, make situation-specific personnel and strategic decisions. They decide the proper balance of strength vs. speed, and determine how many players rush the quarterback vs. how many stay back and cover a receiver. They decide whether to employ a zone defense or man-to-man coverage against eligible receivers, and finally, choose whether one or more defensive backs will leave the deep middle of the field open or covered from the snap.
  8. The offensive line's goal is vital: allow the quarterback ample time to optimally distribute the ball, and then, if the ball is distributed to a rusher rather than a receiver, continue to maintain the block for several seconds, or block a new player downfield. Its performance is notoriously difficult to measure on a per-play basis, since the 5-7 blockers are attempting to operate as a unit to clear space for a quarterback or ball-carrier.

About Author

Matt Savoca

Matt Savoca is a sports-obsessed researcher and content producer who lives in New York City. After completing foundational coursework in statistics and data science in both R and Python, he spends his days parsing, scraping, visualizing, and...