Data Study on Beer Over the Last Few Decades

Posted on Jun 30, 2019
The skills the author demonstrates here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.

Introduction

Since Prohibition ended in 1933, the American brewing industry has undergone two massive transformations. The first saw hundreds of regional breweries across the country, many brewing beer unique to their respective regions, consolidated into a handful of behemoths.

At the peak of consolidation in the early 1980s, the ten largest breweries produced almost 95% of American beer. This naturally led to fewer options and, in the opinion of many, lower-quality beer. Fortunately for beer lovers, a second transformation, a craft renaissance, has since swept the American brewing industry. There are now thousands of craft breweries across all fifty states that offer a wide range of styles and are constantly experimenting (Banana Kölsch, anyone?).

As someone who enjoys visiting small breweries, I was interested in how this transformation unfolded both over time and by state. I found data collected by the Alcohol and Tobacco Tax and Trade Bureau that tracks both the number of brewery licenses and the amount of beer produced over the last few decades. With R and Shiny, I built a dashboard to help visualize this data and, hopefully, inspire users to support their local brewer.

In addition to the data from the Alcohol and Tobacco Tax and Trade Bureau, I drew upon state population data (used to calculate 'per resident' statistics) from the Federal Reserve Bank of St. Louis. Lastly, the 'Mean ABV' and 'Mean Rating' data comes from a Kaggle.com data set that was originally scraped from BeerAdvocate. Whereas the other data comes from highly reputable sources, conclusions based on the BeerAdvocate data should be taken with a grain of salt. There may be unknown biases, both innocent and nefarious, within it.
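The 'per resident' statistics mentioned above amount to dividing a state-level metric by that state's population for the same year. The dashboard itself was built in R; the minimal sketch below uses Python purely for illustration. The function name `per_resident` and all figures are hypothetical placeholders, not real Alcohol and Tobacco Tax and Trade Bureau or Federal Reserve values.

```python
# Hypothetical illustrative figures, NOT real TTB or Federal Reserve data.
# Keys are (state, year) pairs so the two sources can be matched up.
production_barrels = {
    ("State A", 2018): 300_000,
    ("State B", 2018): 20_000_000,
}
population = {
    ("State A", 2018): 600_000,
    ("State B", 2018): 28_000_000,
}


def per_resident(metric, residents_by_key):
    """Divide each (state, year) value by the matching population.

    Entries with a missing or zero population are skipped rather than guessed.
    """
    result = {}
    for key, value in metric.items():
        residents = residents_by_key.get(key)
        if not residents:
            continue
        result[key] = value / residents
    return result


barrels_per_resident = per_resident(production_barrels, population)
for (state, year), value in sorted(barrels_per_resident.items()):
    print(f"{state} {year}: {value:.3f} barrels per resident")
```

Keying both tables by state and year keeps the join explicit, and skipping unmatched entries avoids silently producing a statistic from mismatched years.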

Data

When exploring the data, it is important to consider how the presence of large brewing facilities affects some of the metrics. For example, in 2018 over 90% of beer brewed in Oklahoma was consumed on brewery premises, whereas less than 1% of beer brewed in neighboring Texas was consumed this way. This is undoubtedly due to the large Anheuser-Busch production facilities in Texas, which produce high volumes of beer for national sale and dwarf production at small breweries with onsite taprooms.
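The effect described above can be sketched with a little arithmetic: the on-premises share is on-premises barrels divided by total barrels brewed, so a single high-volume facility inflates the denominator and pushes the share toward zero. This Python sketch uses made-up numbers and a hypothetical helper, `on_premises_share`; it does not reproduce the actual Oklahoma or Texas figures.

```python
def on_premises_share(on_premises_barrels, total_barrels):
    """Return the fraction of brewed beer that was consumed on brewery premises."""
    if total_barrels <= 0:
        raise ValueError("total_barrels must be positive")
    return on_premises_barrels / total_barrels


# Hypothetical volumes in barrels, chosen only to illustrate the effect.
taproom_barrels = 50_000            # consumed on premises at small breweries
small_brewery_total = 55_000        # everything those small breweries brewed
large_plant_barrels = 10_000_000    # one large facility brewing for national sale

share_without_plant = on_premises_share(taproom_barrels, small_brewery_total)
share_with_plant = on_premises_share(
    taproom_barrels, small_brewery_total + large_plant_barrels
)

print(f"Without a large facility: {share_without_plant:.1%}")
print(f"With a large facility:    {share_with_plant:.1%}")
```

The taproom volume is identical in both cases; only the statewide total changes, which is why this metric says more about a state's production mix than about how much its residents drink at breweries.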

I especially enjoyed seeing the explosion of craft breweries reflected in the data. While the Alcohol and Tobacco Tax and Trade Bureau does not collect this data explicitly, various metrics such as 'barrels of beer consumed on a brewery premise' serve as good proxies. It is also interesting to look at trends in specific states and investigate how they correspond with regulatory changes.

 

Click here to view the Shiny app and explore the data, or click here to visit my GitHub and check out the code.

 
