
Data Analysis of Iowa Liquor Sales in College Towns

Julianna Douglas
Posted on Jun 8, 2022


Introduction

College binge drinking culture is a notable part of the college experience in America. Data show that college students drink heavily more often, and are intoxicated more frequently, than their non-college counterparts.

To determine what kind of impact college students have on the liquor market, I selected the top three college towns in Iowa and compared their liquor sales to those in non-college towns. Using Python, I analyzed the volume, product, and price distributions of liquor store orders from 2018 to the present, as well as the seasonality of products and prices. The insights from this analysis can serve as a guide for liquor store owners in college towns deciding what products to stock and when to stock them.

The Data

The Iowa Liquor Sales dataset, obtained from the Alcoholic Beverages Division of the Iowa Department of Commerce, contains purchasing information for liquor products bought by Iowa Class "E" liquor licensees. The dataset contains 23.3 million observations, each of which is a transaction between a liquor store and its liquor vendor, dating from January 1, 2012 to the present. Since I obtained the dataset in March 2022, it contained data through February 28, 2022.

To cut down on size and use only relevant data, I filtered out all transactions from before 2018, leaving around ten million observations. Liquor trends from five to ten years ago most likely provide little insight into present-day trends, especially given the vast changes in alcohol products in recent years, such as the popularization of hard seltzers.
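
As a minimal sketch of the date filter described above, the snippet below keeps only rows dated January 1, 2018 or later. It uses plain Python with a handful of made-up rows; the field names `sale_date` and `bottles_sold` are assumptions for illustration, not the dataset's actual column names, and the original analysis may have used a different tool for this step.

```python
from datetime import date

# Hypothetical sample rows; field names are illustrative only.
transactions = [
    {"sale_date": date(2015, 6, 1), "bottles_sold": 12},
    {"sale_date": date(2017, 12, 31), "bottles_sold": 6},
    {"sale_date": date(2018, 1, 1), "bottles_sold": 24},
    {"sale_date": date(2022, 2, 28), "bottles_sold": 3},
]

CUTOFF = date(2018, 1, 1)


def filter_recent(rows, cutoff=CUTOFF):
    """Keep only transactions dated on or after the cutoff."""
    return [row for row in rows if row["sale_date"] >= cutoff]


recent = filter_recent(transactions)
print(len(recent), "of", len(transactions), "rows kept")
```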

Relational Database

Each observation in the dataset had 24 variables describing the transaction, such as store and vendor information, bottle size, bottle price, and sale date. Because the dataset was so large, I created a relational database to better organize the data and make it easier to analyze. The database contained seven tables: transactions, counties, vendors, products, prices, stores, and liquor categories.
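
To illustrate how a seven-table layout like this might fit together, here is a small sketch using Python's built-in `sqlite3` module. Only the table names come from the description above (with `categories` standing in for liquor categories); every column name, key, and sample row is an assumption made for the example, and the database engine actually used is not stated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: table names follow the article, columns are assumed.
conn.executescript("""
CREATE TABLE counties   (county_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE stores     (store_id INTEGER PRIMARY KEY, name TEXT, city TEXT,
                         county_id INTEGER REFERENCES counties(county_id));
CREATE TABLE vendors    (vendor_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categories (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products   (product_id INTEGER PRIMARY KEY, description TEXT,
                         volume_ml INTEGER,
                         category_id INTEGER REFERENCES categories(category_id),
                         vendor_id INTEGER REFERENCES vendors(vendor_id));
CREATE TABLE prices     (product_id INTEGER REFERENCES products(product_id),
                         bottle_cost REAL, bottle_retail REAL);
CREATE TABLE transactions (invoice_id TEXT PRIMARY KEY, sale_date TEXT,
                         store_id INTEGER REFERENCES stores(store_id),
                         product_id INTEGER REFERENCES products(product_id),
                         bottles_sold INTEGER);
""")

# Made-up rows, just enough to run one join.
conn.execute("INSERT INTO counties VALUES (1, 'Example County')")
conn.execute("INSERT INTO stores VALUES (1, 'Example Store', 'Example City', 1)")
conn.execute("INSERT INTO vendors VALUES (1, 'Example Vendor')")
conn.execute("INSERT INTO categories VALUES (1, 'Vodka')")
conn.execute("INSERT INTO products VALUES (1, 'Example Vodka', 750, 1, 1)")
conn.execute("INSERT INTO prices VALUES (1, 8.0, 12.0)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?)",
    [("INV-1", "2018-01-15", 1, 1, 12), ("INV-2", "2018-02-03", 1, 1, 6)],
)

# Total bottles ordered per liquor category.
rows = conn.execute("""
    SELECT c.name, SUM(t.bottles_sold)
    FROM transactions AS t
    JOIN products AS p ON p.product_id = t.product_id
    JOIN categories AS c ON c.category_id = p.category_id
    GROUP BY c.name
""").fetchall()
print(rows)
```

Splitting store, vendor, and product details into their own tables means each transaction row only needs to carry a few keys, which is one common reason to normalize a dataset of this size.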


Volume Distribution Data

[Figure: boxplot of bottle volume by liquor category, college vs. non-college towns]

First, I sought to determine whether there is a significant difference in liquor bottle volume between college town and non-college town liquor stores. The boxplot above is a useful visualization for comparing the volume distributions across different liquor categories and the two types of towns, but it does not tell the whole story. Bottle volumes fall into discrete sizes (500 ml, 750 ml, 1000 ml, etc.), which essentially splits the data into groups. So even if the boxplots for the two types of towns look the same, that does not mean that the distributions are actually the same.

To determine whether the two types of towns differ significantly, I conducted t-tests for each of the ten liquor categories shown in the boxplot above. Every t-test rejected the null hypothesis, meaning that mean bottle volumes differ significantly between college and non-college towns in all ten liquor categories.

On further investigation, I found that mean bottle volumes were higher in non-college towns in every liquor category except Unknown (to clarify, the Unknown category consisted of various unspecified liquors, as well as many seasonal drinks). Although there is no clear reason why this is the case, it could be that older adults buy larger bottles of liquor occasionally and go through them slowly over time, whereas college students tend to buy smaller bottles more frequently for parties and similar events.
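
The per-category comparison of mean bottle volumes can be sketched as follows. This example assumes Welch's unequal-variance form of the two-sample t-test, which may not be the exact variant used in the analysis, and all volume values are invented. It computes only the t statistic and approximate degrees of freedom using the standard library.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, df


# Hypothetical bottle volumes in ml, keyed by liquor category.
college = {
    "Vodka": [750, 750, 1000, 375, 750],
    "Rum": [750, 375, 750, 1000, 750],
}
non_college = {
    "Vodka": [1000, 1750, 750, 1000, 1750],
    "Rum": [1000, 1750, 1000, 750, 1750],
}

for category in college:
    t_stat, df = welch_t(college[category], non_college[category])
    print(f"{category}: t = {t_stat:.2f}, df = {df:.1f}")
```

A negative t statistic here means the college-town sample has the smaller mean. Turning the statistic into a p-value requires the t distribution; with SciPy installed, `scipy.stats.ttest_ind(a, b, equal_var=False)` runs the same test and returns the p-value directly.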

Product Distribution Data

[Figure: boxplot of each liquor category's share of liquor store orders, college vs. non-college towns]

Next, I sought to determine whether there is a difference in the types of liquor stocked in college town liquor stores vs non-college town liquor stores. The boxplot above depicts the spread of the percentage of liquor store orders that each type of liquor accounts for in both kinds of towns. A glance at the plot reveals that some liquor categories show similar trends in both types of towns, while others appear to differ.

I again conducted t-tests for each liquor category to determine whether these differences were significant. The tests failed to reject the null hypothesis in some categories and rejected it in others, meaning that certain liquors see the same buying patterns in college and non-college towns, whereas others differ significantly. Of those that differed, brandy, gin, tequila, and vodka were more popular in college towns, while rum and whiskey were more popular in non-college towns. Cocktails, liqueurs, mezcal, and unknown liquors did not show any significant differences.
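
The per-store percentages compared in this section could be computed along these lines. This is a sketch under assumptions: it counts each order row once (rather than weighting by bottles or dollars), and the store names and rows are made up.

```python
from collections import Counter, defaultdict

# Hypothetical order rows: (store, liquor category).
orders = [
    ("store_a", "Vodka"), ("store_a", "Vodka"),
    ("store_a", "Rum"), ("store_a", "Gin"),
    ("store_b", "Whiskey"), ("store_b", "Vodka"),
]


def category_shares(order_rows):
    """Return, per store, the percentage of its orders in each category."""
    counts = defaultdict(Counter)
    for store, category in order_rows:
        counts[store][category] += 1
    shares = {}
    for store, counter in counts.items():
        total = sum(counter.values())
        shares[store] = {cat: 100 * n / total for cat, n in counter.items()}
    return shares


shares = category_shares(orders)
for store, by_category in shares.items():
    print(store, by_category)
```

Collecting each category's percentages across all stores in a town type yields the samples that a boxplot or a t-test would then compare.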

Price Distribution Data

To analyze the price distribution, I split the data into two groups: liquor bottles under $100, and liquor bottles over $1,000. The first group contains the vast majority of the data, since most people do not buy luxury spirits, and thus shows the buying patterns of the bulk of the population. The second group covers the most expensive spirits stocked in liquor stores and gives a closer look at buying patterns for expensive liquors.
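
A minimal sketch of this two-way split, using invented prices. The thresholds come from the text; treating them as strict inequalities (so a bottle priced at exactly $100 or $1,000 falls in neither group) is an assumption.

```python
def split_by_price(prices, low=100.0, high=1000.0):
    """Split bottle prices into the under-`low` and over-`high` groups."""
    under = [p for p in prices if p < low]
    over = [p for p in prices if p > high]
    return under, over


# Hypothetical bottle prices in dollars.
bottle_prices = [12.99, 99.99, 100.0, 450.0, 1000.0, 1250.0]
under_100, over_1000 = split_by_price(bottle_prices)
print(under_100, over_1000)
```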

Liquor under $100

[Figure: price density for bottles under $100, college vs. non-college towns]

As previously stated, the vast majority of the liquor orders were for bottles priced under $100. The plot above shows the price densities for both college and non-college towns, and we can see that the two densities are almost identical. The same patterns occur in both types of towns, with peaks and valleys at the same price values.

The only noticeable difference is in the sharpness of the peaks: non-college towns see higher peaks that come to a sharper point, whereas college towns do not spike as high and their peaks are generally more rounded. This difference could be due to more variety in college towns; if college students buy a slightly larger variety of products, then the price data won't be quite as concentrated around specific price points.

Liquor over $1,000

[Figure: violin plot of prices for bottles over $1,000, college vs. non-college towns]

The violin plot above covers liquor bottles over $1,000, the most expensive bottles ordered. The gap between this group and the previous group, liquor bottles under $100, was almost entirely empty for college towns.

Overall, buying expensive liquor was much less common in college towns, and there is also much less variance in the prices of the expensive liquor purchased there. We see this in the graph above, where the violin for college towns is much more condensed, while the violin for non-college towns is more spread out and has two separate peaks rather than one.

Seasonality

Most products do not sell at the same rate throughout the year: water guns and ice cream sell more in the summer, ice scrapers and hot chocolate sell more in the winter, and so on. Liquor products are no different. To analyze the seasonal patterns in Iowa liquor stores, I compared both product and price seasonality in college and non-college towns.

Product Seasonality

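[Figures: line graphs of bottles sold per month by liquor type, January 2018 through February 2022, one for college towns and one for non-college towns]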

The line graphs above depict the bottles of each type of liquor sold per month from January 2018 through February 2022 in college and non-college towns. It is important to note that the y-axis in the non-college towns graph is an order of magnitude larger than in the college towns graph. Both types of towns see the same major annual peak: December (on both graphs, the blue vertical lines are dated December 31 and represent the sum of all sales throughout December). However, this does not rule out other seasonal trends for certain kinds of liquor.
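
The monthly totals behind graphs like these can be sketched as a simple group-and-sum, followed by a lookup of the busiest month. The rows below are invented and the tuple layout is an assumption made for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order rows: (sale date, liquor category, bottles sold).
sales = [
    (date(2018, 7, 4), "Liqueurs", 10),
    (date(2018, 10, 5), "Vodka", 30),
    (date(2018, 10, 19), "Vodka", 25),
    (date(2018, 12, 3), "Vodka", 40),
    (date(2018, 12, 20), "Vodka", 35),
    (date(2018, 12, 21), "Liqueurs", 50),
]


def monthly_bottles(rows):
    """Sum bottles sold per (category, year, month)."""
    totals = defaultdict(int)
    for sale_date, category, bottles in rows:
        totals[(category, sale_date.year, sale_date.month)] += bottles
    return dict(totals)


def peak_month(totals, category, year):
    """Return the month number with the most bottles for one category and year."""
    months = {m: n for (c, y, m), n in totals.items() if c == category and y == year}
    return max(months, key=months.get)


totals = monthly_bottles(sales)
print(peak_month(totals, "Vodka", 2018))
```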

Next, we will examine three particular liquors that saw different annual trends and analyze their meaning.

One Peak: Liqueurs

[Figure: liqueur bottles sold per month, college and non-college towns]

Liqueurs have one major peak every year: a holiday spike in December, which coincides with the spike in the previous graph of all liquor products. However, if we look more closely at the line graph for college towns, we see a smaller spike every fall that precedes the December spike. This pattern does not occur in non-college towns. The difference is most likely due to Halloween, which is much more of a drinking holiday in college towns than in non-college towns.

Multi-Peak: Vodka

[Figure: vodka bottles sold per month, college and non-college towns]

For vodka, both college and non-college towns see the same holiday spike in December that we've consistently seen for other liquors. In addition, there is a fall spike in either September or October, depending on the year. However, the behavior of this spike differs by type of town. In college towns, the fall spike is typically larger than the December spike, but in non-college towns, it is usually smaller or nonexistent. The likely cause is the same one we saw for liqueurs: college towns partying more for Halloween.

Covid Peak: Cocktails

[Figure: cocktail bottles sold per month, college and non-college towns]

Cocktails saw a significant rise in liquor store orders in April and May of 2020, which marked the beginning phase of the Covid-19 pandemic. This was most likely caused by the lockdowns that were in place across the country; if you're at home by yourself, maybe attending a Zoom happy hour, it's much easier to crack open a ready-made cocktail than it is to prepare a drink on your own. Even after the peak came back down, orders for cocktails remained notably higher than they were beforehand. Aside from the Covid spike, there are two major annual peaks: a spring/summer peak and a holiday peak in December.

Price Seasonality

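[Figure: average bottle price per month, college vs. non-college towns, with a green vertical line at March 31, 2020]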

Lastly, we will analyze the average price of bottles that liquor stores ordered over time in order to study price seasonality. The green vertical line marks March 31, 2020. Since the data is grouped by month, March 2020 is the last month that contains any pre-Covid data. Before the pandemic began, college and non-college towns followed the same patterns and shape, trading places as to which one averaged slightly higher. After the pandemic began, college towns broke away and averaged consistently higher bottle prices than non-college towns.

One possible explanation is that when the lockdowns began, college students went home. If they had been buying cheaper alcohol than the rest of the town's residents, then the average price of liquor purchased would go up once they left.

We also see that in the fall of 2021, the gap between college and non-college towns narrows, which could have been caused by college students moving back in and driving the average prices down a little. Another important pattern is a consistent spike in average prices in December, most likely due to customers buying more expensive liquor as gifts or for holiday parties.
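
The monthly average bottle prices compared in this section could be computed along these lines. The sketch assumes the average is weighted by the number of bottles in each order, which the text does not specify, and every row is invented.

```python
from collections import defaultdict

# Hypothetical order rows: (year-month, town type, bottle price, bottles ordered).
price_rows = [
    ("2020-02", "college", 10.0, 10),
    ("2020-02", "college", 20.0, 10),
    ("2020-02", "non_college", 15.0, 4),
    ("2020-04", "college", 18.0, 5),
    ("2020-04", "non_college", 12.0, 5),
    ("2020-04", "non_college", 16.0, 15),
]


def average_price(rows):
    """Bottle-weighted average price per (month, town type)."""
    spend = defaultdict(float)
    bottles = defaultdict(int)
    for month, town, price, count in rows:
        spend[(month, town)] += price * count
        bottles[(month, town)] += count
    return {key: spend[key] / bottles[key] for key in spend}


averages = average_price(price_rows)

# Gap between college and non-college averages for each month.
months = sorted({month for month, _ in averages})
gaps = {m: averages[(m, "college")] - averages[(m, "non_college")] for m in months}
print(gaps)
```

A positive gap means college towns averaged higher bottle prices that month, so tracking the gap over time is one way to see the two lines diverge or converge.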

Key Takeaways

The key findings of the analysis above were as follows:

  • Volume Distribution:
    • Mean bottle volumes differ significantly between college and non-college towns in every liquor category
    • Non-college towns tend to buy larger bottles of liquor
  • Product Distribution:
    • Certain liquors are more popular depending on the type of town
    • Brandy, gin, tequila, and vodka are more popular in college towns
  • Price Distribution:
    • Liquor under $100: college towns and non-college towns follow the same patterns
    • Liquor over $1,000: less common and less spread out in college towns
  • Seasonality:
    • Main peak: December
    • Some liquors see other seasonal peaks
    • Covid-19 has had an impact on the price trends and sales in certain liquor categories

 


About Author

Julianna Douglas

Data analyst with a background in mathematics, strengthened by a variety of coding languages, a knack for problem-solving, and a passion for communication. Highly motivated to use data analysis in an interdisciplinary way to bring about positive change.