Whiskey Advocate Data

Posted on Nov 23, 2021

BACKGROUND AND INSPIRATION

Over the years, I've tasted a very limited selection of whiskies, mostly common brands such as Johnnie Walker and Glenlivet. I've always wondered: is there a strong relationship between price and quality? A bottle of whiskey could set you back $10 or, in some cases, as much as $100,000. Whiskey prices are dictated by supply, demand, age, brand, and other factors. My analysis of the data I gathered focuses on price and rating.

I scraped a website called Whiskyadvocate, America's leading whisky publication and a premier source of whisky information, education and entertainment for whisky enthusiasts. It has more than 5,000 whiskey reviews, and it's this data I plan to use.

PROJECT GOALS

  • Is there a strong relationship between price and rating?
  • Select the top-rated whiskies that cost less than $150

DATA DICTIONARY

  • Category – Whiskey category
  • Brand – Whiskey brand
  • Title – Whiskey title
  • Alcohol Percentage – Alcohol content
  • Price – Price of the whiskey bottle (USD)
  • Reviewer – Reviewed by
  • Review – Rating (out of 100):
    • 95-100 points – Classic: a great whisky
    • 90-94 points – Outstanding: a whisky of superior character and style
    • 85-89 points – Very good: a whisky with special qualities
    • 80-84 points – Good: a solid, well-made whisky
    • 75-79 points – Mediocre: a drinkable whisky that may have minor flaws
    • 50-74 points – Not recommended
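
The rating bands above map naturally onto a small lookup. Here is a minimal sketch in Python, assuming scores are whole numbers between 50 and 100; the function name `rating_band` is my own and not part of the scraped data:

```python
def rating_band(score: int) -> str:
    """Return the band label for a review score, per the list above."""
    if not 50 <= score <= 100:
        raise ValueError(f"score must be between 50 and 100, got {score}")
    # (lowest score in the band, label), checked from the top band down
    bands = [
        (95, "Classic"),
        (90, "Outstanding"),
        (85, "Very good"),
        (80, "Good"),
        (75, "Mediocre"),
        (50, "Not recommended"),
    ]
    for floor, label in bands:
        if score >= floor:
            return label
    raise AssertionError("unreachable: every valid score falls in a band")


print(rating_band(97))
print(rating_band(82))
```

A label like this could be handy for grouping bottles by band rather than by raw score.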

Note: Data points outside the 95% confidence interval for price were removed as outliers.
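
The note does not say exactly how the cut-off was computed, so the following is only one possible reading: keep the bottles whose price falls between the 2.5th and 97.5th percentiles, i.e. the middle 95% of prices. The function name `trim_price_outliers`, the `price` key and the sample prices are all assumptions made for illustration.

```python
import statistics


def trim_price_outliers(rows):
    """Keep rows whose price lies in the middle 95% of all prices.

    Assumption: "95%" is read as the 2.5th to 97.5th percentile range.
    """
    prices = [row["price"] for row in rows]
    # n=40 gives cut points every 2.5%; the first and last ones are
    # the 2.5th and 97.5th percentiles.
    cuts = statistics.quantiles(prices, n=40, method="inclusive")
    low, high = cuts[0], cuts[-1]
    return [row for row in rows if low <= row["price"] <= high]


# Made-up prices: one very cheap bottle, one extremely expensive one.
sample_prices = [10] + list(range(40, 79)) + [100_000]
sample_rows = [
    {"title": f"Bottle {i}", "price": price}
    for i, price in enumerate(sample_prices)
]

kept = trim_price_outliers(sample_rows)
print(len(sample_rows), "bottles before,", len(kept), "after trimming")
```

A percentile rule is used here because a handful of very expensive bottles would inflate a mean-and-standard-deviation rule.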

DATA ANALYSIS

Price vs. Review Data

Does a higher price yield a better overall rating? As expected, the overall rating increases as price increases.
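
One way to put a number on the strength of this relationship is a correlation coefficient. The sketch below computes Pearson's r with the standard library only. It is not necessarily how the chart was produced, and the prices and ratings are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented values, not taken from the scraped data.
prices = [25, 40, 60, 85, 110, 150]
ratings = [82, 86, 85, 90, 89, 93]

r = pearson(prices, ratings)
print(f"Pearson r between price and rating: {r:.2f}")
```

A value close to 1 would indicate a strong positive linear relationship, a value near 0 a weak one.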

Review Distribution Data

We can see that the reviews cluster around 85-95 out of 100. In general, if you picked a bottle at random, you would most likely pick one that reviewed well. Note: the probability density function (PDF) is skewed to the left.
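
Left skew can be checked numerically: a distribution with a long tail of low scores has negative skewness. The sketch below uses the moment-based definition (third central moment divided by the variance raised to the power 1.5). The scores are invented for illustration.

```python
def skewness(values):
    """Moment-based skewness: negative means a longer left tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("all values are identical; skewness is undefined")
    return m3 / m2 ** 1.5


# Invented scores: most sit in the high 80s and 90s, a few are much lower.
scores = [70, 80, 85, 88, 89, 90, 90, 91, 92, 92, 93, 94, 95]

print(f"skewness: {skewness(scores):.2f}")
```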

Data of Top 15 Whiskey Categories

Let's now focus on the top 15 categories. It shouldn't come as a surprise that Single Malt Scotch is the largest category, with more than 2,000 different bottles, followed by Bourbon and Rye.
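
Ranking categories by the number of reviewed bottles is a simple count. This is a minimal sketch, assuming each scraped row is a dictionary with a `category` key (a name taken from the data dictionary, although the exact field name is an assumption); the rows are invented.

```python
from collections import Counter


def top_categories(rows, n=15):
    """Return the n most common categories as (category, count) pairs."""
    counts = Counter(row["category"] for row in rows)
    return counts.most_common(n)


# Invented rows for illustration.
rows = [
    {"category": "Single Malt Scotch"},
    {"category": "Bourbon"},
    {"category": "Single Malt Scotch"},
    {"category": "Rye"},
    {"category": "Single Malt Scotch"},
    {"category": "Bourbon"},
]

for category, count in top_categories(rows, n=3):
    print(category, count)
```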

Average - Category Review vs Price Top 15

  • The 'Irish Single Pot Still' category scored the highest average review, 92 out of 100, at a price of < $200.
  • Japanese whiskies ranked second, which is no surprise given their high popularity.
  • However, I would have expected Single Malt Scotch to appear in the top 15, considering there are over 2,000 different whiskies in this category.

Price (<$150) Box Plot Top 15

  • Single Malt Scotch and American whiskies cost ~$100 on average.
  • Canadian whiskies were priced at $65.
  • Blended Malt Scotch Whiskies contained the greatest price dispersion.

Review (<$150) Box Plot Top 15

  • Bourbon/Tennessee whiskies earned the highest average review, 93/100, while Irish and American single malts were reviewed less favorably.
  • Before we highlight a final selection of top whiskies, let's look at the review points per $1 spent.

Review Data Points/$1 Top 15

  • For every $1 spent, English Grain Whiskey yielded 1.4 review points.
  • Bourbon yielded 1.15 review points for every $1 spent.
  • For every $1 spent, Irish single malts yielded 0.29 review points.
  • While Japanese whiskies are popular, you would need to spend more for a good bottle – 0.43 points per dollar spent.
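
The points-per-dollar figures above can be reproduced with a short calculation. The post does not state the exact formula, so this sketch assumes it is the average review divided by the average price within each category. The field names and sample rows are invented for illustration.

```python
from collections import defaultdict


def points_per_dollar(rows):
    """Average review divided by average price, per category.

    Both averages share the same bottle count, so this equals
    the sum of reviews divided by the sum of prices.
    """
    review_sum = defaultdict(float)
    price_sum = defaultdict(float)
    for row in rows:
        review_sum[row["category"]] += row["review"]
        price_sum[row["category"]] += row["price"]
    return {cat: review_sum[cat] / price_sum[cat] for cat in review_sum}


# Invented rows, not taken from the scraped data.
sample = [
    {"category": "Bourbon", "review": 90, "price": 70},
    {"category": "Bourbon", "review": 94, "price": 90},
    {"category": "Japanese", "review": 90, "price": 200},
]

for category, value in points_per_dollar(sample).items():
    print(f"{category}: {value:.2f} review points per $1")
```

With these invented rows, Bourbon works out to 184 / 160 = 1.15 points per dollar and Japanese to 90 / 200 = 0.45.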

CONCLUSION

  • A higher price does indeed tend to yield a higher-rated whiskey.
  • The top whiskies of choice, based purely on price and review, were two Bourbons:
    • Parker's Heritage Collection, 'Golden Anniversary' - $150 – Scored 97/100.
    • Four Roses Limited Edition Small Batch (2013 Release) - $85 – Scored 97/100.
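
A selection like this can be sketched as a filter followed by a sort. The two Bourbons below use the prices and scores quoted above; the third row is a made-up bottle added for contrast. The price cap is treated as inclusive (`<=`) so that the $150 bottle is kept, which is an assumption about how the selection was made.

```python
def top_picks(rows, max_price=150, n=2):
    """Highest-scoring bottles at or under max_price; cheaper wins ties."""
    affordable = [row for row in rows if row["price"] <= max_price]
    ranked = sorted(affordable, key=lambda row: (-row["review"], row["price"]))
    return ranked[:n]


bottles = [
    {"title": "Parker's Heritage Collection, 'Golden Anniversary'",
     "price": 150, "review": 97},
    {"title": "Four Roses Limited Edition Small Batch (2013 Release)",
     "price": 85, "review": 97},
    # Hypothetical bottle: scores higher but is over the price cap.
    {"title": "Hypothetical Rare Cask", "price": 900, "review": 98},
]

for bottle in top_picks(bottles):
    print(f'{bottle["title"]}: ${bottle["price"]}, {bottle["review"]}/100')
```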

FUTURE WORK

  • To expand this analysis, I would extract key terms from the review text itself rather than relying only on the overall score. This would drive further insight and a more meaningful analysis.
  • The overall conclusion assumes all whiskies are the same. However, there are many differences between bottles within a category and across categories. For example, Bourbons are sweet, and a subset of Scotch whiskies are smoky in flavor. This analysis needs to be refined to capture these differences and to identify top picks unique to the reader.
  • As a closing comment, the prices in this data set need to be refreshed, as they are stale. For example, the Parker's Heritage Collection, 'Golden Anniversary' whiskey bottle costs $4,000, not $150.

About the Author

I am currently a Director at RBC working in the Equities Derivatives Technology Group. I graduated with a BSc in Computer Science from University College London and recently completed my MBA at Chicago Booth. I am keen to explore the data science world and create actionable insight.
