winedApp: Wine Recommendation and Data Analysis Web App
Project GitHub | LinkedIn: Niki Moritz Hao-Wei Matthew Oren
The skills demonstrated here can be learned by taking the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
About
Don't wind up with the wrong wine at the wrong time: unwind with the world's best!
winedApp provides insights into prices, ratings, descriptions, and geographic distribution of the world's most esteemed wines. Novice or connoisseur, consumer or seller, this app will meet your oenophile needs.
In the Wine Explorer, you can enter your location, varietal, aroma, taste, vintage, and price range preferences, and retrieve information on compatible wines.
The Global Insights feature offers map visualizations of international wine trends.
Graphs and Charts provides additional lenses into relationships amongst countries of origin, varietals, prices per bottle, and ratings.
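To give a feel for the kind of lookup the Wine Explorer performs, here is a minimal Python sketch of preference-based filtering. It is not the app's code (the app itself is built in R Shiny). The `Wine` record, the `find_wines` function, the mapping of "location" to a country field and of aroma/taste to a keyword set, and the sample rows are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Wine:
    # Hypothetical record layout; the real dataset's columns may differ.
    name: str
    country: str
    varietal: str
    vintage: int
    price: float          # US dollars per bottle
    points: int           # reviewer rating
    keywords: frozenset   # aroma/taste descriptors


def find_wines(wines, country=None, varietal=None, keywords=(),
               vintage=None, min_price=0.0, max_price=float("inf")):
    """Return wines matching every supplied preference, best-rated first."""
    wanted = set(keywords)
    matches = [
        w for w in wines
        if (country is None or w.country == country)
        and (varietal is None or w.varietal == varietal)
        and (vintage is None or w.vintage == vintage)
        and min_price <= w.price <= max_price
        and wanted <= w.keywords   # all requested descriptors must be present
    ]
    return sorted(matches, key=lambda w: w.points, reverse=True)


# Made-up rows, not taken from the scraped reviews.
SAMPLE_WINES = [
    Wine("Example Cabernet", "US", "Cabernet Sauvignon", 2012, 45.0, 91,
         frozenset({"cherry", "oak"})),
    Wine("Example Chardonnay", "US", "Chardonnay", 2014, 22.0, 88,
         frozenset({"citrus", "vanilla"})),
    Wine("Example Pinot Noir", "France", "Pinot Noir", 2012, 60.0, 93,
         frozenset({"cherry", "earthy"})),
]

if __name__ == "__main__":
    for wine in find_wines(SAMPLE_WINES, keywords={"cherry"}, max_price=50):
        print(wine.name, wine.price, wine.points)
```

Leaving a preference unset (`None` or an empty keyword set) means "no constraint", which mirrors how an optional menu filter would typically behave.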
The data was sourced from ca. 36,000 wine reviews on the WineEnthusiast site. In this dataset, 145 varietals from 36 countries and 23 US states are represented.
Further information was extracted from the Wikipedia list of grape varieties.
General Trends
- Wine prices (per bottle) and points awarded do not show a strong positive correlation (for certain countries, there is even a negative correlation).
- The US, Italy, and France are by far the most represented in the dataset.
- The most represented varietals are Chardonnay (10,996 entries) and Cabernet Sauvignon (9,058 entries).
- Most wines fall within the range of $4-50 per bottle, but the distribution is right-skewed. The full price range is $4-2,013.
- Varietals vary considerably with respect to point and price ranges, as well as country distribution.
- There is a statistically significant difference between average prices per bottle for red vs. white wines ($42.47 and $30.51, respectively, with p << 0.05).
- There is also a statistically significant difference regarding average point values for reds vs. whites (88.43 and 88.29, respectively, with p << 0.05, on an 80-100 point scale).
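The correlation and red-vs.-white comparisons above can be reproduced in outline with a few lines of Python. The write-up does not say which test was used, so this sketch assumes Pearson's correlation coefficient and Welch's unequal-variance t-test. The function names and the tiny sample lists are hypothetical and are not drawn from the dataset.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)   # squared standard error of mean(a)
    vb = variance(b) / len(b)   # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


if __name__ == "__main__":
    # Made-up values for illustration only.
    prices = [12.0, 18.0, 25.0, 40.0, 95.0]
    points = [86, 88, 87, 90, 92]
    red_prices = [20.0, 35.0, 48.0, 60.0]
    white_prices = [15.0, 22.0, 30.0, 41.0]

    print("r =", pearson_r(prices, points))
    print("t, df =", welch_t(red_prices, white_prices))
```

Turning the t statistic into a p-value requires the t distribution; in practice a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` handles both steps. With samples in the tens of thousands, even a small difference in means (such as 88.43 vs. 88.29 points) can come out as statistically significant.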
Future Work
Features will be added and refined on a continuous basis. Any suggestions are welcome.
Current objectives include:
- Testing the app on a larger dataset;
- Enabling the user to determine if a given wine is available in their local area;
- Building a price predictor.
Technical Details
Web scraping was completed using the rvest package from CRAN. The list of descriptive keywords featured in the Wine Explorer menu item was generated using the nltk (Natural Language Toolkit) library in Python. All other content was produced via the R Shiny Dashboard library and associated data visualization packages. To view the app code, please visit this GitHub repository.
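As a rough illustration of the keyword step, the sketch below counts the most frequent descriptive words across review texts. It is an assumption about the general approach, not the project's pipeline. To stay runnable without extra downloads, it swaps nltk's tokenizer and stopword corpus for a regular expression and a small hand-written stopword set. `top_keywords`, `STOPWORDS`, and the sample reviews are hypothetical.

```python
import re
from collections import Counter

# Tiny hand-written stopword list (stand-in for a full stopword corpus).
STOPWORDS = frozenset({
    "a", "an", "and", "the", "of", "with", "this", "is", "in", "on",
    "to", "it", "that", "from", "by", "for", "wine",
})


def top_keywords(reviews, n=10):
    """Return the n most frequent non-stopword tokens across all reviews."""
    counts = Counter()
    for text in reviews:
        tokens = re.findall(r"[a-z]+", text.lower())   # crude word tokenizer
        counts.update(t for t in tokens
                      if t not in STOPWORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(n)]


# Made-up review snippets, not taken from the scraped data.
SAMPLE_REVIEWS = [
    "Ripe cherry and vanilla aromas with a smooth, oaky finish.",
    "Crisp citrus and green apple, with a bright mineral finish.",
    "Dark cherry, oak and spice on a long finish.",
]

if __name__ == "__main__":
    print(top_keywords(SAMPLE_REVIEWS, n=5))
```

A fuller version would likely also stem or lemmatize tokens (so that "oak" and "oaky" count together) and keep only adjectives and nouns, both of which nltk supports.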