Data Study on Coffee: World Markets and Trade

Posted on Aug 5, 2016
The skills the author demonstrates here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
Contributed by Chia-An Chen (Anne Chen). She is currently in the NYC Data Science Academy 12-week full-time Data Science Bootcamp program, taking place from July 5th to September 23rd, 2016. This post is based on her second project, R Shiny (due in the 4th week of the program). The R code can be found on GitHub.

A morning without coffee can be devastating...

But have you ever wondered where coffee comes from, or how much coffee people consume? Coffee is not only a popular drink that refreshes people; data shows it is also one of the most important commodities in agricultural economies. The global coffee trade app I made visualizes coffee trading data from around the globe, from 1960 to 2015.

 

Overview of the App

The app has five tabs. “General View” visualizes the trading data as a world map and lists the top ten countries for the trading category of choice. “Trend” lets users compare different trading categories within a specific country and shows how those attributes change over time. “Correlation” plots two variables of interest against each other and checks whether they are correlated. “Fun Facts” compares annual working hours and coffee consumption across countries to show how much coffee people drink per hour worked. “Info” documents how the data was processed and where the datasets were downloaded.

 

Data Findings

From 2010 to 2015, the European Union and the USA consumed the most coffee on average. However, if we normalize consumption by population, Switzerland and Norway were the top two nations for average domestic consumption.
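The effect of normalizing by population can be sketched as follows. This is a minimal Python illustration, not the author's R code: the function name `rank_per_capita` is hypothetical, and the region names and figures are made-up placeholders rather than values from the dataset.

```python
def rank_per_capita(consumption_kg, population, top_n=2):
    """Rank regions by consumption per person, highest first."""
    per_capita = {
        region: consumption_kg[region] / population[region]
        for region in consumption_kg
        if population.get(region)  # skip regions with missing or zero population
    }
    ranked = sorted(per_capita.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


# Hypothetical placeholder figures, chosen only to show the effect.
consumption_kg = {"Large": 9_000_000, "Small": 800_000, "Mid": 2_000_000}
population = {"Large": 3_000_000, "Small": 100_000, "Mid": 1_000_000}

print(rank_per_capita(consumption_kg, population))
```

In this toy input, "Large" has the highest total but "Small" ranks first once totals are divided by population, which mirrors the kind of reordering described above.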

Exports in the World

On the other hand, while Brazil and Vietnam ranked as the top two exporters on average from 2010 to 2015, Honduras and Nicaragua take the top two spots once we normalize the amount of coffee exported by GDP in USD. This finding may indicate that coffee plays a relatively vital role in the economies of the countries that rank high in coffee exports per unit of GDP (kg/GDP).

European Union and Switzerland

The European Union relies solely on imports as the source for domestic consumption (red), whereas in Switzerland a portion of imported coffee is roasted and ground for export (yellow).

Ecuador and Thailand

The total distribution seems to be shaped by different attributes depending on the country. In Ecuador, distribution once relied on local production but has relied largely on imports in recent years. In Thailand, distribution fluctuated with bean exports but has lately shifted toward soluble exports.

Brazil and Democratic Republic of Congo

There are two primary types of coffee, Arabica and Robusta, which require different growing environments. Countries in South America, like Brazil, usually produce Arabica, while countries in Africa, such as the Democratic Republic of Congo, generally produce Robusta.

 

Looking at the 2015 data with zero values excluded, we find a fairly strong correlation between Imports and Domestic Consumption, and between Total Distribution and Production. Note that the dashed lines mark the average of each attribute, and dots at zero mean that those countries either do not import or do not produce coffee beans at all.
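A correlation check of this kind can be sketched as below. The post does not say which measure the app uses, so this assumes Pearson's coefficient; the function name `pearson_nonzero` and the sample values are hypothetical and not taken from the dataset.

```python
import math


def pearson_nonzero(xs, ys):
    """Pearson correlation over pairs where neither value is zero."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x != 0 and y != 0]
    if len(pairs) < 2:
        raise ValueError("need at least two non-zero pairs")
    mean_x = sum(x for x, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values; the first and last pairs contain a zero
# and are dropped before the coefficient is computed.
imports = [0, 10, 20, 30, 40]
consumption = [5, 12, 19, 33, 0]

r = pearson_nonzero(imports, consumption)
print(round(r, 3))
```

Dropping zero pairs first keeps countries that report no imports or no consumption from dominating the coefficient; whether that matches the app's exact filtering is an assumption.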

Surprisingly, a few countries show zero domestic consumption, which implies that people in those countries did not drink coffee at all in 2015. Most of these countries are located in Africa, such as Benin, Liberia, the Republic of Congo, and Zimbabwe. For the full list of countries in this situation, please visit the app and zoom in to the bottom left.

Working Hours vs Coffee Consumption

To keep the population-normalized data points from crowding toward the bottom left of the plot, an amplifying factor of 1,000,000 was applied to the y axis, which shows annual working hours. The bubble chart suggests that people in Norway and Switzerland work less and drink more, while Costa Rica and New Zealand may show the opposite.

Interestingly, since the size of each bubble is proportional to the coffee consumed relative to working hours, we can see that the USA consumes far more coffee per hour worked than other countries. It would appear that in some countries, like Norway and Switzerland, coffee is a drink of leisure, while in others, like the US, coffee is a way to kick-start work.
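One way to read the bubble-size ratio is sketched below. The post does not spell out the exact formula, so this assumes per-capita consumption divided by annual hours worked; the function name `coffee_per_hour` and the input figures are hypothetical.

```python
def coffee_per_hour(consumption_kg, population, annual_hours):
    """Coffee consumed per person per hour worked, in kg."""
    if population <= 0 or annual_hours <= 0:
        raise ValueError("population and annual_hours must be positive")
    return consumption_kg / population / annual_hours


# Hypothetical placeholder figures: 6 kg per person over 1,500 working hours.
ratio = coffee_per_hour(6_000_000, 1_000_000, 1_500)
print(f"{ratio * 1000:.1f} g of coffee per hour worked")
```

Ratios like this are very small in raw units, which is one reason a plot might rescale an axis, as the amplifying factor mentioned above does.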

Conclusion

(Potential) future update for this app: add another dataset that lists the prices paid to coffee bean growers, and see whether growers' earnings are related to GDP.

Hopefully you can find something that interests you after playing around with the app. 🙂


About Author

Chia-An (Anne) Chen

Anne Chen has a Master's degree in Bioengineering from the University of Pennsylvania. Prior to working at a biotech startup developing a liver cancer diagnosis device, Anne researched and evaluated open-source Electronic Health Records software for small-scale hospitals...