Coffee: World Markets and Trade

Posted on Aug 5, 2016

Contributed by Chia-An Chen (Anne Chen). She is currently in the NYC Data Science Academy 12-week full-time Data Science Bootcamp program, which runs from July 5th to September 23rd, 2016. This post is based on her second project, R Shiny (due in the 4th week of the program). The R code can be found on GitHub.


A morning without coffee can be devastating...

But have you wondered where coffee comes from? Or how much coffee people consume? Coffee is not only a popular drink that refreshes people, but one of the most important commodities in agricultural economies. The global coffee trade app I made visualizes the trading data of coffee around the globe from 1960 to 2015.


Overview of the App

The app has five tabs. “General View” visualizes the trading data as a world map and provides the top ten rankings for the trading category of choice. “Trend” allows users to compare different categories of trading in a specific country, and shows how the attributes change over time. The “Correlation” tab plots two variables of interest against each other to check whether they are correlated. “Fun Facts” compares annual working hours and coffee consumption in different countries to see where people drink more coffee per hour worked. “Info” documents how the data was processed and where the datasets were downloaded from.


Let’s dig into some interesting findings!

From 2010 to 2015, the European Union and USA consumed the most coffee on average relative to other countries. However, if we normalize the amount of consumption by population, Switzerland and Norway were the top two nations for average domestic consumption.
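The effect of normalizing by population can be sketched in a few lines. This is a minimal illustration in Python, not the app's R code; the helper names (`top_n`, `per_capita`) and all country names and numbers are hypothetical placeholders rather than values from the dataset.

```python
def top_n(values, n=10):
    """Return the n (name, value) pairs with the largest values, largest first."""
    return sorted(values.items(), key=lambda kv: kv[1], reverse=True)[:n]


def per_capita(totals, population):
    """Divide each total by the matching population.

    Names with a missing or non-positive population are skipped.
    """
    return {
        name: total / population[name]
        for name, total in totals.items()
        if population.get(name, 0) > 0
    }


# Placeholder inputs (hypothetical, for illustration only).
consumption_kg = {"Country A": 9_000_000, "Country B": 500_000, "Country C": 2_000_000}
population = {"Country A": 300_000_000, "Country B": 5_000_000, "Country C": 100_000_000}

raw_ranking = top_n(consumption_kg, n=2)
per_capita_ranking = top_n(per_capita(consumption_kg, population), n=2)

print("By total consumption:", [name for name, _ in raw_ranking])
print("By consumption per person:", [name for name, _ in per_capita_ranking])
```

With these placeholder numbers, the largest total consumer is not the largest per-person consumer, which is the same kind of reordering described above.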


On the other hand, while Brazil and Vietnam rank as the top two exporters on average from 2010 to 2015, if we normalize the amount of coffee exported by GDP in USD, Honduras and Nicaragua are the top two countries. This finding may indicate that coffee plays a relatively vital role in the economies of the countries that rank high in coffee exports per unit of GDP (kg/GDP).



The European Union relies solely on imports as the source for domestic consumption (red). In Switzerland, by contrast, a portion of the imported coffee is roasted and ground for export (yellow).



The total distribution seems to be shaped by different attributes depending on the country. In Ecuador, the distribution once relied on local production but has largely relied on imports in recent years. In Thailand, the distribution of coffee used to fluctuate with bean exports but has lately shifted to soluble exports.



There are two primary types of coffee, Arabica and Robusta, which require different growing environments. Countries in South America, like Brazil, usually produce Arabica, while countries in Africa, such as the Democratic Republic of Congo, generally produce Robusta.



Looking at the 2015 data with no zero values, we find a fairly strong correlation between Imports and Domestic Consumption, and between Total Distribution and Production. Note that the dashed lines mark the average of each attribute, and dots at zero mean that those countries either do not import or do not produce coffee beans at all. Surprisingly, a few countries show zero domestic consumption, which implies that people in those countries did not drink coffee at all in 2015. Most of these countries are in Africa, such as Benin, Liberia, the Republic of Congo, and Zimbabwe. For the full list of countries in this situation, please visit the app and zoom in to the bottom left.
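One way to check such a relationship numerically is to drop the pairs that contain a zero and compute a correlation coefficient on what remains. The sketch below uses the Pearson coefficient, which is an assumption: the post does not say which measure the app uses. It is written in Python rather than the app's R, and the function names and numbers are hypothetical.

```python
from math import sqrt


def drop_zero_pairs(xs, ys):
    """Keep only the (x, y) pairs in which neither value is zero."""
    kept = [(x, y) for x, y in zip(xs, ys) if x != 0 and y != 0]
    return [x for x, _ in kept], [y for _, y in kept]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Placeholder values (hypothetical), one entry per country.
imports = [0, 10, 20, 30, 40]
domestic_consumption = [5, 12, 19, 33, 0]

x, y = drop_zero_pairs(imports, domestic_consumption)
r = pearson(x, y)
print(f"{len(x)} countries kept, r = {r:.3f}")
```

A value of r close to 1 would correspond to the "fairly strong" positive relationship described above; values near 0 would indicate no linear relationship.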



To prevent the data points normalized by population from crowding towards the bottom left of the plot, an amplifying factor of 1,000,000 was applied to the y axis, which shows annual working hours. This bubble chart suggests that people in Norway and Switzerland work less and drink more, while Costa Rica and New Zealand may be the opposite. Interestingly, since the size of each bubble is proportional to the coffee consumed relative to working hours, we can see that the USA consumes a lot more coffee than other countries per hour worked. It would appear that in some countries, like Norway and Switzerland, coffee is a drink of leisure, while in others, like the US, coffee is used as a way to kick-start work.
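The "coffee per hour worked" ratio and the role of a constant amplifying factor can be sketched as follows. This is a Python illustration, not the app's R code. How exactly the factor enters the app's calculation is not spelled out in the post, so applying it to the ratio here is an assumption, and the country names and numbers are hypothetical.

```python
SCALE = 1_000_000  # amplifying factor quoted in the post


def coffee_per_hour(kg_per_capita, annual_hours):
    """Coffee consumed per person per hour worked."""
    if annual_hours <= 0:
        raise ValueError("annual_hours must be positive")
    return kg_per_capita / annual_hours


# Placeholder values (hypothetical, for illustration only).
countries = {
    "Country A": {"kg_per_capita": 9.0, "annual_hours": 1500},
    "Country B": {"kg_per_capita": 4.0, "annual_hours": 2000},
}

ratios = {
    name: coffee_per_hour(c["kg_per_capita"], c["annual_hours"])
    for name, c in countries.items()
}
# Multiplying by a constant only changes the units on the plot.
scaled = {name: ratio * SCALE for name, ratio in ratios.items()}

order_raw = sorted(ratios, key=ratios.get, reverse=True)
order_scaled = sorted(scaled, key=scaled.get, reverse=True)
print(order_raw, order_scaled)
```

Because the factor is the same for every country, it spreads the points out on the plot without changing which country ranks higher.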


(Potential) Future update for this app: Incorporate another dataset that lists the price paid to coffee bean growers, and see if the growers' earnings are related to GDP.

Hopefully you can find something that interests you after playing around with the app. 🙂


About Author

Chia-An (Anne) Chen

Anne Chen has a Masters degree in Bioengineering from the University of Pennsylvania. Prior to working at a biotech startup developing a liver cancer diagnosis device, Anne researched and evaluated open-source Electronic Health Records software for small-scale hospitals...