USA x China: Who is winning the commodity trade war?

Posted on May 7, 2018


The United States has held the title of the world's biggest economy for a long time. However, China's economy has been growing at a pace that threatens that leadership. The Gross Domestic Product (GDP) of the US was worth 18.57 trillion dollars in 2016, while China's GDP was worth 11.2 trillion dollars in the same period.

Recently, the Trump Administration placed tariffs on Chinese products such as flat-screen televisions and medical devices. China retaliated by placing tariffs on products such as soybeans and pork. Exports and imports of these products can have a direct impact on GDP.

This project analyzes and visualizes commodity exports and imports around the world, with a special focus on China and the United States.


The dataset is from the United Nations Statistics Division and covers import and export trade values in USD for 5,000 commodities across most countries on Earth over the last 30 years. The size of the file is 1.25GB and a preprocessing step was necessary to reduce that size.

My first attempt was to create an image of the database inside R, which reduced the size of the file to less than 100MB. However, loading the data from that file took too long to be practical. The next logical attempt was to migrate to a SQL database. However, after creating the table, the file size was still greater than 1GB and required additional manipulation. The following steps were taken:

  • Drop unused columns
  • Normalize the database
  • Tune the database

The first step is straightforward, since I had no need to store values that were not going to be used. The next step was necessary because two text columns (commodity and category) had values repeated all over the dataset. Since text usually takes more storage space than an integer, two new domain tables were built to store the text values, and the main table references the unique identifier created for each value. For the last step, primary and foreign keys were created to improve query performance. The figure below shows the difference between the original and final database version:
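The normalization step can be illustrated with a small sketch. It is written in Python with the standard-library `sqlite3` module rather than the R and SQL setup used for the app, and the table names, column names, and sample rows are all hypothetical, chosen only to show the idea of moving repeated text into domain tables referenced by integer keys.

```python
import sqlite3

# Hypothetical toy rows: (country, year, commodity, category, flow, trade_usd).
# The values are made up for illustration and are not taken from the UN dataset.
raw_rows = [
    ("Country A", 2016, "Soya beans", "Oil seeds", "Import", 100.0),
    ("Country B", 2016, "Soya beans", "Oil seeds", "Export", 80.0),
    ("Country B", 2016, "Swine, live", "Live animals", "Export", 5.0),
]


def build_normalized_db(rows):
    """Load rows into an in-memory database with two domain tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(
        """
        CREATE TABLE commodity (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE category  (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE trade (
            id           INTEGER PRIMARY KEY,
            country      TEXT NOT NULL,
            year         INTEGER NOT NULL,
            commodity_id INTEGER NOT NULL REFERENCES commodity(id),
            category_id  INTEGER NOT NULL REFERENCES category(id),
            flow         TEXT NOT NULL,
            trade_usd    REAL
        );
        """
    )
    for country, year, commodity, category, flow, usd in rows:
        # Each distinct text value is stored once; duplicates are ignored.
        conn.execute("INSERT OR IGNORE INTO commodity (name) VALUES (?)", (commodity,))
        conn.execute("INSERT OR IGNORE INTO category (name) VALUES (?)", (category,))
        # The main table keeps only the integer identifiers.
        conn.execute(
            """
            INSERT INTO trade (country, year, commodity_id, category_id, flow, trade_usd)
            VALUES (?, ?,
                    (SELECT id FROM commodity WHERE name = ?),
                    (SELECT id FROM category WHERE name = ?),
                    ?, ?)
            """,
            (country, year, commodity, category, flow, usd),
        )
    conn.commit()
    return conn


conn = build_normalized_db(raw_rows)

# Joining back to the domain tables recovers the original text values.
joined = conn.execute(
    """
    SELECT t.country, c.name, g.name, t.flow, t.trade_usd
    FROM trade AS t
    JOIN commodity AS c ON c.id = t.commodity_id
    JOIN category  AS g ON g.id = t.category_id
    ORDER BY t.id
    """
).fetchall()
```

In this toy example the text "Soya beans" is stored once in `commodity` even though it appears in two trade rows, which is where the space saving would come from on a large table.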


This application has multiple tabs, each one offering a different approach to how we can compare the commodities.

The Bar Graph tab lets us choose a country and see which commodities contribute most to its total exports and imports. For China, the total export trade amount was more than US$ 750 billion. The top 10 categories, and the share of the total amount that those ten categories account for, are highlighted below.
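As a rough illustration of the calculation behind such a bar graph, the sketch below ranks categories by trade value and computes each one's share of the total. It is a Python sketch, not the app's R code; the function name `top_categories` and the category totals are hypothetical.

```python
def top_categories(totals, n=10):
    """Return the n largest categories as (name, value, percent_of_total)."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []  # nothing to rank, and avoids dividing by zero
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]
    return [(name, value, 100.0 * value / grand_total) for name, value in ranked]


# Hypothetical category totals in USD, for illustration only.
example_totals = {"electronics": 50.0, "machinery": 30.0, "textiles": 15.0, "toys": 5.0}
top_two = top_categories(example_totals, n=2)
combined_share = sum(share for _, _, share in top_two)
```

Here `combined_share` is the percentage of the total covered by the selected categories, which corresponds to the figure the tab reports for the top ten.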

The Map section, in contrast to the first tab, lets you compare countries' trade values for a specific commodity. The map below was generated by calculating the balance (export - import) with all commodities taken into consideration. We can see that China and the US are polar opposites in terms of trade balance: China shows a large positive balance, while the US shows a large negative one.
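The balance used for the map is simply exports minus imports per country. A minimal Python sketch of that aggregation follows; the record layout, the function name `trade_balance`, and the numbers are assumptions for illustration, not the app's actual code or data.

```python
from collections import defaultdict


def trade_balance(records):
    """Sum exports minus imports per country from (country, flow, usd) records."""
    balance = defaultdict(float)
    for country, flow, usd in records:
        if flow == "Export":
            balance[country] += usd
        elif flow == "Import":
            balance[country] -= usd
        # Any other flow label is ignored in this sketch.
    return dict(balance)


# Hypothetical records, for illustration only.
example_records = [
    ("Country A", "Export", 120.0),
    ("Country A", "Import", 100.0),
    ("Country B", "Export", 70.0),
    ("Country B", "Import", 90.0),
]
balances = trade_balance(example_records)
```

A positive value means the country exports more than it imports (a surplus), and a negative value means the opposite (a deficit).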

The last visualization option is a bubble chart. It lets a user see the behavior of two commodities over time by selecting more than one country and the flow (Export or Import). Some insights can be extracted from the graphs below. We see that within ten years China overtook the US in exports, and in 2016 China was still leading overall commodity exports. One curious detail in this graph is that in 2009 every country exported a lower trade value. This was probably caused by the global financial crisis of 2008-2009, which likely reduced the number of products exported or the value of each commodity.


Judging by trade balance, China is winning the commodity trade war. China has a positive balance, in contrast to the US, and that contributes positively to its gross domestic product, since net exports are one of its components. However, the trade balance is only one component in estimating a country's economy.

This app is flexible and generic, in the sense that the same analysis can be done for other countries as well.

App in ShinyIO

Code in GitHub

About Author

Guilherme Strachan

Guilherme Strachan is a software developer making his way into the Data Science field. He has a Master's Degree in Electrical Engineering with an emphasis in Computational Intelligence. He is skilled in problem solving, machine learning models and...