Studying Data and Exploring Food Across the World

Jonathan Liu

Posted on Aug 7, 2016


EDIT: Several updates and changes have been made since this blog post was published.

Please visit https://jonathanliu.shinyapps.io/FoodExplorerV5/ for the latest version of my app.

The source code on my GitHub has also been updated.

The dataset used in this app can be found on the Open Food Facts website.

Introduction

Traveling abroad can be painful, especially when you are trying to keep to your regular diet. Imagine you have to spend a few days in Russia and you do not speak Russian. What should you eat? McDonald's can be a safe choice, but what if you cannot find a familiar restaurant, or you want to keep to a healthy diet?

[Figure 1. Open Food Facts]

Open Food Facts might be useful in helping you find out what to eat. It is an open source database that started in France and provides nutrition facts for food products sold in each country, with all nutrition information available in English. This can be very helpful for foreigners trying to understand the food in a country. Currently, the database contains more than 90,000 records. All of the content is contributed by volunteers, and the entire database can be downloaded for free.

That said, a few issues make this great website less accessible. For example, although the search engine on the Open Food Facts website offers many options for digging into the dataset, its interface is not very user-friendly. Users have to switch back and forth between result pages to check product details, which makes searching quite inefficient. Therefore, in an attempt to build a more efficient explorer for the Open Food Facts database, I built a World Food Explorer with RStudio's Shiny web application framework.

Overview

[Figure 2. Overview]

As shown in Figure 2, the first page of this web app gives a brief overview of all food records in the database. As this summary shows, although more than 14 countries are included, the majority of product records come from France and other European countries, while users from the United States, Canada, Brazil, and Australia have also contributed thousands of records. This is understandable, since the project was founded in France.

Data Explorer

[Figure 3. Country Selector and Nutrition Filters]

The Explorer page provides an interface for digging into the dataset. The Nutrition Filters box at the top of the page (collapsed by default) provides slider filters for 10 major nutritional elements found in food products, including total calories, carbohydrates, and sugar. After selecting a target country at the top of the page, users check the nutrients they are interested in and adjust the filter ranges, and the app filters all products accordingly.
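The filtering step described above can be sketched in a few lines. The app itself is built in R with Shiny; the Python below is only an illustration of the logic, not the author's code. The records, the field names, and the choice to drop products with a missing value for an active filter are all assumptions.

```python
from typing import Dict, List, Optional, Tuple

Product = Dict[str, Optional[object]]

# Hypothetical records: names, fields, and values are made up for illustration.
PRODUCTS: List[Product] = [
    {"name": "Plain yogurt", "country": "France", "energy_kcal": 60.0, "sugars_g": 4.5},
    {"name": "Chocolate bar", "country": "France", "energy_kcal": 540.0, "sugars_g": 48.0},
    {"name": "Cane sugar", "country": "France", "energy_kcal": 400.0, "sugars_g": 100.0},
    {"name": "Cola", "country": "United States", "energy_kcal": 42.0, "sugars_g": 10.6},
    {"name": "Mystery snack", "country": "France", "energy_kcal": None, "sugars_g": 20.0},
]


def filter_products(
    products: List[Product],
    country: str,
    ranges: Dict[str, Tuple[float, float]],
) -> List[Product]:
    """Keep products sold in `country` whose values fall inside every active range.

    `ranges` maps a nutrient field to an inclusive (low, high) pair, mirroring
    a checked slider. Unchecked nutrients are simply left out of `ranges`.
    """
    matches = []
    for product in products:
        if product["country"] != country:
            continue
        keep = True
        for field, (low, high) in ranges.items():
            value = product.get(field)
            # Assumption: a missing value cannot satisfy an active filter.
            if value is None or not (low <= value <= high):
                keep = False
                break
        if keep:
            matches.append(product)
    return matches


if __name__ == "__main__":
    for item in filter_products(PRODUCTS, "France", {"sugars_g": (0.0, 50.0)}):
        print(item["name"], item["sugars_g"])
```

Leaving a nutrient out of `ranges` corresponds to an unchecked filter, so an empty dictionary returns every product for the chosen country.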

[Figure 4. DataTable]

Meanwhile, the Matching Items box lists all the products that match the filtering criteria and displays their nutrition facts along with the product names and packaging barcodes. Users can also click the “Info” button on the right side of each record to open the product's detail page on the Open Food Facts website.

[Figure 5. Scatter Plot]

At the bottom of the Explorer page, users can find another box called Correlation Between Nutrition. This box plots all the food records in the Matching Items box as a scatterplot, showing the relationship between two selected nutrients. This chart reveals some very interesting relationships.

For example, Figure 5 shows a scatterplot of sugar against carbohydrates per 100 grams of product, and you can immediately see a clear line of points along the x = y boundary. What does this line tell us? It means that in many food products, sugar is the only carbohydrate present! That sounds scary, but what on earth are those items? By narrowing down the nutrition filters, the app lets users investigate this sweet list further.
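The points on the x = y line can also be picked out numerically: a product sits on that line when its sugar content is (almost) equal to its total carbohydrates. The sketch below is a Python illustration rather than the app's R code; the sample records and the 2% tolerance are assumptions.

```python
from typing import Dict, List, Optional

# Hypothetical per-100 g values, made up for illustration.
SAMPLES: List[Dict[str, object]] = [
    {"name": "Cane sugar", "sugars_g": 99.5, "carbohydrates_g": 100.0},
    {"name": "Cola", "sugars_g": 10.6, "carbohydrates_g": 10.6},
    {"name": "Chocolate bar", "sugars_g": 48.0, "carbohydrates_g": 55.0},
    {"name": "White bread", "sugars_g": 3.0, "carbohydrates_g": 49.0},
    {"name": "Olive oil", "sugars_g": 0.0, "carbohydrates_g": 0.0},
]


def sugar_share(sugars_g: float, carbohydrates_g: float) -> Optional[float]:
    """Return sugar as a fraction of total carbohydrates, or None if undefined."""
    if carbohydrates_g <= 0:
        return None
    return sugars_g / carbohydrates_g


def on_diagonal(product: Dict[str, object], tolerance: float = 0.02) -> bool:
    """True when sugar accounts for essentially all carbohydrates (the x = y line)."""
    share = sugar_share(product["sugars_g"], product["carbohydrates_g"])
    return share is not None and abs(share - 1.0) <= tolerance


if __name__ == "__main__":
    print([p["name"] for p in SAMPLES if on_diagonal(p)])
```

Note that a product can sit on the diagonal with a low absolute sugar content, which is why the sugar filter is still needed to separate pure sugar from, say, a soft drink.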

[Figure 6. High Sugar Items]

Ha! As shown in Figure 6, it turns out that the products with extremely high sugar content are really just pure sugar or syrup. Feel relieved now? Well, let’s look further.

[Figure 7. High Sugar Items - Below 50g]

When we lower the sugar limit further, to below 50 grams per 100 grams of product, we finally find dentists’ top enemies: chocolate and sugary drinks. All those sweet devils are hiding in this range!

Summary

[Figure 8. Summary Tab]

Now, after exploring different types of food and nutrition items, users might want to save a list of items for further consideration.

This is what the Summary tab is for. To save items from the Explorer tab, users select them in the data table and then click the “Add Selections to Summary” button at the top right of the table. The number of saved items is shown in both the sidebar and the Explorer tab, and users can reset the list at any time by clicking the “Reset Selections” button in the sidebar.

Saved items can be reviewed in the Summary tab. This tab also calculates the average nutrition level of all selected items, compares it with the FDA’s suggested Daily Value (DV) for each nutritional element, and shows the average DV% across the items in the list. If the average DV% exceeds 50%, the corresponding info box turns yellow as a warning.
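The DV% calculation can be illustrated with a short sketch. This is Python written for explanation, not the app's R code. The reference amounts in `DAILY_VALUES` are assumed figures to be checked against the FDA's published table, and the saved items are invented.

```python
from typing import Dict, List, Optional

# Assumed reference amounts per day; verify against the FDA's current
# Daily Value table. The app may use different figures.
DAILY_VALUES: Dict[str, float] = {
    "fat_g": 78.0,
    "carbohydrates_g": 275.0,
    "proteins_g": 50.0,
}

# Hypothetical saved items (values per 100 g), made up for illustration.
SAVED_ITEMS: List[Dict[str, float]] = [
    {"fat_g": 30.0, "carbohydrates_g": 55.0, "proteins_g": 5.0},
    {"fat_g": 60.0, "carbohydrates_g": 55.0, "proteins_g": 45.0},
]


def average_dv_percent(
    items: List[Dict[str, float]],
    daily_values: Dict[str, float],
) -> Dict[str, Optional[float]]:
    """Average each nutrient over the saved items and express it as a % of its DV."""
    result: Dict[str, Optional[float]] = {}
    for field, daily_value in daily_values.items():
        values = [item[field] for item in items if item.get(field) is not None]
        if not values:
            result[field] = None  # nothing to average for this nutrient
            continue
        average = sum(values) / len(values)
        result[field] = 100.0 * average / daily_value
    return result


def warning_fields(dv_percent: Dict[str, Optional[float]], threshold: float = 50.0) -> List[str]:
    """Return the nutrients whose average DV% exceeds the warning threshold."""
    return [f for f, pct in dv_percent.items() if pct is not None and pct > threshold]


if __name__ == "__main__":
    percents = average_dv_percent(SAVED_ITEMS, DAILY_VALUES)
    print(percents)
    print("warnings:", warning_fields(percents))
```

The threshold comparison is strict, matching the wording "exceeds 50%", so a nutrient sitting at exactly 50% does not trigger a warning in this sketch.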

End Note

This Shiny app is designed as a replacement for the "unfriendly" user interface on the Open Food Facts website. While it improves the user experience, there is still more that can be done. For example, a download button would let users export their selected items. If you are interested, please feel free to fork the source code on my GitHub.

About Author

Jonathan Liu

Through years of self-learning on programming and machine learning, Jonathan has discovered his interests and passion in Data Science. With his B.B.A. in accounting, M.S. in Business Analytics, and two years of experience as operation analyst, he is...
