Data Visualization on Migration Patterns in Europe
The skills the author demonstrates here can be learned in the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
This project visualizes the migration patterns followed by over a million migrants in the last 18 months, by means of an interactive map developed in "Shiny". While offering a dynamic picture of the migration flow through Eastern Europe, from Greece through the Balkans up to Austria, the project aims to analyze the composition of that flow in terms of country of origin and gender.
According to data from the United Nations High Commissioner for Refugees (UNHCR), the total number of refugees worldwide in 2016 is estimated at 14.5 million people. When internally displaced and "stateless" individuals are included, the total population of concern reaches a shocking 58 million, largely located (~75%) in Africa and Asia. While it is hard to conceive of such a huge flux of people fleeing war, poverty or persecution in their country of origin, the consequences of these displacements have recently become a central topic of debate within the European Union.
Over the last 8 years, more than 1.7 million migrants reached Southern Europe, either through Turkey or by crossing the Mediterranean Sea. At the same time, another 2.8 million registered Syrian refugees are currently located in Turkey.
Browsing the data in the interactive map
The UN refugee agency continuously records daily arrivals per country, allowing a precise mapping of the migration flow and thereby supporting the emergency response plan. While the project focuses on the arrivals, gender and origin recorded between October 2015 and June 2016, the complete database can be found here.
As shown in Fig. 1, the Shiny application appears as a dashboard, where the sidebar is used for the navigation while contents are displayed in the main tab.
The Balkan route tab shows the daily arrivals, combining a map view (either as a single frame or as an animation) with a time series for each country. As expected, the flow is rather discontinuous, with a number of "spikes" propagating from Greece to Macedonia (FYROM), Serbia, Croatia and Austria. The flow of migrants splits between Slovenia and Hungary up to mid 2015, when the latter closes its border, forbidding any further access. A similar policy is applied in Albania and Montenegro. Despite these restrictions, the flow does not appear to stop. A comparison between the 6 countries involved shows essentially the same trend.
As mentioned in the previous section, the second and most dangerous route towards Europe crosses central Africa and the Mediterranean Sea. While the Balkan route is quite well defined, both geographically and ethnically, the African route is much more complex, as it involves most North African countries, from Morocco to Egypt, and multiple destinations such as Spain, Malta and Italy.
In the last 18 months, about 100,000 migrants reached the Italian coast, mostly from Algeria, Libya and Egypt. Surprisingly, only 25% of the arrivals are refugees, while the majority comes from Nigeria, Eritrea, Gambia, Cote d'Ivoire and several other countries. The difficulties and risks associated with the African route have a clear effect on gender distribution: women and children account for only 26% of the total arrivals in Italy, compared with the 48% estimated in Greece. Overall, in both 2015 and 2016 the number of registered minors was larger than the number of women, for a total of 300,000 arrivals.
Due to the time constraints of the project, the application focuses mostly on the Balkan route and on "hosting countries". Furthermore, the data are downloaded and processed directly, without exploiting the flexibility offered by the UNHCR API. In the near future the app will be completed by merging the two routes into a single map and updating the underlying statistics in real time. I would also like to develop a similar map for the "countries of origin", in order to provide a complete migration pattern from the country of origin to the actual destination.
Appendix: Developing the Shiny App
In this final section, I will briefly describe the essential steps taken during the development of the web application. As anticipated in the title, I used the Shiny Dashboard framework for R.
Building the Table
One of the first steps in the development of the app was to build a reactive table depending on two inputs: a chosen dataset and a selection of its columns (multiple choices allowed). As the first input determines the choices available for the second input, I used a reactive observer. In contrast with the usual reactive expression, which is evaluated lazily, an observer executes its content as soon as its dependencies change (i.e. it uses eager evaluation).
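The observer pattern described above can be sketched as follows. This is a minimal illustration, not the app's actual code: the `datasets` list, its placeholder values, the input ids and the helper `pick_columns` are all hypothetical names introduced for the example.

```r
library(shiny)

# Hypothetical stand-ins for the processed tables; values are placeholders,
# not UNHCR figures.
datasets <- list(
  arrivals = data.frame(country = c("A", "B", "C"), value = c(1, 2, 3)),
  gender   = data.frame(country = c("A", "B"), women = c(1, 2), men = c(3, 4))
)

# Hypothetical helper: keep only the requested columns that exist in df.
pick_columns <- function(df, cols) {
  df[, intersect(cols, names(df)), drop = FALSE]
}

ui <- fluidPage(
  selectInput("dataset", "Dataset", choices = names(datasets)),
  checkboxGroupInput("columns", "Columns", choices = NULL),
  tableOutput("table")
)

server <- function(input, output, session) {
  # Observer (eager): re-runs as soon as input$dataset changes and
  # refreshes the choices offered by the second input.
  observe({
    cols <- names(datasets[[input$dataset]])
    updateCheckboxGroupInput(session, "columns",
                             choices = cols, selected = cols)
  })

  # Reactive expression (lazy): evaluated only when the table asks for it.
  selected_data <- reactive({
    df <- datasets[[input$dataset]]
    cols <- intersect(input$columns, names(df))
    req(length(cols) > 0)  # wait until the column choices have caught up
    pick_columns(df, cols)
  })

  output$table <- renderTable(selected_data())
}

# Build the app object; launch it with runApp(app).
app <- shinyApp(ui, server)
```

The `intersect` guard is there because, right after the dataset changes, `input$columns` may briefly still hold the column names of the previous dataset.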
Maps and time series
Once the datasets were available and processed, I created the main map. To do so, I used the plotly package for the world map and dygraphs for the time series. The latter makes it possible to compare arrivals over time across countries or to visualize a single country over a certain number of periods. The sampling frequency (i.e. the smoothness of the curves) can be set by the user.
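One way to expose a user-selectable sampling frequency is to aggregate the daily series before plotting it. The sketch below assumes the arrivals are stored as an xts object with one column per country; the function name `resample`, the column names and the values are placeholders, and the app itself may smooth the curves differently (for instance with the rolling average offered by `dyRoller`).

```r
library(xts)
library(dygraphs)

# Placeholder daily series (not real arrival counts), one column per country.
dates <- seq(as.Date("2015-10-01"), by = "day", length.out = 28)
daily <- xts(cbind(CountryA = rep(1:7, 4), CountryB = rep(7:1, 4)),
             order.by = dates)

# Hypothetical helper: sum the daily counts over the chosen period.
resample <- function(x, freq = c("days", "weeks", "months")) {
  freq <- match.arg(freq)
  if (freq == "days") return(x)
  # endpoints() marks the last row of each period;
  # period.apply() sums every column within those periods.
  period.apply(x, endpoints(x, on = freq), colSums)
}

weekly <- resample(daily, "weeks")

# Build the interactive chart; in a Shiny app this object would be
# returned from renderDygraph() and shown with dygraphOutput().
chart <- dyRangeSelector(dygraph(weekly, main = "Arrivals per week"))
```

In the app, the `freq` argument would come from a user input such as a select box, so that changing the frequency re-renders the chart with coarser or finer curves.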
Written in R, using RStudio. Deployed on shinyapps.io.
Contributed by Diego De Lazzari. He is currently in the NYC Data Science Academy 12-week full-time Data Science Bootcamp program, taking place from July 5th to September 23rd, 2016. This post is based on his second project, R Shiny (due in the 4th week of the program). The R code can be found on GitHub, while the app is hosted on shinyapps.io.