Explore Global Insights in a World Atlas Shiny App
In April 2023, Posit released Shiny for Python, extending Shiny app development from R to Python as part of its broader support for Python, Julia, and other languages. While that introduction was both exciting and potentially helpful, it was a challenge to get the most out of it, because there were few example apps to look at for inspiration and almost no community discussion online. To make the app, I first had to find an interesting and robust dataset to work with. I chose a Global Country Information Dataset and imagined what a nice Python Shiny app might look like.
Inspired by Hans Rosling's viral TED Talk, The Best Stats You've Ever Seen, this World Atlas app was created to help users learn about the world visually. It lets users select and learn about the countries and socioeconomic indicators they're most interested in from two standpoints: an interactive world map and a variety of graphs that show how the world has changed over time. Many thanks to Khuzaima Shahid, Adriana Echeverri Romero, and Vinod Chugani for their mentorship in the process of analyzing the dataset and building the app.
Challenges of the project
In building the Python Shiny app, you might assume (as I did) that grouping data for countries and regions accurately and precisely would be easy, but combining and reconciling them was not simple. Country names have spelling variations, and a lot of the data had to be manually cleaned.
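As a rough illustration of the kind of cleanup that spelling variations call for, here is a minimal sketch using only the standard library. The `ALIASES` table and the `normalize_country` helper are hypothetical names introduced for this example, and the alias entries are illustrative; they are not the cleaning rules actually used in the app.

```python
import unicodedata

# Hypothetical alias table: maps spelling variants to one canonical name.
# The entries are illustrative; a real table would be built from the data.
ALIASES = {
    "united states of america": "United States",
    "usa": "United States",
    "cote d'ivoire": "Ivory Coast",
    "republic of korea": "South Korea",
}


def normalize_country(name: str) -> str:
    """Return a canonical country name for a raw spelling variant."""
    # Decompose accented characters and drop the combining marks,
    # so that "C\u00f4te" becomes "Cote".
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace curly apostrophes, collapse repeated whitespace, and lowercase.
    key = " ".join(no_accents.replace("\u2019", "'").split()).lower()
    # Fall back to the whitespace-trimmed original when no alias is known.
    return ALIASES.get(key, " ".join(name.split()))
```

Names that are not in the alias table pass through with only their whitespace tidied, so unknown variants stay visible for manual review rather than being silently changed.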
While doing exploratory data analysis, I realized that the visualizations I wanted to create would require merging several data frames containing the historical data for each country whose socioeconomic factors would be plotted. The next step was merging these datasets with additional information: ISO Alpha codes, which identify a country's location on an interactive Plotly map; commonly used regions (APAC/AMER/EMEA); and a dataset with all six continents labeled. Adding ISO codes, regions, and continents allowed for more creativity in visualizing each country's socioeconomic data for the past two decades.
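The merge described above could look roughly like the following pandas sketch. The column names (`country`, `iso_alpha`, `region`, `continent`) are assumptions, and the indicator values are placeholders rather than real statistics; "Atlantis" is included only to show how an unmatched name surfaces.

```python
import pandas as pd

# Placeholder values for illustration only, not real statistics.
history = pd.DataFrame({
    "country": ["France", "France", "Japan", "Atlantis"],
    "year": [2000, 2020, 2020, 2020],
    "life_expectancy": [1.0, 2.0, 3.0, 4.0],
})

# Lookup table with ISO alpha-3 codes, business regions, and continents.
lookup = pd.DataFrame({
    "country": ["France", "Japan"],
    "iso_alpha": ["FRA", "JPN"],
    "region": ["EMEA", "APAC"],
    "continent": ["Europe", "Asia"],
})

# A left join keeps every historical row; indicator=True adds a "_merge"
# column that flags rows whose country had no match in the lookup table.
merged = history.merge(lookup, on="country", how="left", indicator=True)
unmatched = sorted(merged.loc[merged["_merge"] == "left_only", "country"].unique())
merged = merged.drop(columns="_merge")

print(unmatched)
```

Listing the unmatched names after each merge is one way to catch spelling mismatches before they turn into missing countries on the map.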
Other challenges included working with different types of numerical data whose ranges were much smaller or larger in certain cases. To visualize the data correctly, each variable had to be matched with a suitable type of graph and with compatible variables. The columns of the data frames vary widely: percentages, life expectancy (ranging from 40 to 90), population ratios out of 1,000 for birth rate and maternal mortality rate, and GDP in billions of price-adjusted US dollars.
Technical challenges that arose included:
- Creating a requirements.txt file with package versions whose dependencies work together smoothly and do not break the app for users who want to run it in their local environment
- Fixing environment problems that surfaced because VS Code needed the correct Python version for the app.py file and a separate environment for exploratory data analysis in a Jupyter notebook
- Finding missing packages for certain graphs: "statsmodels" had to be installed and added to the requirements.txt file, whereas the equivalent functionality is already part of R
- Learning to create a user interface and write UI code with Python Shiny (some UI code had already been deprecated less than a year after Python Shiny went live, and the current code differed slightly from the very few existing apps that could be used for UI inspiration)
- Finding no Python Shiny apps online that also used Plotly geo-visualizations and could serve as inspiration
- Writing some HTML and CSS to tidy up the style and design of the final app version
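For the missing-package problem in the list above, a small pre-flight check can report which dependencies are absent before the app starts. This is a minimal sketch using the standard library; `missing_packages` is a hypothetical helper, and the list of import names is an assumption about what the app's requirements.txt contains.

```python
from importlib.util import find_spec


def missing_packages(import_names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for this app's dependencies; adjust to match requirements.txt.
    required = ["shiny", "plotly", "pandas", "statsmodels"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only confirms that a package can be found, not that its version matches the one pinned in requirements.txt.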
Deciding on the final view of the app was difficult. I had three goals:
1 - Present the data in neat visualizations that serve as an interactive learning experience,
2 - Create a great user experience with a logical flow of information, and
3 - Produce visually-appealing maps.
I tried ipyleaflet basemaps, GeoPandas, and Folium, and eventually settled on Plotly's Mapbox maps. At first, I thought it would look interesting to offer multiple ipyleaflet basemaps on one navigation tab. In the end, though, I opted for Plotly's Mapbox in the earth-terrain style for the home page and a second page with either a white or a dark base map depending on the time of day; after 5pm, the dark map appears. Overall, Plotly's Mapbox was not just the most functional and intuitive choice; it also offered the best aesthetic for displaying geo-visualization data. In contrast, I found little documentation for ipyleaflet, and some of its maps had been deprecated and were no longer available as of October 2023.
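The time-of-day switch can be sketched as a small helper. The style names here are assumptions: "carto-positron" and "carto-darkmatter" are built-in Plotly Mapbox styles, but the app may use different ones. Switching back to the light map at midnight is also an assumption, since only the 5pm cutoff is described above.

```python
from datetime import datetime, time

# Assumed style names: "carto-positron" (light) and "carto-darkmatter" (dark)
# are built-in Plotly Mapbox styles; the app may use different ones.
LIGHT_STYLE = "carto-positron"
DARK_STYLE = "carto-darkmatter"
EVENING_START = time(17, 0)  # 5pm cutoff


def pick_map_style(now=None):
    """Return the dark style from 5pm until midnight, otherwise the light style."""
    current = (now or datetime.now()).time()
    return DARK_STYLE if current >= EVENING_START else LIGHT_STYLE
```

The returned string could then be passed as the `mapbox_style` argument of a Plotly Express Mapbox figure, or set afterwards with `fig.update_layout(mapbox_style=...)`.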
When it came to creating the data visualizations, I again found Plotly far more effective than pandas, matplotlib, and seaborn: app users can zoom in and out of Plotly base maps and interact with Plotly's wide range of graphs, which helps them understand the information better than static graphs would.
Data for this project was sourced from Kaggle, Gapminder, and the World Bank. I also used a few other sources found online to add historical CO2 emissions, four-letter region codes, ISO codes, and continents. If I had to do it over again, I would pull more data straight from Gapminder, which offers many rich historical datasets that could have provided decades' worth of depth on the historical data tab. In retrospect, though, it was good practice merging the datasets to create the historical data page.
Discovering questions and answers
Through the course of working with the original dataset and then creating a second dataset with historical data, some research questions naturally unfolded.
1 - How can the world's information be simplified with geo-visualization?
2 - What insights can be found in the data from 2023 with interactive data visualizations?
3 - What insights, patterns, and trends in the historical data led us to the current global status?
Users of the app can answer these questions by exploring the maps to understand the world data through geo-visualizations. The interactive visualizations of the current-year data shed light on the state of the world in 2023, and the final page of the app offers interactive Plotly graphs for exploring the historical data.
Who would find the app useful
I kept ideal users in mind while developing the app. At first, the idea was to build a useful app for educators and researchers. At its current stage, there are a few ideal users:
- Data-Driven Decision-Makers: People who need socioeconomic data analysis and geo-visualizations.
- Educators and Advocates: Specialists who use technology and data for education and advocacy.
- Global Health Experts: Professionals who need annual or up-to-date health and environmental data.
- Economic Analysts and Environmentalists: Consultants who require data on economic and environmental trends.
If further developed, the app could have an even wider range of users.
Key takeaways and insights
Here are some of the more interesting graphics and insights I found in my own research:
Higher education correlates with lower birth rates around the world. This could reflect a couple of things. One is that women around the world have more opportunity to get an education and work and, subsequently, have fewer children. It may also reflect the amount of debt higher education creates, which may leave people reluctant to incur the additional expense of a large family.
More visualizations appear below and can also be found on the app dashboard: World Atlas App