Data Visualizing Economics and Mortality

Posted on Oct 21, 2016


Countries in different regions of the world differ widely in economic terms.  Residents of wealthier countries are not only better off financially; they also tend to live much longer.  How do living conditions compare, and what might explain the differences?  This project visualizes the economic conditions of different income regions and shows how they compare over time.  The data for this project came from the databases of the World Bank and the World Health Organization.


While studying Economics at New York University, I saw a clip of Hans Rosling giving a fantastic presentation relating income and life expectancy for 200 countries over the last 200 years.  The presentation was interesting in that it showed how countries became wealthier and healthier at varying paces.  Each country had its setbacks from sudden disease outbreaks and political disturbances but nonetheless followed a positive trend overall.  As countries grew healthier through medical advances, they also began to grow wealthier.  What I wanted to see were the recent changes and possible explanations for them.


I. How Does the World Look Now?



GDP per capita appears to vary widely once life expectancy is between 70 and 80 years.  Most of the poorer nations seem to be in South Asia and Sub-Saharan Africa.  Interestingly, there is a cluster of longer-living but still poor regions in the top left corner of the graph.  It appears that a particular milestone must be reached before such an expansion of income can occur.


The World Bank's metadata was used to categorize nations into four income groups.  After taking the logarithm of average GDP per capita, the plot revealed a correlation between life expectancy and GDP per capita.
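The effect of the log transform can be sketched in a few lines of Python.  This is only an illustration, not the code behind the plots: the numbers are made-up placeholders rather than World Bank or WHO figures, the base-10 logarithm is an assumption (the post does not say which base was used), and the `pearson` helper is a hypothetical name introduced here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Placeholder values for illustration only; NOT World Bank or WHO figures.
gdp_per_capita = [500.0, 1500.0, 5000.0, 15000.0, 45000.0]
life_expectancy = [55.0, 62.0, 69.0, 75.0, 81.0]

# Assumed base-10 logarithm; any base gives the same correlation.
log_gdp = [math.log10(g) for g in gdp_per_capita]

r_raw = pearson(gdp_per_capita, life_expectancy)
r_log = pearson(log_gdp, life_expectancy)
print(f"raw GDP r = {r_raw:.3f}, log GDP r = {r_log:.3f}")
```

With skewed income values like these, the correlation on the log scale comes out closer to 1 than on the raw scale, which is the usual reason for plotting GDP per capita on a logarithmic axis.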


II. How Do Income Gaps Differ?

Income gaps were measured using the GINI coefficient reported by the World Bank for various countries.  A larger income gap, that is, higher income inequality, is expected to go with lower values of both average life expectancy and average GDP per capita.



The first plot shows several interesting patterns.  The High income gap category (GINI > 44.14) becomes less common moving from the Upper-Middle Income group to the Low-Income group.  However, the Middle-High income gap category (37.59 < GINI < 44.14) becomes more common from Upper-Middle Income to Low-Income.  This suggests that income grows increasingly concentrated among a select few as countries develop.

However, it also shows that the barrier to wealth is greater in poorer countries, where the distinction between the wealthy and the poor grows.  This is highlighted in the second plot, which splits the overall income gap into two levels (High = GINI > 34).  While wealthier countries show an extreme concentration of income among a select few, the overall spread of wealth is more even than in poorer countries.
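The GINI bands used in the two plots can be expressed as a small labelling step.  The sketch below is a guess at the shape of that step, not the original code: only the cut points 37.59, 44.14, and 34 come from the text, the "Lower" label and the handling of a value exactly equal to 44.14 are assumptions, the function names are hypothetical, and the sample rows are placeholders rather than World Bank values.

```python
from collections import Counter

def gap_level(gini):
    """Label a GINI value; only the 37.59 and 44.14 cut points come from the text."""
    if gini > 44.14:
        return "High"
    if gini > 37.59:
        # Assumption: a value of exactly 44.14 is counted as Middle-High.
        return "Middle-High"
    # Assumed catch-all; the post does not give the lower cut points.
    return "Lower"

def is_high_gap(gini, threshold=34.0):
    """Two-level split described for the second plot (High = GINI > 34)."""
    return gini > threshold

# Placeholder (income group, GINI) pairs for illustration only.
countries = [
    ("Low", 46.0), ("Low", 41.0), ("Low", 39.5),
    ("Upper-Middle", 48.2), ("Upper-Middle", 45.1), ("Upper-Middle", 33.0),
]

# Count how many countries fall into each (income group, gap level) cell.
counts = Counter((group, gap_level(gini)) for group, gini in countries)
for (group, level), n in sorted(counts.items()):
    print(group, level, n)
```

Counting countries per income group and gap level in this way gives the kind of table a stacked bar chart of income gaps could be drawn from.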

III. What Causes Such Differences in Life Expectancy?











Low-Income nations suffer from high rates of communicable, or contagious, diseases caused by bacteria, viruses, and other microorganisms.  Unfortunately, infectious diseases affect the entire population regardless of age, so residents tend to die earlier.  The development of medicine, vaccines, and a clean environment allows wealthier nations to avoid such illnesses.  The Low-Income regions are not without hope, however.  Cases of communicable diseases are decreasing rapidly over time, indicating better health and greater longevity.


[Figures: causes of death by income group (Low, Lower-Middle, Upper-Middle, High), with legend]


As noted previously, communicable diseases such as infectious and parasitic diseases are much more frequent in the Low-Income region than in any other region.  Fortunately, there is a definite decrease in such diseases over time everywhere.  However, cardiovascular diseases and malignant neoplasms rise over the years and as countries grow wealthier.  This appears to be a direct result of longevity, since these conditions mostly affect older people.



The benefit of lower mortality from communicable diseases is evident.  In 2012, all income groups saw growth in both life expectancy and GDP per capita, but it was especially noticeable in the lower income groups, which saw a marked boost.




We see growth stagnate as countries develop.  For the Low-Income regions, the points are widely separated over time, but in the High-Income regions the points mostly overlap, showing little change.  There appears to be a decrease in the marginal rate of development as countries grow wealthier.
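One simple way to put a number on how far apart the points sit over time is to take the change in a group's average between two years.  The sketch below assumes rows of (income group, year, average life expectancy); the helper name `change_by_group`, the start year 2000, and all values are placeholders chosen for illustration, not figures from the project.

```python
def change_by_group(records, start_year, end_year):
    """Return {group: end_value - start_value} for the two given years."""
    by_key = {(group, year): value for group, year, value in records}
    groups = sorted({group for group, _, _ in records})
    return {
        g: by_key[(g, end_year)] - by_key[(g, start_year)]
        for g in groups
        # Skip groups that lack a value for either year.
        if (g, start_year) in by_key and (g, end_year) in by_key
    }

# Placeholder (income group, year, average life expectancy) rows; not real data.
records = [
    ("Low", 2000, 52.0), ("Low", 2012, 60.0),
    ("High", 2000, 77.0), ("High", 2012, 80.0),
]

gains = change_by_group(records, 2000, 2012)
for group, gain in gains.items():
    print(f"{group}: {gain:+.1f} years")
```

A larger change for the Low-Income group than for the High-Income group would correspond to the widely separated versus overlapping points described above.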




There is a strong positive relationship between life expectancy and GDP per capita.  Residents of higher-income countries typically not only earn more money but also live longer.  Wealth appears to be more concentrated among the elite in higher-income regions, but the real gap between the rich and the poor is much more evident in poorer countries.  Finally, mortality from communicable disease is a serious threat in low-income areas, but conditions appear to be improving over time.


Next Steps:

In the future, this project could be rebuilt in Shiny so that the data for the intervening years can be shown, giving a more fluid movement through time.  Other factors should also be considered in visualizing a country's economic situation, such as the form of government, the primary level of production or industry, and economic conditions by gender.



About Author

James Lee

James Lee is currently a Data Analyst at Facebook via Crystal Equation and a Masters in Data Science student at the University of Washington. He has a background in Economics and Mathematics from New York University, and has...