Global Life Expectancy Explorer: What happened when the floor dropped out?

Posted on Apr 14, 2017

Robert Frost, one of the great American poets of the early twentieth century, once stated, "In three words I can sum up everything I've learned about life: it goes on." If we take his observation and apply it to the average life expectancy of humans across the globe as a single group, we can say that life not only goes on but, fortunately, also goes up (see graph below).

Figure: World Average Life Expectancy

But, alas, the devil is always in the details. That is where this first Shiny data science project leads me: an exploration into the not-so-rosy life expectancy data around the world.

The Data

Although there are numerous open data sets around the web, one of my favorite sources is the World Bank, and for this endeavor specifically, the HealthStats portal published by the World Bank Group. The data set provides female and male life expectancy information for 253 countries and covers 1960 through 2014. It's a substantial collection of data for my initial inquiries.
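For readers who want to poke at similar data outside the app, here is a minimal sketch of a common first step: reshaping a wide table (one column per year) into long records that are easier to group and plot. It is written in Python rather than the R behind the Shiny app, and it assumes hypothetical column names `Country Name` and `Gender`; the actual HealthStats export may be organized differently. The sample rows are made-up placeholders, not World Bank figures.

```python
import csv
import io

# Hypothetical wide-format sample: one column per year.
# Country names and values are placeholders, NOT real World Bank data.
SAMPLE_CSV = """Country Name,Gender,1960,1961,1962
Country A,female,50.0,50.5,51.0
Country A,male,47.0,,48.0
Country B,female,60.0,60.4,60.8
"""


def wide_to_long(csv_text):
    """Turn one-column-per-year rows into (country, gender, year, value) records."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, cell in row.items():
            if not column.isdigit():  # skip the non-year columns
                continue
            if cell == "":  # skip missing observations
                continue
            records.append(
                (row["Country Name"], row["Gender"], int(column), float(cell))
            )
    return records


long_rows = wide_to_long(SAMPLE_CSV)
for record in long_rows[:3]:
    print(record)
```

Dropping empty cells at this stage is a design choice for the sketch; keeping them as explicit missing values would be just as reasonable if gaps in coverage matter to the analysis.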

Here We Go

Every investigation needs at least one tool. Here we leverage Shiny to dive past the very general trend line of the global population and graph separate box plots by gender for all countries to see if anything stands out.

project_1_srn_shot1

project_1_srn_shot2

Looking at the plots, it becomes immediately obvious that although the median trend of countries is positive, we have some serious outliers in the data. Overlaying the gender differences provides extra contrast.

project_1_srn_shot3
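The outlier points in a box plot are usually not hand-picked; most plotting libraries flag any value more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that conventional rule in Python to a list of made-up values for a single year. It is an illustration of the general rule, not the code behind the app, and the exact quartile definition (here the `inclusive` method of `statistics.quantiles`) can differ slightly between tools.

```python
import statistics


def boxplot_outliers(values, whisker=1.5):
    """Return the values a conventional box plot would draw as outlier points."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low = q1 - whisker * iqr
    high = q3 + whisker * iqr
    return [v for v in values if v < low or v > high]


# Made-up life expectancies for one year; NOT taken from the World Bank data.
one_year = [68.0, 70.0, 71.0, 72.0, 73.0, 74.0, 75.0, 30.0]
print(boxplot_outliers(one_year))  # prints [30.0]
```

Running the same function once per year and per gender would give a list of candidate years to investigate further.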

Why Did The Floor Drop Out?

The outliers show a significant drop in life expectancy. So which countries are causing them? Ideally, the Min/Max selection would already identify the data points and display them below the plot. That feature is still in progress, so for now we can use the tab panel that displays the raw data set.

project_1_srn_shot5

If we note the years of interest below the outliers in the box plot, we can locate the equivalent year column in the data set. Sorting that column in ascending order lets us identify the countries with the lowest life expectancy numbers. Filtering the primary plot on each of the first five countries in turn reveals the devil within the details.

project_1_srn_shot6
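The sort-and-pick step described above can also be done programmatically. The following Python sketch assumes the data has already been loaded into a dictionary keyed by country, with one value per year; the function name `lowest_in_year`, the country labels, and all numbers are hypothetical and only stand in for the real data set.

```python
def lowest_in_year(table, year, n=5):
    """Return the n (country, value) pairs with the smallest value for `year`."""
    present = [
        (country, years.get(year))
        for country, years in table.items()
        if years.get(year) is not None  # ignore countries with no value that year
    ]
    return sorted(present, key=lambda pair: pair[1])[:n]


# Placeholder countries and made-up values; NOT real life expectancy figures.
table = {
    "Country A": {1977: 19.5},
    "Country B": {1977: 61.0},
    "Country C": {1977: 58.2},
    "Country D": {1977: None},
    "Country E": {1977: 44.1},
    "Country F": {1977: 70.3},
    "Country G": {1977: 52.7},
}

for country, value in lowest_in_year(table, 1977):
    print(country, value)
```

Missing values are filtered out before sorting so that a country with no observation for the year is never mistaken for the minimum.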

Cambodia

After searching the affected years in Cambodia, we find the following significant events: the Cambodian Civil War (1970-1975) and the Khmer Rouge regime (1975-1979). The latter is well known for the "Khmer Rouge Killing Fields". More than 1 million people were killed during this period.

project_1_srn_shot7

Sierra Leone

Here we have the Sierra Leone Civil War (1991-2002), with 50K+ people killed. Sierra Leone is one of the countries of "Blood Diamond" infamy.

Rwanda

Rwanda was cursed with a civil war (1990-1994) that quickly became genocide (~1994). It was estimated that 800K+ people were killed in 100 days.

Zimbabwe

Zimbabwe primarily suffered from a devastated economy and massive food production shortages due to government actions during the affected years.

project_1_srn_shot10

Swaziland

Health issues seem to be the main culprit affecting Swaziland. Tuberculosis and HIV/AIDS contributed the lion's share of mortality, with HIV/AIDS being especially devastating (WHO data from 2002 shows that 64% of all deaths in the country were caused by this illness).

Where To Next

An initial glimpse deeper into the data shows some periods of painful rot beneath the healthy global trend in life expectancy. Continued enhancement of the Shiny app will provide more efficient insights into the what/when/how of country data points. The following modifications should help with that goal:

  • Integrate other health/cause-of-death/disease data sets
  • Integrate war/conflict data sets
  • Global maps for country selection/visualization
  • Fractal plot layouts: The graphs used show noticeable differences in gender life expectancy depending on whether the primary causes were conflict or health/hunger. Those insights are easier to see when plots are positioned next to each other.
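The gender comparison in the last item comes down to a per-year difference between two series, which can be computed before any plotting. Here is a small Python sketch under the assumption that each series is a dictionary mapping year to life expectancy; the helper name `gender_gap` and the numbers are hypothetical and carry no claim about any real country.

```python
def gender_gap(female, male):
    """Female minus male life expectancy for every year present in both series."""
    shared_years = sorted(female.keys() & male.keys())
    return {year: round(female[year] - male[year], 2) for year in shared_years}


# Made-up series for one placeholder country; NOT real data.
female = {1990: 60.0, 1991: 59.0, 1992: 58.5}
male = {1990: 56.0, 1991: 52.0, 1993: 50.0}

print(gender_gap(female, male))  # prints {1990: 4.0, 1991: 7.0}
```

Plotting this gap for several countries side by side is one way to compare how the difference behaves under different kinds of crises.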
