World Health Expenditure and its Relation to Life Expectancy

Posted on Nov 21, 2016

The World Health Organization (WHO) was founded in 1948 by the United Nations. The organization works with national governments and other associations to improve the quality of health for everyone. For the WHO and many other organizations, it is useful to know how to gauge the overall quality of health in a country and, even more, to know which factors influence it most. The WHO webpage provides life expectancy, health expenditure as a % of GDP, health expenditure per capita, and other indices for each country. One often hears on the news that the US has the highest health expenditure as a % of GDP in the world, yet many countries have better health indices. Here, I use data that I scraped from the WHO webpage to show that % of GDP expenditure is not the best predictor of health indices, and that expenditure per capita has a stronger relationship with life expectancy. The resulting linear model can serve as a guide to how much life expectancy might change given a certain change in expenditure per capita.


Figure 1. The colors of the bars represent groupings by the value of male life expectancy.

The plot of the two expenditure measures in Figure 2 shows that expenditure per capita has a much stronger relationship with male life expectancy than does % of GDP expenditure. In fact, expenditure per capita appears to grow exponentially with male life expectancy.


Figure 2
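To make the idea of an exponential relationship concrete, here is a minimal sketch using only the Python standard library. The numbers are made up for illustration and are not the WHO data; the function name `pearson_r` is my own. It shows that when spending grows exponentially with life expectancy, the correlation on the raw scale is well below 1, while the correlation with the natural log of spending is essentially perfect.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative values, NOT WHO data:
# spending is constructed to grow exponentially with life expectancy.
life_expectancy = [50.0, 55.0, 60.0, 65.0, 70.0, 75.0, 80.0]
spend_per_capita = [20.0 * math.exp(0.2 * (y - 50.0)) for y in life_expectancy]

# Correlation on the raw scale versus after taking the natural log of spending.
r_raw = pearson_r(spend_per_capita, life_expectancy)
r_log = pearson_r([math.log(s) for s in spend_per_capita], life_expectancy)

print(f"raw scale r = {r_raw:.3f}, log scale r = {r_log:.3f}")
```

With real data the log-scale correlation will not be exactly 1, but a clear jump between the two values is the signal that a log transform is worth trying.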

This exponential relationship between expenditure per capita and male life expectancy can be exploited to create a linear model. In this model, we ask the question “how does male life expectancy change with the natural log of the expenditure per capita?” The resulting fit of the linear model to the data from the WHO website is shown in Figure 3. The intercept and slope coefficients for the model are β0 = 38.916 (years) and β1 = 4.657 (years per unit of ln($)). The p-values for the two coefficients are much less than 0.05, so both are statistically significant at the 5% level. The slope β1 is positive, and because the predictor is a natural log, it gives the change in male life expectancy for each unit increase in ln(expenditure per capita), that is, for each e-fold (roughly 2.72 times) increase in spending.


Figure 3
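As a rough guide to reading the coefficients, the sketch below plugs the reported β0 and β1 into the model and shows how a least-squares fit on ln(expenditure) could be computed. The function names are my own, and the demo points are generated from the reported coefficients rather than taken from the WHO data, so the fit simply recovers the values it was built from.

```python
import math

# Coefficients reported in the text for:
#   male life expectancy = B0 + B1 * ln(expenditure per capita)
B0 = 38.916  # years
B1 = 4.657   # years per unit of ln(dollars)

def predicted_male_life_expectancy(spend_per_capita):
    """Model prediction in years for a given expenditure per capita."""
    return B0 + B1 * math.log(spend_per_capita)

def gain_from_scaling(factor):
    """Predicted change in years when spending is multiplied by `factor`."""
    return B1 * math.log(factor)

def fit_log_linear(spend, life_exp):
    """Ordinary least squares of life_exp on ln(spend); returns (intercept, slope)."""
    xs = [math.log(s) for s in spend]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(life_exp) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, life_exp))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical spending levels; life expectancy is generated from the model
# itself, so this only demonstrates the mechanics of the fit.
demo_spend = [20.0, 50.0, 150.0, 400.0, 1000.0, 3000.0, 8000.0]
demo_life = [predicted_male_life_expectancy(s) for s in demo_spend]
fitted_intercept, fitted_slope = fit_log_linear(demo_spend, demo_life)

print(f"doubling spending: {gain_from_scaling(2.0):+.2f} years")
print(f"tenfold spending:  {gain_from_scaling(10.0):+.2f} years")
```

Under this model, the predicted gain depends only on the ratio of new to old spending, not on the starting level: doubling expenditure per capita adds β1·ln 2, or about 3.2 years, whether a country starts from a low or a high base.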

Although the model in Figure 3 appears to describe the data well, it is worth checking whether the assumptions behind a linear model are met. Figure 4 shows some of the diagnostics used to validate the linear model in Figure 3. The positive diagnostics are:

  1. the residuals vs. fitted plot shows a roughly flat trend line, indicating no systematic curvature left in the residuals
  2. the scale-location plot shows that different regions of the fitted values have similar variance
  3. the residuals vs. leverage plot shows no influential points (points that are both outliers and have high leverage), and none of the points falls close to the Cook’s distance lines at 0.5 or 1.0.

The only possible drawback of this model is found in the normal Q-Q plot. While the relationship is linear for most of the quantiles, the points deviate slightly from linearity at the lower quantiles, suggesting that the distribution of the residuals may be skewed rather than normal.


Figure 4
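For readers who want to see what sits behind the residuals vs. leverage panel, here is a minimal sketch of leverage and Cook's distance for a one-predictor regression, written with the standard library only. The data are made up (with one deliberately extreme x value) and the function name `simple_ols_diagnostics` is my own; this is not the code that produced Figure 4.

```python
def simple_ols_diagnostics(xs, ys):
    """Residuals, leverages and Cook's distances for y = a + b*x fitted by OLS."""
    n = len(xs)
    p = 2  # number of fitted coefficients: intercept and slope
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x

    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residual variance estimate with n - p degrees of freedom.
    s2 = sum(e * e for e in residuals) / (n - p)
    # Leverage of each point: how far its x value sits from the mean of x.
    leverage = [1.0 / n + (x - mean_x) ** 2 / sxx for x in xs]
    # Cook's distance combines residual size and leverage.
    cooks = [(e * e / (p * s2)) * h / (1.0 - h) ** 2
             for e, h in zip(residuals, leverage)]
    return residuals, leverage, cooks

# Made-up data, NOT WHO data; the last point has an extreme x value.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 12.0]
residuals, leverage, cooks = simple_ols_diagnostics(xs, ys)

for x, h, d in zip(xs, leverage, cooks):
    flag = "check" if d > 0.5 else ""
    print(f"x={x:5.1f}  leverage={h:.3f}  cook={d:.3f}  {flag}")
```

A point is usually considered worth a closer look when its Cook's distance approaches 0.5 or 1.0, which is why those contours are drawn on the plot. A useful sanity check is that the leverages always sum to the number of fitted coefficients, here 2.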

It is possible that a transformation of expenditure per capita other than the natural log would reduce the departure from normality at the lower quantiles. To this end, a Box-Cox transformation may be performed to find a different linear relationship. However, considering most of the diagnostics, the present model shown in Figure 3 is a good linear model that directly relates expenditure per capita to male life expectancy. The results here suggest that it may be more prudent to discuss health indices with respect to expenditure per capita in a given country rather than % of GDP expenditure.
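As a sketch of how the Box-Cox idea could be explored, the code below applies the Box-Cox power family to the predictor and picks the exponent λ that gives the smallest residual sum of squares on a grid. This is a simplification on my part: the classical Box-Cox procedure transforms the response and selects λ by maximum likelihood. The data are generated from the reported log-linear model, not taken from the WHO, so the search should land on λ = 0, which is the log transform.

```python
import math

def box_cox(value, lam):
    """Box-Cox power transform; lam = 0 is defined as the natural log."""
    if abs(lam) < 1e-12:
        return math.log(value)
    return (value ** lam - 1.0) / lam

def residual_sum_of_squares(xs, ys):
    """RSS of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

def best_lambda(spend, life_exp, lambdas):
    """Grid value of lambda whose transformed predictor gives the lowest RSS."""
    return min(
        lambdas,
        key=lambda lam: residual_sum_of_squares(
            [box_cox(s, lam) for s in spend], life_exp
        ),
    )

# Hypothetical spending levels; life expectancy generated from the reported model.
demo_spend = [20.0, 50.0, 150.0, 400.0, 1000.0, 3000.0, 8000.0]
demo_life = [38.916 + 4.657 * math.log(s) for s in demo_spend]

grid = [i / 4 for i in range(-8, 9)]  # -2.0 to 2.0 in steps of 0.25
chosen_lambda = best_lambda(demo_spend, demo_life, grid)
print("lambda with lowest RSS:", chosen_lambda)
```

On the real data, a λ noticeably different from 0 would suggest that a power of expenditure per capita fits better than its log, at the cost of a less intuitive interpretation of the slope.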

