Data Trends on US Obesity
The skills demonstrated here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
Introduction
The main objective of this study was to examine obesity data and demographics in the United States. Obesity remains an important issue because almost half of the world's adult population could be overweight or obese by 2030, according to The Global Obesity Threat published by McKinsey & Company. Obesity is often related to health conditions such as heart disease, stroke, type 2 diabetes, and certain types of cancer, in addition to contributing to economic and productivity costs.
My main goal was to identify distinct segments with high obesity within the US population. Market segmentation could allow for more effective targeting of products and marketing campaigns, such as promotion of gym memberships, healthy eating programs, or healthcare discounts, in order to move Americans towards healthier weights.
The datasets for analysis were obtained from Data.gov. The datasets include national and state-specific data on youth and adults' diet, physical activity, and weight status across years and across different demographics. Youth data were collected every two years from 2011 to 2015, and adult data were collected every year from 2011 to 2016. Please refer to my Shiny app (https://kellyho.shinyapps.io/ShinyProject/) for links to the raw datasets.
Data Analysis
The heat maps below show the percentage of the population for the attribute selected by the user. The selectable attributes include obesity, diet, and physical activity measures for both the youth and adult groups. In the obesity data, the southern states have the highest obesity rates, whereas states with a more active lifestyle, such as Colorado and Hawaii, have the lowest.
The user can further explore diet and physical activity patterns across the states with the heat map. Another way to investigate the correlation between obesity and the behavioral factors is through scatter plots.
Youth Obesity Trend Data
The scatter plots below show the general trend between obesity and each of the behavior attributes included in the heat map selection. The data points on each scatter plot represent individual states' data over time.
The graphs below show that states where a higher percentage of the population consumes fruits and vegetables tend to have a lower percentage of obesity. Interestingly, physical activity does not have a significant correlation with youth obesity levels. This suggests that a diet-focused strategy might be more beneficial in moderating youth obesity.
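One way to put a number on a scatter plot trend like this is the Pearson correlation coefficient. The following is only a minimal sketch, not the code behind the Shiny app: the helper name `pearson_r` and the sample percentages are made up for illustration and are not taken from the Data.gov datasets.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical state-level percentages (illustrative only, not real data).
fruit_pct = [58.0, 61.5, 63.0, 66.5, 70.0]
obesity_pct = [16.5, 15.0, 14.2, 12.8, 11.0]

r = pearson_r(fruit_pct, obesity_pct)
print(f"Pearson r = {r:.2f}")
```

A value of r close to -1 would correspond to the downward-sloping pattern described for fruit and vegetable consumption, while a value near 0 would correspond to the weak relationship described for youth physical activity.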
Adult Obesity Trend Data
The adult obesity trend for fruit and vegetable consumption is similar to that of the youth group. Unlike in the youth group, however, physical activity also has a significant correlation with adult obesity. For each category of physical activity, states where a higher percentage of the population is physically active have lower obesity rates, and the inverse relationship holds for no physical activity. Both diet and physical activity play an important role in weight management for the adult group.
Demographic Trends
In addition to obesity data provided for each state, the datasets also provided a breakdown of obesity across different demographics for each state as well as the whole nation. Below are a few interesting trends I observed from different demographics at the national level.
For the age demographic, obesity tends to stay at the same level within the youth group (age 14-17) but gradually increases in the young adult group (age 18-35) and peaks between the mid-40s and mid-50s.
For the ethnicity demographic, obesity trends are similar among the youth and adult groups. African American and Hispanic populations have the highest obesity rates, followed by Caucasian and Asian populations. But again, adults have a significantly higher obesity rate than the youth group.
Two additional demographic categories were provided in the adult dataset. Adults with lower education (less than high school) have higher obesity rates (close to 40%) than college graduates (low 30s). Lower-income populations also tend to have higher obesity rates.
Perhaps educational campaigns on a healthy lifestyle should start at an earlier age. Weight loss products and health insurance plans could be personalized toward low-education, low-income, and even certain ethnic subgroups of the population.
Sharp Rise of Obesity in Young Adults
The previous demographic data showed a unifying trend where adults have higher obesity rates compared to youth across all years. I was curious to see how obesity changes across age and if the change can be explained by the behavioral factors captured in the dataset.
I took a snapshot in time and picked the most recent year to construct the box plots below. The data points represent each state's data in 2015. The median obese population increases sharply in the young adult group, from age 18 to 35.
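The statistics a box plot draws for each age group can also be computed directly. The sketch below is an illustration under stated assumptions, not the original analysis: `box_summary` is a hypothetical helper, the group labels are placeholders based on the age ranges mentioned in this post, and the percentages are invented rather than taken from the dataset.

```python
import statistics

def box_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    # method="inclusive" treats the values as the full population of states.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

# Hypothetical obesity percentages, one value per state (illustrative only).
obesity_by_age = {
    "youth (14-17)": [11.0, 12.5, 13.0, 14.5, 16.0],
    "young adult (18-35)": [18.0, 21.0, 24.5, 27.0, 30.5],
    "mid-40s to mid-50s": [28.0, 31.0, 33.5, 35.0, 38.0],
}

for age_group, state_values in obesity_by_age.items():
    low, q1, median, q3, high = box_summary(state_values)
    print(f"{age_group}: median={median:.1f}, "
          f"IQR={q1:.1f}-{q3:.1f}, range={low:.1f}-{high:.1f}")
```

Comparing the medians from one group to the next is the numeric counterpart of reading the jump between boxes in the plots.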
Fruit consumption shows no obvious trend across age, and interestingly, vegetable consumption rises with age. This might seem to contradict the earlier trend observed for vegetable consumption, but it does not: the earlier trend was constructed within the adult data and the youth data separately, whereas the comparison below is made across groups. The box plots below simply indicate that the diet parameters in the dataset cannot explain the sharp increase in obesity for the young adult group, which was the question I was interested in.
As for physical activity, the levels in each category are either similar or higher for the young adult group. This again indicates that physical activity may not be the main contributor to the significant increase in young adult obesity. Other factors such as high-carbohydrate diets, sleeping patterns, stress levels, and changes in metabolism should be investigated.
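A toy example may help show why a within-group trend and an across-group comparison can point in opposite directions without contradicting each other. The numbers below are invented purely to illustrate the effect and are not drawn from the dataset; `pearson_r` is a hypothetical helper.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up percentages: within each group, more vegetables go with less obesity,
# but the adult group is higher than the youth group on both measures.
youth = {"veg": [50.0, 55.0, 60.0], "obesity": [14.0, 13.0, 12.0]}
adult = {"veg": [75.0, 80.0, 85.0], "obesity": [32.0, 30.0, 28.0]}

r_youth = pearson_r(youth["veg"], youth["obesity"])
r_adult = pearson_r(adult["veg"], adult["obesity"])
# Pooling both groups mixes the within-group trend with the gap between groups.
r_pooled = pearson_r(youth["veg"] + adult["veg"],
                     youth["obesity"] + adult["obesity"])

print(f"within youth: {r_youth:.2f}")
print(f"within adult: {r_adult:.2f}")
print(f"pooled across groups: {r_pooled:.2f}")
```

In this toy data the correlation is negative inside each group and positive once the groups are pooled, because the difference between the groups dominates the pooled comparison.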
Conclusion
- Youth obesity has a stronger correlation with fruit and vegetable consumption than with physical activity. Parents and schools should focus on providing a healthier diet to control youth obesity.
- Hispanic and African American populations have the highest prevalence of obesity, followed by Caucasian and Asian populations, for both youth and adults. Populations with lower education and income have higher obesity rates. Although genetics play a role, controllable behaviors such as diet and exercise can moderate obesity levels for adults.
- As populations with lower education and lower income have higher obesity rates, educational campaigns on a healthy lifestyle should start at an earlier age. Weight loss products and health insurance plans can also be personalized toward certain subgroups of the population.
- The sharp rise of obesity in young adults from age 18 to 35 could not be explained by the diet and physical activity attributes captured in this dataset. Further studies of factors such as sleeping patterns, stress levels, high-carbohydrate diets, and metabolism could be conducted to determine the principal cause of the observed trend.