Data Analysis on The Mental Health Crisis
The skills demonstrated here can be learned in the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
GitHub | LinkedIn | Shiny
Introduction
Mental health is an issue that has recently come into the spotlight in mainstream media here in the United States. It is a multi-sector issue that involves healthcare professionals and all branches of government, and it extends to patients, caregivers, and family members. It affects every facet of the workplace and everyone who is on social media.
I wanted to dive in, look at the hard numbers, and see what story the data was telling. "Is it an issue?", "Just how bad is it?", and "Is it only impacting the United States?" were a few of the questions that came to mind when I first thought about this topic.
Goals
My goals were twofold for this analysis:
- First, to answer the questions above with hard data to support any claims.
- More importantly, to spread awareness of mental health, with some statistical insight, among those who may be affected by it directly or indirectly.
Data
- Time Period: 1985-2016
- Compiled by: World Bank, World Health Organization (WHO)
- Countries: 101 countries
- Gender: 2 gender classifications
- Age Groups: 6 different groups
- Yearly GDP: for that particular country
- Population: split specifically for a particular group
- # of Suicides: split specifically for a particular group
Data Scrubbing
The data was not consistent for the years before 2002 or for the final year, 2016. For 2016, a lot of data appeared to be missing because it had not yet been compiled or collected. For the years before 2002, data was missing from countries that had not yet started tracking or did not track consistently.
Therefore, I filtered the data accordingly:
- Time period shortened to 2002-2015
- Total of 51 countries
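The filtering step can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code used for the project. The record layout, the field names `country` and `year`, and the rule "keep only countries that report every year in the window" are all assumptions, since the exact criterion behind the 51 countries is not stated above. The sample rows are made up.

```python
from collections import defaultdict

START_YEAR, END_YEAR = 2002, 2015  # the window chosen above


def filter_records(records, start=START_YEAR, end=END_YEAR):
    """Keep rows inside [start, end] from countries that report every year.

    `records` is a list of dicts with 'country' and 'year' keys
    (hypothetical field names). The completeness rule is an assumption.
    """
    in_window = [r for r in records if start <= r["year"] <= end]

    # Collect the set of years each country reports within the window.
    years_by_country = defaultdict(set)
    for r in in_window:
        years_by_country[r["country"]].add(r["year"])

    required = set(range(start, end + 1))
    complete = {c for c, years in years_by_country.items() if years >= required}

    return [r for r in in_window if r["country"] in complete]


# Made-up example: country "A" reports every year, "B" is missing 2010.
sample = [{"country": "A", "year": y} for y in range(2000, 2017)]
sample += [{"country": "B", "year": y} for y in range(2002, 2016) if y != 2010]

kept = filter_records(sample)
```

With this rule, a country missing even a single year is dropped entirely. A looser threshold (for example, allowing one or two gaps) would keep more countries at the cost of less comparable trends.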
Data Analysis
Initially, I wanted to take a wide-angle lens approach and see everything on a macro level:
A decreasing global trend is what we hoped for, but it looks too good to be true.
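One way to check a global trend like this is to compute suicides per 100,000 people for each year, summing over every country and group. The sketch below assumes hypothetical field names (`year`, `suicides`, `population`) and uses made-up numbers. It is not the original analysis code.

```python
from collections import defaultdict


def rate_per_100k_by_year(records):
    """Return {year: suicides per 100,000 people}, pooled over all rows."""
    totals = defaultdict(lambda: [0, 0])  # year -> [suicides, population]
    for r in records:
        totals[r["year"]][0] += r["suicides"]
        totals[r["year"]][1] += r["population"]
    return {
        year: suicides * 100_000 / population
        for year, (suicides, population) in sorted(totals.items())
        if population > 0
    }


# Made-up rows standing in for (country, gender, age group) slices.
sample = [
    {"year": 2002, "suicides": 30, "population": 200_000},
    {"year": 2002, "suicides": 10, "population": 200_000},
    {"year": 2003, "suicides": 15, "population": 250_000},
    {"year": 2003, "suicides": 15, "population": 250_000},
]

yearly_rates = rate_per_100k_by_year(sample)
```

Pooling suicides and population before dividing weights each group by its size. Averaging the per-group rates instead would give small groups the same influence as large ones.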
How is the spread of the data for each country?
The spread is fairly wide for a good number of the countries.
Data on each country from the 2002-2015 period.
This chart doesn't seem to say much, but it does highlight that Europe & Central Asia account for the bulk of the highest suicide rates for this period.
Gender Data
It looks like males are more prone to suicide than females.
Is there a better way to visualize this?
Yes, this chart beautifully illustrates the split by gender. Note the two lines representing the average for females on the left and for males on the right.
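A split like this can be sketched by computing each gender's share of a country's suicides and then averaging the shares across countries. The field names (`country`, `sex`, `suicides`), the use of raw counts rather than per-capita rates, and the unweighted average are assumptions for illustration. The numbers are made up.

```python
from collections import defaultdict


def gender_share_by_country(records):
    """Return {country: {sex: share of that country's suicides}}."""
    counts = defaultdict(lambda: defaultdict(int))
    for r in records:
        counts[r["country"]][r["sex"]] += r["suicides"]

    shares = {}
    for country, by_sex in counts.items():
        total = sum(by_sex.values())
        if total == 0:
            continue  # skip countries with no recorded suicides
        shares[country] = {sex: n / total for sex, n in by_sex.items()}
    return shares


def average_share(shares, sex):
    """Unweighted mean of one sex's share across countries (None if empty)."""
    values = [s.get(sex, 0.0) for s in shares.values()]
    return sum(values) / len(values) if values else None


# Made-up example with two countries.
sample = [
    {"country": "A", "sex": "male", "suicides": 75},
    {"country": "A", "sex": "female", "suicides": 25},
    {"country": "B", "sex": "male", "suicides": 60},
    {"country": "B", "sex": "female", "suicides": 40},
]

shares = gender_share_by_country(sample)
male_avg = average_share(shares, "male")
female_avg = average_share(shares, "female")
```

The two averages correspond to the two reference lines in a chart of this kind. Because male and female populations are close in size in most countries, shares of raw counts are a reasonable first look, though per-capita rates would be more precise.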
Age Group Data
This disturbing chart shows that the elderly age brackets are at higher risk of suicide than the younger age groups.
This made me think: what if we evaluate generations instead of just age groups?
Adding GDP into the equation
Finding a country with Increasing Suicide Rates
I thought correlation would be the best way to find the outlier countries with increasing suicide rates, in contrast to the first chart, where we saw suicides trending down globally. This chart makes those outliers clearly visible.
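One way to apply correlation here is to correlate the year with each country's suicide rate: a coefficient near +1 means a steadily rising rate, and one near -1 a steadily falling rate. Which variables were actually correlated is not stated, so treating it as year versus rate is an assumption. The function names and sample values below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2:
        return 0.0  # not enough points to define a trend
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series has no defined correlation
    return cov / sqrt(var_x * var_y)


def rank_by_trend(rates_by_country):
    """Rank countries from most increasing to most decreasing.

    `rates_by_country` maps country -> {year: suicides per 100k}.
    """
    scored = []
    for country, series in rates_by_country.items():
        years = sorted(series)
        rates = [series[y] for y in years]
        scored.append((country, pearson(years, rates)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up series for three hypothetical countries.
sample = {
    "Rising": {2002: 10.0, 2003: 11.0, 2004: 12.0, 2005: 13.0},
    "Falling": {2002: 20.0, 2003: 18.0, 2004: 16.0, 2005: 14.0},
    "Mixed": {2002: 9.0, 2003: 12.0, 2004: 9.0, 2005: 12.0},
}

ranking = rank_by_trend(sample)
```

Correlation measures how consistent a trend is, not how steep it is, so a country with a small but steady increase can rank above one with a large but erratic increase. A fitted slope would capture the size of the change.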
Evaluating the U.S. (2nd on the list)
The most disturbing chart for me is this one. Remember that the older age groups were at higher risk of suicide? According to this chart, in the U.S. it is actually the 35-54 age group that is at the highest risk.
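A per-country comparison of age groups can be sketched by pooling suicides and population within each age group and picking the group with the highest rate. The field names (`country`, `age`, `suicides`, `population`), the country "X", the age labels, and the numbers are all made up for illustration and do not reproduce the U.S. figures.

```python
from collections import defaultdict


def highest_risk_age_group(records, country):
    """Return (age group with the highest rate, {age group: rate per 100k})."""
    totals = defaultdict(lambda: [0, 0])  # age -> [suicides, population]
    for r in records:
        if r["country"] == country:
            totals[r["age"]][0] += r["suicides"]
            totals[r["age"]][1] += r["population"]

    rates = {
        age: suicides * 100_000 / population
        for age, (suicides, population) in totals.items()
        if population > 0
    }
    if not rates:
        return None, rates  # no usable rows for this country
    return max(rates, key=rates.get), rates


# Made-up rows for a hypothetical country "X".
sample = [
    {"country": "X", "age": "15-24", "suicides": 10, "population": 100_000},
    {"country": "X", "age": "35-54", "suicides": 30, "population": 150_000},
    {"country": "X", "age": "75+", "suicides": 5, "population": 50_000},
    {"country": "Y", "age": "75+", "suicides": 50, "population": 50_000},
]

top_group, rates = highest_risk_age_group(sample, "X")
```

Running the same function for every country and comparing the results is one way to spot countries whose highest-risk group differs from the overall pattern.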
My Conclusion
From the analysis shown, we can infer the following:
- Global suicide rates decreased from 2002 to 2015
- This doesn't paint the whole picture
- Countries such as the United States have been outliers, with increasing suicide rates
- The United States is also an outlier in that the 35-54 age bracket is at the highest risk!
- The elderly population is at higher risk overall than the younger population
- Males have a much higher risk of suicide than females
Final Thoughts
I feel that this is a first step in the right direction toward better understanding suicide rates and the mental health crisis. I just wish more years of data were available from all countries for a more conclusive analysis.
The issues I encountered point to a few improvements that would help:
- Helping smaller countries and countries with smaller GDP better track their data
- Seeing if any missing historical data is available and providing it to the World Health Organization
- Fact-checking the data to ensure the integrity of the numbers
Next Steps
I would love to eventually revisit, rethink, and add to this project. These are some ideas worth considering:
- Bringing in more individual characteristics as factors to consider in this analysis (ethnicity and education, for example)
- Bringing in more tracked key indicators to add layers to this analysis (e.g., happiness index, unemployment rate)
- Incorporating the years of the COVID-19 pandemic
A brighter future
This analysis could help build a predictive model of future trends in suicide rates. The factor weights could then be used to better understand which factors affect mental health the most. Such a model could eventually become an effective tool to help combat this issue.
If you or anyone you know needs help:
National Institute of Mental Health: https://www.nimh.nih.gov/health/find-help
National Suicide Prevention Lifeline: 1-800-273-TALK (8255)
Sources:
World Health Organization (WHO): www.who.int
Kaggle Dataset: Dataset link
National Institute of Mental Health: www.nimh.nih.gov