Data Analysis on Stress Causes and Outcomes

Posted on Jun 13, 2021
The skills I demoed here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.


What does the data say are the sources of stress, and how can they be managed for a purposeful impact?


What is stress?

Simply put, stress is the feeling of being overwhelmed or unable to cope with mental or emotional pressure. In the past few years there has been a staggering increase in the number of stress-related issues. Stress results in “accidents, absenteeism, employee turnover, diminished productivity, and direct medical, legal, and insurance costs” that cost the United States $300 billion every year.
According to The American Institute of Stress:

  • About 33 percent of people report feeling extreme stress
  • 77 percent of people experience stress that affects their physical health
  • 73 percent of people have stress that impacts their mental health
  • 48 percent of people have trouble sleeping because of stress

Stress can affect all aspects of life, including emotions, behaviors, thinking ability, physical health, professional achievements and personal relationships.

Why do we care?

The first step in fixing any issue is to know its causes. The benefit of this analysis is twofold. On a personal level, identifying the triggers of stress can help improve lifestyle and emotional and mental health, and it can point to the changes needed to manage and lower stress levels. From a business perspective, reducing stress can help increase productivity, energy levels and employee engagement, indirectly improving a company’s bottom line.
Several kinds of organizations could also benefit by tailoring their marketing programs around stress management, serving both a higher purpose and their business outcomes: fitness centers, wellness facilities, meditation and yoga camps, sports wearable companies, and food catering companies and cafeterias.

The Data

For the analysis I chose a lifestyle and wellbeing dataset of survey responses.
There are 24 attributes describing how we live our lives and thrive both professionally and personally. They reflect how well we shape our lifestyle, habits and behaviors to maximize overall life satisfaction along the following five dimensions:

  • Healthy body, reflecting your fitness and healthy habits
  • Healthy mind, indicating how well you embrace positive emotions
  • Expertise, measuring the ability to grow your expertise and achieve something unique
  • Connection, assessing the strength of your social network and your inclination to discover the world
  • Meaning, evaluating your compassion, generosity and how much 'you are living the life of your dream'

Python libraries used

  • Pandas
  • Matplotlib
  • Seaborn
  • Numpy

Exploratory Data Analysis

In this dataset stress is measured on a scale of 0 to 5. Most of the variables are either binary or take a fixed set of values. Given how the values are represented, I used average stress as the dependent variable and then analyzed how it changes across the values of the other variables.
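The approach described above, averaging the stress score within each value of another variable, can be sketched with a pandas group-by. This is a minimal sketch rather than the original notebook: the column names `DAILY_STRESS` and `BMI_RANGE`, the helper `average_stress_by`, and the handful of rows are all made up for illustration.

```python
import pandas as pd

# Hypothetical survey rows; the column names and values are illustrative only
# and are not taken from the original dataset.
survey = pd.DataFrame({
    "DAILY_STRESS": [1, 3, 2, 4, 5, 3],
    "BMI_RANGE":    [1, 1, 1, 2, 2, 2],
})


def average_stress_by(df: pd.DataFrame, column: str,
                      stress_col: str = "DAILY_STRESS") -> pd.Series:
    """Return the mean stress score for each distinct value of `column`."""
    return df.groupby(column)[stress_col].mean()


# One mean stress value per BMI group, indexed by the group label.
avg_by_bmi = average_stress_by(survey, "BMI_RANGE")
print(avg_by_bmi)
```

The same helper can be reused for any other categorical column, which keeps every comparison in the analysis on the same footing.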

Data on BMI, fruits and veggies, and average stress

BMI in the data is categorized into two groups: people with a BMI below 25 and people with a BMI above 25. The graph shows that average stress is higher in the higher-BMI group: people with a BMI above 25 report 8.6% higher average stress than people with a BMI below 25.

1 - BMI below 25, 2 - BMI above 25


8.6% increase in average stress
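A percentage like the one above is presumably the relative difference between two group means. The sketch below shows that arithmetic under that assumption; the function name `percent_change` and the two mean values are hypothetical and are not the figures behind the 8.6% result.

```python
def percent_change(baseline: float, comparison: float) -> float:
    """Percentage change of `comparison` relative to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (comparison - baseline) / baseline * 100.0


# Illustrative group means only; these are not values from the survey.
mean_stress_bmi_below_25 = 2.50
mean_stress_bmi_above_25 = 2.75

change = percent_change(mean_stress_bmi_below_25, mean_stress_bmi_above_25)
print(f"{change:.1f}% change in average stress")
```

A positive result reads as an increase over the baseline group and a negative result as a decrease, so the choice of baseline group should be stated alongside each percentage.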

As with physical health, stress levels are also affected by the intake of fruits and vegetables, and sure enough the graph reflects this. People who have 5 servings of fruits and veggies a day experience a 16.1% decrease in average stress.


16.1% decrease in average stress

Data on Sleep hours and average stress

We cannot underestimate the importance of sleep for a better quality of life. An obvious question on looking at the graph is whether people who sleep just one hour a day can really have lower stress levels. Looking into the details, I found that this group makes up a very small share of the survey population, 0.1% to be precise. I considered that number too small to be meaningful and excluded it from the analysis. The resulting graph makes much more sense: average stress declines as sleep hours increase.
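One way to exclude a group that is too small to be meaningful is to drop every category whose share of respondents falls below a threshold before averaging. The sketch below assumes that approach; the column names, the `drop_rare_groups` helper, the 1% threshold and the synthetic rows are all hypothetical.

```python
import pandas as pd


def drop_rare_groups(df: pd.DataFrame, column: str,
                     min_share: float = 0.01) -> pd.DataFrame:
    """Keep only rows whose value in `column` covers at least `min_share` of all rows."""
    shares = df[column].value_counts(normalize=True)
    keep = shares[shares >= min_share].index
    return df[df[column].isin(keep)]


# Synthetic data: 999 respondents sleeping 6 to 8 hours and a single
# respondent sleeping 1 hour. None of these rows come from the survey.
survey = pd.DataFrame({
    "SLEEP_HOURS":  [6, 7, 8] * 333 + [1],
    "DAILY_STRESS": [4, 3, 2] * 333 + [1],
})

filtered = drop_rare_groups(survey, "SLEEP_HOURS", min_share=0.01)
avg = filtered.groupby("SLEEP_HOURS")["DAILY_STRESS"].mean()
print(avg)
```

Filtering on share rather than hard-coding the value to remove makes the exclusion rule explicit and reusable for other variables.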


Data on social network, core circle and average stress

According to the survey question, social network represents the number of interactions in a day. One might expect social interaction to lower stress levels, but the data shows something more interesting. The responses range from 1 to 10. Notice that both having no interaction and meeting 10 people are associated with comparatively higher stress levels. On the other hand, a larger core circle, which counts the number of people close to you, is associated with lower average stress.

Places visited and sufficient income

The number of places visited in a year has a favorable effect on stress levels.

People who visit 10 places in a year experience 21.5% less stress than people who don’t visit any.


21.5% decrease in average stress

Income in this dataset is again categorized into two groups: people with sufficient income and people with hardly sufficient income. People with hardly sufficient income have 16.7% higher average stress levels.

1 - Hardly Sufficient, 2 - Sufficient

16.7% increase in average stress

Weekly meditation and time for passion

Weekly meditation and time for passion have a therapeutic effect on stress levels. Based on the given data, people who meditate 10 hours a week have a 32.1% decrease in average stress levels. Many people may not have 10 hours a week for meditation, but the graph suggests that even 1 to 3 hours may help lower stress significantly.

32.1% decrease in average stress

As per the survey questions, time for passion is the total number of hours people spend in a day doing something they like and are passionate about. Spending 1 to 3 hours a day may help lower stress levels, though any time spent beyond 4 hours does not make much difference.

26.2% decrease in average stress

Finally, I created a heatmap to help identify the highly correlated variables. Each square shows the correlation between the variables on its two axes: the larger the number and the darker the color, the higher the correlation between the two. A positive or negative number indicates a positive or negative correlation, respectively.
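A heatmap of this kind can be produced from a pairwise correlation matrix. The following is a minimal sketch assuming Pearson correlation via pandas and `seaborn.heatmap` for the plot; the column names, the synthetic data and the `plot_heatmap` helper are invented for illustration and do not reproduce the original figure.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the survey; columns and values are invented.
rng = np.random.default_rng(0)
n = 200
meditation = rng.integers(0, 11, size=n)
survey = pd.DataFrame({
    "WEEKLY_MEDITATION": meditation,
    "SLEEP_HOURS": rng.integers(4, 10, size=n),
    # Stress is constructed to fall as meditation rises, plus noise,
    # then clipped to the 0 to 5 scale.
    "DAILY_STRESS": np.clip(5 - 0.3 * meditation + rng.normal(0, 1, size=n), 0, 5),
})

# Pairwise Pearson correlation between all numeric columns.
corr = survey.corr()


def plot_heatmap(corr_matrix: pd.DataFrame):
    """Draw an annotated heatmap of a correlation matrix and return its axes."""
    # Plotting libraries are imported here so the correlation step
    # above runs even where they are not installed.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(6, 5))
    # A diverging palette fixed to [-1, 1] separates positive from negative values.
    sns.heatmap(corr_matrix, annot=True, fmt=".2f",
                cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
    ax.set_title("Correlation between survey variables")
    fig.tight_layout()
    return ax


# Correlations with stress, from most negative to most positive.
print(corr["DAILY_STRESS"].sort_values())
```

Calling `plot_heatmap(corr)` returns the Matplotlib axes, which can then be displayed in a notebook or saved to a file. Sorting a single column of the matrix, as in the last line, is a quick way to rank variables by their correlation with stress without reading values off the plot.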


After analyzing more than 10 attributes across the various dimensions, we can conclude the following:

    • Based on the given data, BMI has the strongest positive correlation with stress (coefficient = 0.8).
    • Weekly meditation and time for passion have the strongest negative correlations (coefficient magnitudes 0.21 and 0.15, respectively).
    • Balance is key when it comes to social networking: having no social interaction or having too much may cause stress levels to increase.
    • Hours of sleep, exercise and good nutrition can help lower stress reasonably.
    • The correlation of stress with supporting others or donating is not very significant.

About Author

Rupali Pahouja

NYC Data Science Academy Fellow, proficient in Python, R, and Machine Learning skills.