Data Measuring Stress Levels

Posted on May 5, 2022

Data Science Background

Stress is defined as a feeling of emotional or physical tension that can manifest as depression, anxiety, headaches, or nausea and can impact our work productivity and social life. If left untreated, stress can take over our lives. The good news is that there are many ways to combat stress and lower stress levels over time. This project assesses data on the stress levels of males and females over seven years and compares their associated support systems.

Data Overview

The data set was gathered to understand how to optimize well-being and which factors may predict a well-balanced life. The survey is posted on The Authentic Happiness Project website and is still active. For the purpose of this project, the data set only covers the years 2015-2021.

The data set contains 15,977 observations, each answering 24 questions about stress and support.

Data Analysis

Visualizing the Data

To visualize this data, I created an R Shiny App where you can toggle between the years 2015-2021 and between male and female respondents.

Each graph measures either stress levels or support for the following age groups:

  • Less than 20
  • 21 to 35
  • 36 to 50
  • 51 or more

Starting at the top left, the first graph shows the daily stress levels people experience, rated from 0 to 5.

The upper right-hand graph shows their support system, which is how many people they are close to and can confide in.

The lower left-hand graph shows the time spent in hours on weekly meditation.

The last graph is divided up by age group comparing stress levels to the daily number of hours spent on passions/enjoyable activities.

In creating my R Shiny App, I recognized that many factors contribute to daily stress levels. Out of the 24 factors surveyed, I felt these three were the most relevant to possible stakeholders because they show which survey takers have someone to speak to and what actions are being taken to lower stress.

Data Analysis

Looking at females in 2019 (the graphs shown above), women between the ages of 21 and 50 experience the most stress, measured by the percentage of each age group reporting daily stress levels of 4 to 5. With age, however, the general trend is toward less stress, with more respondents rating their stress levels 0 to 2.
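The comparison above comes down to the share of each age group reporting a daily stress level of 4 or 5. The app itself was built in R Shiny; the snippet below is only a minimal Python sketch of that calculation. The rows, the tuple layout, and the function name `high_stress_share` are assumptions made up for illustration and are not taken from the survey data.

```python
from collections import defaultdict

# Hypothetical survey rows: (year, gender, age_group, daily_stress 0-5).
# These values are invented for illustration; they are not survey results.
ROWS = [
    (2019, "Female", "Less than 20", 2),
    (2019, "Female", "21 to 35", 4),
    (2019, "Female", "21 to 35", 5),
    (2019, "Female", "21 to 35", 3),
    (2019, "Female", "51 or more", 1),
    (2019, "Female", "51 or more", 4),
    (2019, "Male", "21 to 35", 5),
    (2018, "Female", "21 to 35", 1),
]


def high_stress_share(rows, year, gender, threshold=4):
    """Return {age_group: fraction of respondents with stress >= threshold}."""
    totals = defaultdict(int)
    high = defaultdict(int)
    for row_year, row_gender, age_group, stress in rows:
        # Keep only the selected year and gender, like the app's toggles.
        if row_year == year and row_gender == gender:
            totals[age_group] += 1
            if stress >= threshold:
                high[age_group] += 1
    return {group: high[group] / totals[group] for group in totals}


shares = high_stress_share(ROWS, 2019, "Female")
for group, share in shares.items():
    print(f"{group}: {share:.0%} report stress of 4 or 5")
```

Using a share rather than a raw count keeps age groups of different sizes comparable.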

Comparing this to support systems, the trend for this group is that the inner circle grows on average as a person ages, which could account for the decreased stress levels since they have more people to confide in.

I don’t find it surprising that females ages 21 to 50 meditate less and dedicate less time to their passions, given that they report the most level 4 and 5 stress.

The app can produce the same kind of breakdown for any combination of gender and year, which gives a general picture of how the population is trending and, more importantly, helps identify focus points for improving daily stress levels.


The power of analyzing data comes in demonstrating its importance to others. With this data, there are two important questions we need to answer:

  1. Who needs this product?
  2. How is it applicable to them?

Who needs this product?

This data analysis could be used by meditation and talk therapy apps. These apps are created for people seeking relaxation, stillness, and guidance on combating the stresses of life. This audience aligns with the data collected because everyone who completed the survey went to a wellness site, so they are seeking out ways to better their lives.

How is it applicable to them?

I propose a two-step process for these companies to monetize this research.

  1. Create Target Marketing Groups
    Using this data, it is possible to create combinations of personas who are most in need and would benefit from their services.
  2. Create a Tiered Marketing System
    Once the groups are identified, they can be ranked by who is most likely to take the next step and download the apps. With a tiered system, companies allocate the most resources to the group that would give the highest return and less to those further down the ranking.

This method allows these companies to get the most out of their investment since they aren't spending advertising budget on people who wouldn't be responsive.
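The two steps above can be sketched in a few lines of Python. This is only an illustration: the persona names, the likelihood scores, the tier weights, and the function `tiered_budget` are all hypothetical, and a real scoring method would have to come from the company's own campaign data.

```python
# Hypothetical personas with an assumed likelihood of downloading the app.
# The scores are placeholders, not values derived from the survey.
PERSONA_SCORES = {
    "Female, 21 to 35, high stress, small support circle": 0.30,
    "Male, 21 to 35, high stress, no meditation": 0.20,
    "Female, 51 or more, low stress, large support circle": 0.05,
}


def tiered_budget(scores, budget, tier_weights=(0.6, 0.3, 0.1)):
    """Rank personas by score, group them into tiers, and split the budget.

    Returns {persona: (tier_number, allocated_budget)} with tier 1 on top.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_tiers = len(tier_weights)
    tier_of = {}
    for position, persona in enumerate(ranked):
        # Map the rank position onto a tier index (0 is the top tier).
        tier_of[persona] = min(position * n_tiers // len(ranked), n_tiers - 1)

    allocation = {}
    for tier, weight in enumerate(tier_weights):
        members = [p for p in ranked if tier_of[p] == tier]
        # Each tier's share of the budget is split equally among its personas.
        for persona in members:
            allocation[persona] = (tier + 1, budget * weight / len(members))
    return allocation


plan = tiered_budget(PERSONA_SCORES, budget=1000)
for persona, (tier, amount) in plan.items():
    print(f"Tier {tier}: {persona} -> {amount:.2f}")
```

The fixed 60/30/10 split is just one possible weighting; the point is that the top-ranked group receives the largest share.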

Further Research

This research is a great start, but can be improved by some of the following projects:

  1. Cross-reference with current events to create relevant topics on the app
    - Apps can create content that gears towards common feelings based on what is happening in the world/specific geographic areas
  2. Compare stress levels across different major events
    - This data spans many years and is ongoing, so it can be compared against major events
    - One example could be pre/during/post COVID
  3. Conduct A/B testing
    - This will communicate to companies the effectiveness of these campaigns and how to adjust them in the future
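For the A/B testing idea, one common way to compare two campaigns is a two-proportion z-test on their download rates. The post does not specify a method, so the choice of test, the function name, and the numbers below are all assumptions used purely for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    conv_* are the number of downloads, n_* the number of people shown
    each campaign. Returns (z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both campaigns convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical campaign results: 100 of 2000 downloads for A, 130 of 2000 for B.
z_stat, p_val = two_proportion_z_test(100, 2000, 130, 2000)
print(f"z = {z_stat:.2f}, p = {p_val:.3f}")
```

A small p-value would suggest the difference between campaigns is unlikely to be chance alone, which is the signal a company would use to adjust future campaigns.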

Thank you

Thank you for taking the time to read about my project! Please leave a comment or question below with your thoughts and ideas.

For more, feel welcome to connect with me on LinkedIn!



Featured Image:

Data Set:

Stress Information:

Berger, Fred K. “Stress and Your Health: MedlinePlus Medical Encyclopedia.” MedlinePlus, U.S. National Library of Medicine, 10 May 2020.



About Author

Jessica Rodriguez

Hello! I am Jessica Rodriguez. I was a math educator for two years before becoming an equity research assistant. Now I am looking to begin my career as a data scientist.