New York City Leading Causes of Death Analysis

Yaxiong Huang
Posted on Jan 31, 2017

Introduction

The leading causes of death affect racial and ethnic groups in New York City differently. One group may have more deaths than another, and the gap can vary from one leading cause to the next. There may also be correlations among leading causes, deaths, race/ethnicity, sex, and health indicators. Such an analysis could be useful to the Department of Health for disease control and medical research. This post explores the following questions using the NYC Leading Causes of Death data (2007-2014) and the NYC Health Indicators (2012-2014).

  • How are deaths changing over time? Are there more deaths among males than females? What is the trend for each race?
  • Which leading causes drive the changes in the trend?
  • Which race has the highest number of deaths? Which leading causes contribute to it?
  • How do the health indicators relate to the causes of death?

Data Sources

NYC Leading Causes of Death data (2007-2014): https://catalog.data.gov/dataset/new-york-city-leading-causes-of-death-ce97f

NYC Health Indicators (2012-2014): https://www.health.ny.gov/statistics/community/minority/county/newyorkcity.htm

Data Cleaning

Merged the NYC Health Indicators data table into the NYC Leading Causes of Death data table.

Created new columns from the NYC Health Indicators and used R to update the column values manually.
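The merge step can be sketched as a left join on race/ethnicity. This is only an illustration in Python using the standard library, not the author's R code. The column names ("Race Ethnicity", "Deaths"), the indicator name, and all values are assumptions made up for the example.

```python
# Toy rows with placeholder values; these are NOT figures from either dataset.
death_rows = [
    {"Year": 2012, "Race Ethnicity": "White Non-Hispanic",
     "Leading Cause": "Heart Disease", "Deaths": 10},
    {"Year": 2012, "Race Ethnicity": "Other Race/ Ethnicity",
     "Leading Cause": "Heart Disease", "Deaths": 3},
]

# One record per race/ethnicity; the indicator name is hypothetical.
indicators_by_race = {
    "White Non-Hispanic": {"Suicide Mortality Rate": 1.5},
}


def merge_indicators(rows, indicators):
    """Left-join indicator columns onto death rows by race/ethnicity."""
    # Collect every indicator column so unmatched rows still get the key.
    columns = sorted({name for record in indicators.values() for name in record})
    merged = []
    for row in rows:
        record = indicators.get(row["Race Ethnicity"], {})
        merged.append({**row, **{name: record.get(name) for name in columns}})
    return merged


merged_rows = merge_indicators(death_rows, indicators_by_race)
```

Rows whose race/ethnicity has no indicator record keep their death counts and get None in the new columns, so nothing is silently dropped by the join.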

A glance at the Shiny App 

Shiny App

The Shiny app lets users compare deaths by leading cause, year, sex, and race/ethnicity.

The 2007-2014 death trend shows a dramatic drop from 2008 to 2009, and 2012 has the fewest deaths. These changes are driven by shifts in deaths from several major leading causes. From 2008 to 2009: Chronic Liver Disease and Cirrhosis; Nephritis, Nephrotic Syndrome and Nephrosis. From 2011 to 2012: Mental and Behavioral Disorders due to Accidental Poisoning and Other Psychoactive Substance Use; Human Immunodeficiency Virus Disease.
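One way to check which causes drive a year-to-year change is to total deaths by year and then compare each cause between the two years. The sketch below does that in Python on toy rows. The column names and counts are assumptions for illustration, and it assumes the death counts have already been cleaned to integers.

```python
from collections import defaultdict

# Toy rows with placeholder counts; NOT the real NYC figures.
rows = [
    {"Year": 2008, "Leading Cause": "Cause A", "Deaths": 50},
    {"Year": 2008, "Leading Cause": "Cause B", "Deaths": 30},
    {"Year": 2009, "Leading Cause": "Cause A", "Deaths": 35},
    {"Year": 2009, "Leading Cause": "Cause B", "Deaths": 32},
]


def total_deaths_by_year(rows):
    """Sum deaths per year, returned in year order."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["Year"]] += row["Deaths"]
    return dict(sorted(totals.items()))


def change_by_cause(rows, start_year, end_year):
    """Deaths in end_year minus deaths in start_year, per leading cause."""
    change = defaultdict(int)
    for row in rows:
        if row["Year"] == end_year:
            change[row["Leading Cause"]] += row["Deaths"]
        elif row["Year"] == start_year:
            change[row["Leading Cause"]] -= row["Deaths"]
    # Sort so the largest decreases come first.
    return sorted(change.items(), key=lambda item: item[1])


yearly_totals = total_deaths_by_year(rows)
cause_changes = change_by_cause(rows, 2008, 2009)
```

In this toy data the total falls between the two years, and the sorted list shows that "Cause A" accounts for the decrease while "Cause B" rises slightly.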

 

Analysis

So far we have looked at the death trend for males and females combined. Now we look at the trends for males and females separately.

Death Sex Trends

https://gist.github.com/tommyhuang1/807fcd45c437f8c8de3944ff23380078

Females have more deaths than males. The trend for females is very similar to the combined trend for males and females, which tells us that female deaths have a significant impact on the overall death trend.
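The similarity between the female trend and the overall trend can also be checked numerically, for example with a Pearson correlation between the yearly series. The following is a sketch on toy data, assuming "Sex" is coded as "F" and "M" and that counts are already integers. It is not taken from the author's gist.

```python
from collections import defaultdict
from math import sqrt

# Toy rows with placeholder counts; NOT the real NYC figures.
rows = [
    {"Year": 2007, "Sex": "F", "Deaths": 12},
    {"Year": 2007, "Sex": "M", "Deaths": 10},
    {"Year": 2008, "Sex": "F", "Deaths": 14},
    {"Year": 2008, "Sex": "M", "Deaths": 10},
    {"Year": 2009, "Sex": "F", "Deaths": 9},
    {"Year": 2009, "Sex": "M", "Deaths": 11},
]


def yearly_series(rows, sex=None):
    """Deaths per year in year order; sex=None sums both sexes."""
    totals = defaultdict(int)
    for row in rows:
        if sex is None or row["Sex"] == sex:
            totals[row["Year"]] += row["Deaths"]
    return [totals[year] for year in sorted(totals)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


overall = yearly_series(rows)
female = yearly_series(rows, sex="F")
male = yearly_series(rows, sex="M")
female_vs_overall = pearson(female, overall)
male_vs_overall = pearson(male, overall)
```

Because each sex is a component of the total, some correlation with the overall series is expected. The comparison is most informative when one sex tracks the total much more closely than the other, as in this toy data.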

race trend

https://gist.github.com/tommyhuang1/4062f71a434c054acf8b6dd1e2e72834

The death trends by race/ethnicity show that White Non-Hispanic has the most deaths, with a downward-sloping trend. This suggests that the health of White Non-Hispanic residents has improved over time. Asian and Pacific Islander has the fewest deaths, with an upward-sloping trend, which suggests their health is gradually getting worse. The Shiny app lets users look at the trend for an individual race and compare trends across races.
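The direction of each trend can be summarized as the slope of a least-squares line through the yearly totals for each race. Below is a small Python sketch of that idea. The group labels, column names, and counts are placeholders, and the post itself reads the trends off the charts rather than fitting a line.

```python
from collections import defaultdict

# Toy rows with placeholder counts; NOT the real NYC figures.
rows = [
    {"Year": 2007, "Race Ethnicity": "Group A", "Deaths": 30},
    {"Year": 2008, "Race Ethnicity": "Group A", "Deaths": 28},
    {"Year": 2009, "Race Ethnicity": "Group A", "Deaths": 26},
    {"Year": 2007, "Race Ethnicity": "Group B", "Deaths": 5},
    {"Year": 2008, "Race Ethnicity": "Group B", "Deaths": 6},
    {"Year": 2009, "Race Ethnicity": "Group B", "Deaths": 7},
]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys against xs."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


def trend_slope_by_race(rows):
    """Deaths-per-year slope for each race/ethnicity group."""
    by_race = defaultdict(lambda: defaultdict(int))
    for row in rows:
        by_race[row["Race Ethnicity"]][row["Year"]] += row["Deaths"]
    slopes = {}
    for race, totals in by_race.items():
        years = sorted(totals)
        slopes[race] = least_squares_slope(years, [totals[y] for y in years])
    return slopes


slopes = trend_slope_by_race(rows)
```

A negative slope corresponds to a downward-sloping trend and a positive slope to an upward-sloping one. These are raw counts, so a slope alone does not account for changes in population size.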

Top 5 Leading Causes

https://gist.github.com/tommyhuang1/9d50372c34837a2cc471b17063ef0fb9

Heart disease is the number one leading cause, but there may be different types of heart disease. Also, "All Other Causes" ranks third. What are all the other causes? Perhaps the health indicators can tell the story behind it. Coronary heart disease mortality is high for both Black Non-Hispanic and White Non-Hispanic, and Black mortality is slightly higher than White for heart disease/stroke. Whites are at high risk of unintentional injury, and the elderly are at higher risk of falls. Blacks have a high risk of asthma/chronic lower respiratory hospitalizations. Whites have high suicide mortality, and Blacks have high drug-related hospitalizations. Blacks have a higher risk of diabetes. Both Blacks and Whites are at high risk of cancer. Blacks have higher birth-related mortality.
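A ranking like the top five leading causes comes from summing deaths per cause and sorting. Here is a minimal sketch in Python with placeholder causes and counts. The column names are assumed, and the author's actual ranking code is in the linked gist.

```python
from collections import Counter

# Toy rows with placeholder causes and counts; NOT the real NYC figures.
rows = [
    {"Year": 2007, "Leading Cause": "Cause A", "Deaths": 40},
    {"Year": 2008, "Leading Cause": "Cause A", "Deaths": 45},
    {"Year": 2007, "Leading Cause": "Cause B", "Deaths": 30},
    {"Year": 2007, "Leading Cause": "Cause C", "Deaths": 20},
    {"Year": 2008, "Leading Cause": "Cause C", "Deaths": 5},
]


def top_causes(rows, n=5):
    """Return the n causes with the most deaths as (cause, total) pairs."""
    totals = Counter()
    for row in rows:
        totals[row["Leading Cause"]] += row["Deaths"]
    return totals.most_common(n)


top_two = top_causes(rows, n=2)
```

Filtering the rows by sex or race/ethnicity before calling `top_causes` would give the ranking for a single group.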

Heart Disease

https://gist.github.com/tommyhuang1/15a16f8dea0ea32352b7b23a3f4db32b

Heart disease deaths are higher for Blacks and Whites than for other races. These two groups account for the majority of heart disease deaths.

Suicide

https://gist.github.com/tommyhuang1/dcfe0d3e9f98c721d5d92aa8c37bbe54

 

Whites have the highest suicide mortality. Surprisingly, Asian and Pacific Islander ranks second.

Diabetes

https://gist.github.com/tommyhuang1/9fc90aaaf849618bd56ac2628d09aa49

 

Blacks have higher diabetes mortality.

Cancer

https://gist.github.com/tommyhuang1/3551f48c340ac251caf605934b67d335

Blacks have higher cancer mortality. In particular, female breast cancer mortality is higher for both Blacks and Whites.

For more details on the code, please see the R Shiny Source Code.

Conclusion

Through exploratory data analysis, we conclude that the death trend for females has a significant impact on the overall trend. Female health improved significantly in 2009. Females have more deaths than males, and breast cancer is one of the major leading causes of death for females. Blacks and Whites have more health issues and hospitalizations than other races.

 

About Author

Yaxiong Huang

Tommy Huang received his Master of Arts in Statistics at Hunter College and Bachelor of Science in Mathematics and Economics at College of Staten Island. He has 7 years of experience in catastrophic modeling research for the insurance...