Data Study on Cancer Mortality in the United States

Posted on Jul 21, 2016


The mere utterance of the word "cancer" sends tremors through your entire being and changes your very existence in every way. To some, the spectre of a death sentence seems inevitable. To others, it is a call to rise up for the biggest battle of their lives.

Sisyphus was supposed to be a cunning king in Greek mythology. But he was too smart for his own good and fate eventually caught up with him. Mysteries surround the reasons why he was banished. Perhaps he consorted with the wrong people or angered the gods with his crimes or indiscretions.

Sisyphus was punished and sent to the depths of hell to pay for his unforgivable deeds. The gods condemned him to an eternity of labor: pushing a huge boulder up a hill, only to have it roll back down so that he had to do it all over again. Forever. Despite all this, Sisyphus continued to roll the boulder up the steep hill and never ceased in his efforts. All who watched viewed his stoic attempts as absurd but nevertheless heroic.

And so it seems with us, fighting this terrible disease called cancer. We have all as individuals and as a nation faced the tragedies that this deadly killer has unleashed. The battle and agony it evokes push us to the limits of our human tolerance. But despite all that, as Sisyphus has done, as absurd sometimes as the effort seems to be, we continue to toil on to find a way to treat, and even perhaps cure it.

Why Do We Need to Care?

Cancer is a very personal matter. We are all affected by it in some shape or form, as individuals, families, friends, and acquaintances. No matter what the statistics say, even if the probability of getting cancer is 0.2%, once you're hit with it, however small that probability looks, you and your family are part of that number. And when that happens, you don't really care about the numbers anymore. You care about the fight.

The devastating effects of cancer on the lives of people are huge. The financial burden, the effort and energy to find the best treatments, doctors and clinics, the psychological hurdles, the brutal fight, and ultimately at the darkest moment, dealing with death itself. It crushes even the best of us. But despite all this, we need to be relentless in our pursuit of the cause and cure no matter how long or how much effort it takes.

It is even more critical that we arm ourselves with the right information to deal with cancer. Knowledge, experience, and insights gained will be our guide. The statistics are staggering once we look at the toll cancer takes in the US. It is important for us to know and understand how bad the situation is and how it affects you, your loved ones, your family, your community, and us as a nation.

But the effort to conquer cancer is not without its massive hurdles.

Data on The Difficulty of the Challenge

The cancer fight needs to be a no-holds-barred proposition, because when cancer hits you, it hits hard and with unrelenting force. It spares no one: not the people who have it, not those around them, and not the nation that bears the expense.

The difficulty is due to the effort it takes to find what the cause(s) are and the search for the cure. For the President to have asked his VP to find the cure for cancer was a challenge to the status quo, to go beyond research work, which is still needed, and quickly strive for the solution.

This is a monumental task, but we are aiming high and not giving up. It is so difficult that President Obama named the initiative a moonshot.

The name serves two purposes. First, it encourages us as a nation to know that curing cancer is not only possible but now an imperative. Going to the moon once seemed only a dream, yet we overcame every obstacle and eventually sent a man there, against all odds. This is the inspiration that we need to keep us going, to pursue the cure for cancer no matter what it takes and make it a reality. Second, it connotes the immensity and severity of the challenge we will face in solving the cancer mystery.

We are inspired to find the cure for cancer. And it has to be achieved in this lifetime.

Drivers on Cancer

Going for the cure first is relatively less complex than figuring out the multitude of causes. The causes depend on an innumerable set of variables, and combing through them to understand the correlations and their levels of significance would be elusive at best, even for the best researchers and statisticians.

The Latin phrase "Post hoc ergo propter hoc" means "After this, therefore because of this." The phrase reminds us of the fallacy of thinking that just because some event or action occurred earlier, it must be the cause of what happened later. This is simply not true.

The rigors of the analytical process demand that we work with massive amounts of test and research data gleaned from the work of researchers, medical communities, and government agencies to arrive at statistically significant correlations and results. It is important that we employ effective analytical approaches and a level of discipline to find that one deadly needle in a deep and tortuous haystack.

Rocket Fuel to Drive the Moonshot

To help in the fight against cancer, we need to use our arsenal of tools and talent to answer the critical questions. The research and medical communities need all the assistance they can get from data experts at large to gain more headway. There are four approaches, each answering critical questions along the journey, that we need to be aware of:

  1. Descriptive Analytics
    • What happened? What is going on? How does cancer impact us?
  2. Diagnostic Analytics
    • Why did it happen? What caused it? What went wrong?
  3. Predictive Analytics
    • What can we do to predict and forecast future events and trends?
  4. Prescriptive Analytics
    • What should we do to cure it? What are the recommendations?

So we have to start from the beginning. We need to know how cancer became the ruthless killer that it is known to be. We have to know what happened and how this scourge of the earth impacted us as a nation and, to this day, continues to wreak havoc on our families and health support systems. Spending on cancer treatments in the US is around $125B per year and is projected to rise to $158B by 2020.

Data on Cancer Mortality Count in the US


Let's start by looking at some stark statistics on total cancer mortality in the U.S., accumulated over 15 years from 1970-1994. This map shows a total mortality count of 9.5MM Americans who died from cancer. CA, NY, PA and the other darker colored states on the map stand out as having the most deaths over the 15 years when this data was collected.

There is a wide gap between the states with the highest and lowest counts: California at 929.5K and Hawaii at 6.5K. This suggests that raw death counts are higher in states known to have larger populations and denser urban areas. To what degree these factors drive mortality still needs further study.

Data on Cancer Mortality Rates in the US


One would expect the absolute mortality count from the previous chart to correlate in some way with the mortality rate. However, this map shows that is not the case. The numbers on this map show the number of deaths per state, per 100,000 Americans, from 1970-1994. MA, NJ, and DE stand out as the states with the highest mortality rates.

This implies that people living in these states have a higher probability of dying from some type of cancer than people in the rest of the country. Since these can be considered more dangerous states to live in, it would be interesting follow-up research to find what drives their higher mortality rates relative to every other state in the US. The average mortality rate for the US comes in at around 165.7 deaths per 100,000 Americans.
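The gap between the count map and the rate map comes down to normalizing by population. A minimal sketch of that normalization follows, assuming a crude rate (deaths divided by population, scaled to 100,000). The post does not say whether its rates are age-adjusted or annualized, and the two states and their figures below are hypothetical placeholders, not values from the study.

```python
def mortality_rate_per_100k(deaths: float, population: float) -> float:
    """Crude mortality rate: deaths per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * 100_000


# Hypothetical placeholder figures, NOT taken from the study's data set.
states = {
    "Large State": {"deaths": 50_000, "population": 30_000_000},
    "Small State": {"deaths": 2_000, "population": 1_000_000},
}

# Rate for each state, keyed by state name.
rates = {name: mortality_rate_per_100k(**row) for name, row in states.items()}

# The state with the most deaths is not necessarily the one with the highest rate.
by_count = max(states, key=lambda name: states[name]["deaths"])
by_rate = max(rates, key=rates.get)

print(f"Highest count: {by_count}; highest rate: {by_rate}")
```

With these made-up inputs the larger state leads on count while the smaller state leads on rate, which is the same kind of reversal the two maps show.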

Data on Cancer Mortality Count by Cancer Type

Now that we understand the entire death toll cancer takes on our country, let's break down which types of cancer rank as the most insidious in their quest to take lives. Lung cancer blew past all the other types by a wide margin, with 140% more deaths than the second most ominous one, colon cancer. If you see more ads and effort being spent on lung cancer, now you know why. The gray vertical line at the midpoint marks the median, with a 95% confidence level. The cancer types above ovarian cancer on this plot show the sheer number of American deaths each type has caused.

Cancer Mortality Count by Type vs Gender


This is a further drill-down of the previous bar chart, showing female and male mortality counts for the cancer types that affect each gender most. The constant battle that women wage against breast cancer is evident in the chart above. While lung cancer is not the number 1 killer for women, it is for men, and by a huge margin over prostate cancer. Colon cancer places third for both sexes.

States Having the Highest Mortality Rate Based on Cancer Type


It is worthwhile to know which states have the highest mortality for each cancer type. This could lend some much-needed clues about which cancer types each state may want to focus its energies on combating. Note that these figures are based on mortality per 100,000 Americans measured across 15 years of state data.

Some noteworthy findings: Alaska does not appear in any of the top 3 spots for mortality. NY has the highest mortality rate for rectal cancer. NJ has the highest mortality rate for uterine cancer among women, followed by a triple threat of bladder, colon, and ovarian cancer, and it also places third for both breast and rectal cancer. NJ has not been too kind to women. Hawaii leads in mortality rate for oral cancer, its only appearance on this chart. Rhode Island holds the top spot in several cancer types, namely brain, breast, colon, and stomach cancer.

Further research on Rhode Island might be a worthwhile exercise to drill down into what is driving high mortality rates across so many cancer types.
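A ranking like the one in this chart can be produced by grouping state rates by cancer type and keeping the top three in each group. The sketch below is one way to do that in plain Python; it is not the author's method. The function name `top_states_by_type`, the record layout, and every value in `records` are hypothetical and for illustration only.

```python
from collections import defaultdict


def top_states_by_type(records, n=3):
    """Return the n states with the highest rate for each cancer type.

    records: iterable of (cancer_type, state, rate_per_100k) tuples.
    """
    grouped = defaultdict(list)
    for cancer_type, state, rate in records:
        grouped[cancer_type].append((state, rate))

    top = {}
    for cancer_type, rows in grouped.items():
        # Sort each group by rate, highest first, and keep the first n states.
        ranked = sorted(rows, key=lambda row: row[1], reverse=True)
        top[cancer_type] = [state for state, _ in ranked[:n]]
    return top


# Hypothetical placeholder rates, NOT values from the study.
records = [
    ("type_x", "State A", 4.0),
    ("type_x", "State B", 9.0),
    ("type_x", "State C", 6.5),
    ("type_x", "State D", 1.0),
    ("type_y", "State A", 3.0),
    ("type_y", "State D", 7.0),
]

top3 = top_states_by_type(records)
print(top3)
```

A state that shows up near the top for many types, as Rhode Island does in the chart, would appear in several of the returned lists.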



9.5MM Americans died from cancer within the span of 15 years. That's about 1.2 people a minute. In the time it takes us to finish a cup of coffee, 12 people will have already died from cancer. And again in the next 10 minutes. Back to back. Pause and think about that for a moment. That is a truly sobering statistic, realized in real life, and it will continue to haunt us until we stop it.
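The per-minute figure can be checked with simple arithmetic. The sketch below takes the 9.5MM total and the 15-year span as stated in the text, and assumes a 365.25-day year and a 10-minute cup of coffee. The span is a parameter, so a different period can be checked the same way.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumes an average 365.25-day year


def deaths_per_minute(total_deaths: float, years: float) -> float:
    """Average number of deaths per minute over the given span."""
    return total_deaths / (years * MINUTES_PER_YEAR)


per_minute = deaths_per_minute(9_500_000, 15)  # figures as stated in the text
per_coffee = per_minute * 10                   # assumed 10-minute coffee

print(f"{per_minute:.1f} deaths per minute, about {per_coffee:.0f} per coffee")
```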

We now know how cancer impacts us as a nation, how many people have died from it, and how dangerous this foe is to us. And in the spirit of Sisyphus, who through eternity rolls the huge boulder up the hill, action is much needed.

Sisyphus never gave up, despite the agony and absurdity of fighting a seemingly unwinnable opponent, and that will be the same mission we all need to embark on as a nation. Some people contribute money to the fight, and many offer their efforts in various ways. The medical communities and researchers are at the forefront of this battle, and we can use the power of numbers to get closer to the cure.

That is what matters to us and that is what gives meaning to the fight, and to life.

More Questions Arise

  • How has cancer mortality in the US and its various states been trending these past few years? Is it getting better or worse?
  • What states are getting better or getting worse?
  • Have the most dangerous types of cancers changed over these past few years?
  • What are the most promising cures that are surfacing? Can we glean any hope from the immunotherapy results?
  • Do we have enough or available data to analyze some of the cancer drivers for the causes?
  • What predictions can we make on future mortality rates and counts in the US?
  • And based on predictions, what else can we recommend?

by Bernard Ong, Data Argonaut
Cell: 201-916-5241






