Data Sets: Confronting The Outbreaks in a New Era

Posted on Nov 3, 2019
The skills I demoed here can be learned through taking Data Science with Machine Learning bootcamp with NYC Data Science Academy.

Mankind's fight against contagious disease has been a constant throughout history, and the tools we've developed for fighting contagion have evolved considerably. Unfortunately, the contagions that we fight are also evolving rapidly, and while we've won particular battles, the war goes on. In the 21st century, we have the tools of modern medical science: prophylactic measures, vaccines, antibiotics, antivirals, and supportive care.

On the other hand, the contagions we face have unprecedented population densities to work with, disease vectors that can circumnavigate the globe in a day, and increasing levels of resistance to human countermeasures. All of these conditions make the 21st century a new era for disease control, with increased urgency and intensity to the fight. And, just as 21st century warfare has been revolutionized by the battlefield pictures yielded by satellites and aircraft, it's time for a strategic picture of the fight against contagious disease.

Seasonal pathogens provide an interesting example of how to implement new measures because they strike regularly and allow us to observe the entire cycle of an outbreak. The frequency of repetition also allows us to see the efficacy of new measures quickly, and removes much of the guesswork.

Effects of Vaccination

For example, a recent study of the 2017-2018 flu season found that vaccination was 38% effective, which prevented 109,000 hospitalizations and 8,000 deaths. Part of what made influenza vaccination this successful was knowing to expect infections to begin in November, and starting to vaccinate just ahead of that.

Perhaps knowing to deploy an influenza vaccine in October doesn't seem so impressive by itself, but imagine what the results of the same deployment schedule would be for a contagion that followed the same pattern as polio.

But what about the subtler aspects of contagious outbreaks? Perhaps we know which demographics are most vulnerable, but do we know which are most likely to act as vectors? Maybe asking people to use hand sanitizer at a certain time of day (or at certain venues) could cripple the ability of some viruses to spread?

Or maybe there are certain places and times where STIs are most virulent, and people are missing key information on how to protect themselves? There are many possible insights that could be gained from developing a more complete picture of the behaviors of contagious diseases, and these are only the beginning. 

Starting

So how do we get started? I began by populating an interactive map of the United States with weekly infection reports for eight contagions over the past century. The map lets you play back the progress of a particular outbreak as it spreads from state to state, and as it waxes and wanes in severity.

I also added automatically generated plots of infections by month and year, and a heatmap for viewing the entire historical trend.
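To give a feel for the aggregation behind the monthly plots and the heatmap, here is a minimal sketch in plain Python. It is not the app's actual code: the record layout (week start date, state, case count), the function names, and the sample rows are all assumptions made up for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical weekly reports: (week start date, state, reported cases).
# These rows are made-up placeholders, not values from the real data set.
SAMPLE_REPORTS = [
    ("1950-01-02", "NY", 12),
    ("1950-01-09", "NY", 20),
    ("1950-02-06", "NY", 7),
    ("1951-01-01", "NY", 5),
    ("1950-01-02", "CA", 3),
]


def monthly_totals(reports, state):
    """Sum weekly case counts into a {(year, month): cases} grid for one state."""
    grid = defaultdict(int)
    for week_start, report_state, cases in reports:
        if report_state != state:
            continue
        day = date.fromisoformat(week_start)
        grid[(day.year, day.month)] += cases
    return dict(grid)


def heatmap_rows(grid):
    """Arrange the grid as one row per year with twelve monthly columns."""
    years = sorted({year for year, _ in grid})
    return {year: [grid.get((year, m), 0) for m in range(1, 13)] for year in years}


ny_grid = monthly_totals(SAMPLE_REPORTS, "NY")
ny_rows = heatmap_rows(ny_grid)
print(ny_rows[1950][:2])  # January and February totals for 1950
```

Each row of `ny_rows` is one year and each column one month, which is the shape a heatmap needs. Months with no reports are filled with zero here; a real app would probably want to distinguish "no cases" from "no data".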

This is very much a work in progress, but the current app gives a clear idea of the level of clarity we can achieve with the right data. County-by-county data exists for many demographic details throughout much of the United States. When coupled with contagion data of a similar resolution, it will be possible to observe the origins of an outbreak as well as the places where it is able to spread the fastest. By extracting correlations with demographic and other data, we may then be able to establish actionable insights that change the nature of the fight.
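As a rough illustration of what extracting a correlation could look like, the sketch below joins two county-level tables and computes a Pearson correlation coefficient. Everything specific here is an assumption: the county keys, the choice of population density as the demographic variable, and the numbers themselves are invented placeholders, and a real analysis would need far more care about confounding factors.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical county tables keyed by a county identifier (placeholder values).
density_per_sq_mile = {"county_a": 120.0, "county_b": 950.0, "county_c": 3100.0}
cases_per_100k = {"county_a": 14.0, "county_b": 37.0, "county_d": 80.0}

# Keep only counties present in both tables, in a stable order.
shared = sorted(density_per_sq_mile.keys() & cases_per_100k.keys())
xs = [density_per_sq_mile[c] for c in shared]
ys = [cases_per_100k[c] for c in shared]

r = pearson(xs, ys)
print(shared, round(r, 3))
```

With only two shared counties the coefficient is trivially 1 or -1, which is a good reminder that a correlation like this only becomes meaningful with many counties and real data behind it.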

It's easy to overlook the simple things we can do that could have a big impact. But when you consider that it's possible to save about 8,000 lives in a single year with a vaccine that reaches less than 50 percent of the population and that is effective only 38 percent of the time, it's easier to understand how even small things can save many lives.

 

See the app: Contagion Spread

 

 

About Author

Aaron Festinger

Aaron Festinger is a data scientist with a background in math, physics, and military special operations. He's interested in space travel, languages, and the mathematics of learning. In his free time he enjoys kayaking, brewing mead, and science...