Data Analysis on Covid 19: Flattening the Curve?

Posted on May 2, 2020
The skills I demonstrate here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.


Inspiration and Goals

Covid-19 has rocked the world as hard as any event in modern history, and I have been doing my best to keep track of the ever-evolving situation in our country. During this time, I noticed many news networks focusing on the total number of national cases. As the numbers continued to rise, they started to lose meaning, particularly for people in states with lower case counts. In mid-March I left New York City and went to Phoenix, Arizona to stay with my family. I was shocked at the difference in mentality between the two places: in NYC, it seemed as though the world was ending, while in Phoenix, it seemed to be business as usual.

Even though the situation in Arizona has deteriorated rapidly since my arrival, noticing these differences in behavior got me asking questions. Was the situation really that different from place to place? Does the national case count accurately describe what is happening across America? If not, how can I accurately tell what is going on locally?

 

With these questions in mind, I decided to create a tool that people could use to track Covid-19 at the state level. The goal of this R Shiny app was threefold:

  1. Personalization: I wanted the app to be focused on state and local areas, so that users can visualize the coronavirus situation where they are.
  2. Contextualization: The user can compare multiple states side by side. This point of comparison helps the user better understand the scope of the crisis in different regions.
  3. Simplicity: The app is smooth, functional, and easy to use so that the user can quickly get the information they need.

 

The Data

For this project I used two datasets maintained and updated daily by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). The first contained state-level data on a variety of variables, including total case count, mortality rate, and testing rate. I used this dataset to construct the interactive US geochart on the first page using googleVis. By hovering over a state, users can view up-to-date information on the number of confirmed cases.




The second dataset contained time series data for every county in the United States going back to January. For this project, I filtered the dates to start on March 1, because before then many counties were not reporting data. I then reshaped this data in different ways to construct the remaining visuals within the State Level Analysis tab.
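
As a rough illustration of this kind of preparation, here is a minimal sketch in Python (not the app's actual R code). It assumes the time series holds cumulative case counts per date, and the function name `daily_new_cases` and the sample numbers are hypothetical. It converts cumulative totals into daily new cases and keeps only dates from March 1 onward.

```python
from datetime import date


def daily_new_cases(cumulative, start=date(2020, 3, 1)):
    """Convert {date: cumulative_count} into {date: new_cases} from `start` onward.

    Assumption: the input holds running totals, one entry per reporting date.
    """
    new_cases = {}
    previous = 0
    for day in sorted(cumulative):
        # Clamp at zero so a downward correction in the source data
        # does not show up as a negative number of new cases.
        diff = max(cumulative[day] - previous, 0)
        previous = cumulative[day]
        if day >= start:
            new_cases[day] = diff
    return new_cases


# Hypothetical cumulative counts for a single county.
example = {
    date(2020, 2, 28): 0,
    date(2020, 2, 29): 1,
    date(2020, 3, 1): 1,
    date(2020, 3, 2): 4,
    date(2020, 3, 3): 9,
}
result = daily_new_cases(example)
```

The difference is taken over the full history before filtering, so the first kept date still gets a correct daily value instead of its whole running total.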

 

The Features

In addition to the National Overview on the first page, the bulk of the app's functionality lives in the State Level Analysis tab. In the first subtab, the user can select a state to display its daily new cases. On the graph, the underlying bar chart shows the exact number of new cases reported each day, while the line shows the 5-day moving average. I added the moving average because daily case reporting varies widely from day to day, and the average gives a smoother picture of the growth trajectory. The user can also add multiple states to the graph for comparison and adjust the date slider to see how the trajectory has changed over time.
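
To make the smoothing concrete, here is a minimal sketch of a 5-day moving average in Python (not the app's R code). It assumes a trailing window, meaning each point averages that day and the four days before it; the post does not say whether the app uses a trailing or centered window. The function name and sample numbers are hypothetical.

```python
def moving_average(values, window=5):
    """Trailing moving average; positions without a full window are None."""
    averages = []
    for i in range(len(values)):
        if i + 1 < window:
            # Not enough history yet to fill the window.
            averages.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            averages.append(sum(chunk) / window)
    return averages


# Hypothetical daily new case counts with a noisy reporting pattern.
daily = [10, 30, 20, 60, 30, 50, 90]
smoothed = moving_average(daily)
```

The raw counts jump between 20 and 60 from one day to the next, while the averaged series rises steadily, which is the effect the line on the chart is meant to show.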




In the second tab, the user can select a state and then view county-level data for that state through a density map, which gives a more granular view of the areas most impacted by coronavirus. One of my favorite features of the density map is the date slider, which lets you visualize how the county-level situation evolved over time. From a usability standpoint, this can show users whether their county is at risk or is doing a good job of stopping the spread. It can also provide a warning if, for example, neighboring counties start to show heightened case counts.




Total Case Count

The two remaining tabs show total case count on a linear scale and on a logarithmic scale. By using these two tabs together, the user can see whether or not the state is “flattening the curve”. As in the Growth tab, users can select multiple states to compare and adjust the date slider to the time period they are interested in. At the time of writing, a few states have started loosening their social distancing policies and reopening businesses. Moving forward, I think it will be interesting to use these graphs to monitor whether there is a “re-steepening” of the curve in these states.
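
One way to read the logarithmic view: steady exponential growth appears as a straight line on a log scale, so a curve that bends toward horizontal means the doubling time is getting longer. The sketch below, in Python rather than the app's R code, estimates a doubling time from two total case counts. The doubling-time calculation is my own illustration and not a feature of the app, and the function name and numbers are hypothetical.

```python
import math


def doubling_time(count_start, count_end, days):
    """Days for the total to double, assuming steady exponential growth.

    Returns infinity when the total did not grow over the window.
    """
    if count_start <= 0 or count_end <= count_start:
        return math.inf
    return days * math.log(2) / math.log(count_end / count_start)


# Hypothetical totals: early in the outbreak vs. a few weeks later.
early = doubling_time(100, 400, 6)     # two doublings in six days
late = doubling_time(4000, 8000, 14)   # one doubling in fourteen days
flattening = late > early
```

In this made-up example the doubling time grows from about three days to about two weeks, which on the logarithmic chart would look like a line that starts steep and then levels off.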

 

Going Forward

While the app is functional, there are more features I would like to add to make it even more useful. First, I would like to add a “Recent News” tab that lets users enter their location and then shows them recent Covid-19 news for their area. This would further help get the most targeted information to users anywhere in the country. Second, I think adding more comparisons within the app, showing coronavirus relative to something people already understand, would help change behavior by making the data easier to grasp. For example, I could integrate car accident or seasonal flu data and compare the Covid-19 mortality rate with these in different areas. I think this would open people's eyes to how dangerous the spread of the virus actually is and what they can expect within their local communities.

 
