Analyzing Data to Track COVID-19
The skills the author demonstrated here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
A new coronavirus, SARS-CoV-2, emerged in Wuhan, China in late 2019. With more cases appearing every day, there was major concern that the outbreak would grow into a deadly pandemic. I developed an R Shiny app in January 2020 to track COVID-19 cases and compare patient outcomes to previous viral outbreaks. The app updates daily with the most recent numbers from the CSSE group at Johns Hopkins University.
What's in a name?
Coronavirus. 2019-nCoV. SARS-CoV-2. COVID-19. Many names and acronyms are being used to talk about this virus, which can be very confusing. This is partly because it took some time for scientists to agree on what to call this brand new virus. The other reason is that some of these names refer to things other than the virus itself. Here’s a little primer on these biological terms.
Coronavirus is a group or family of similar viruses. The family gets its name from the shape of its members, which resembles a crown, or “corona” in Latin. Several different viruses are in the coronavirus family, some of which are relatively harmless.
SARS-CoV-2 is the actual virus that is currently infecting people across the globe. This virus is a newly discovered member of the coronavirus family. Currently, scientists think SARS-CoV-2 spreads through liquid droplets passed between people in close contact. Washing your hands for 20 seconds and practicing social distancing are the best current methods to prevent the spread of this virus.
COVID-19, or coronavirus disease 2019, is the respiratory disease that a person gets after being infected by SARS-CoV-2. If you get infected with the virus, you may not necessarily get COVID-19. Many infected people do not show symptoms but can still spread the virus. This is why social distancing is so important! You could infect others even if you feel perfectly fine!
Where is COVID-19 now?
The virus first infected people in the Hubei province in China in late 2019. Cases were largely restricted to China until late February 2020. Despite having months to prepare for the spread of SARS-CoV-2, countries around the world failed to contain the virus, and the outbreak became a pandemic in March 2020. Infections often follow a sigmoid curve, where a phase of exponential growth is followed by a tapering off of new infections. Currently, we are still in the exponential phase. If the number of daily cases starts to decrease, we will know that we are starting to contain the virus.
Current hotspots for COVID-19 include Spain, Italy, and the United States, which each have over 100,000 cases. Germany also has a high number of cases, but its mortality rate is quite low, around 1.5%. Other countries, like South Korea, have had remarkable success in containing the virus.
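To make the sigmoid idea concrete, here is a minimal Python sketch of a logistic curve. It is only an illustration: the final outbreak size, growth rate, and midpoint day below are hypothetical values chosen for readability, not numbers fitted to the Johns Hopkins data, and the app itself is written in R rather than Python.

```python
import math

def logistic(t, final_size, growth_rate, midpoint):
    """Cumulative cases on day t under a simple logistic (sigmoid) model."""
    return final_size / (1.0 + math.exp(-growth_rate * (t - midpoint)))

# Hypothetical parameters, for illustration only (not fitted to real data).
FINAL_SIZE = 1_000_000   # total cases once the curve flattens
GROWTH_RATE = 0.2        # per-day growth rate in the early phase
MIDPOINT = 50            # day on which growth is fastest

days = range(0, 101)
cumulative = [logistic(t, FINAL_SIZE, GROWTH_RATE, MIDPOINT) for t in days]

# Daily new cases are the day-to-day differences of the cumulative curve.
daily_new = [cumulative[i] - cumulative[i - 1] for i in range(1, len(cumulative))]

# Early on, daily cases grow by a nearly constant factor (exponential growth).
early_growth_factor = daily_new[5] / daily_new[4]

# Daily cases peak around the midpoint and decline afterwards.
peak_day = max(range(len(daily_new)), key=daily_new.__getitem__) + 1

print(f"Early day-over-day growth factor: {early_growth_factor:.2f}")
print(f"Daily new cases peak around day {peak_day}")
```

In this toy model, daily new cases rise by a nearly constant factor each day at first, peak near the midpoint, and then fall. That fall in daily cases is the signal described above.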
Data on COVID Numbers
As of April 5th, 2020, 1.2 million people have been infected with SARS-CoV-2 and nearly 70,000 people have died. The apparent mortality rate is about 5%, but this number is likely an overestimate. Because of a lack of testing resources, the true number of infected people is very likely to be much higher. Professional epidemiologists estimate the likely mortality rate to be around 1%.
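The arithmetic behind these figures can be sketched in a few lines of Python. The inputs are the rounded numbers quoted above. The function name is just an illustrative choice, and the "implied infections" step rests on a strong simplifying assumption: that the 1% estimate is correct and that deaths are fully counted, ignoring the lag between infection and death.

```python
def case_fatality_rate(deaths, confirmed_cases):
    """Apparent mortality rate: deaths divided by confirmed cases."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return deaths / confirmed_cases

# Rounded figures quoted in the article for April 5th, 2020.
confirmed = 1_200_000
deaths = 70_000

apparent_rate = case_fatality_rate(deaths, confirmed)

# Assumption: if the true mortality rate were about 1%, the same death count
# would imply many more infections than were confirmed by testing.
assumed_true_rate = 0.01
implied_infections = deaths / assumed_true_rate
undercount_factor = implied_infections / confirmed

print(f"Apparent mortality rate: {apparent_rate:.1%}")
print(f"Implied infections at a 1% rate: {implied_infections:,.0f}")
print(f"Implied undercount factor: {undercount_factor:.1f}x")
```

Under those assumptions, the gap between the apparent rate and the 1% estimate corresponds to several undetected infections for every confirmed case.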
Data Comparison with Other Viruses
COVID-19 has both infected and killed more people than the 2003 SARS epidemic. However, the mortality rate for SARS was around 10%, at least twice the apparent rate for COVID-19 and likely 10 times the rate suggested by more rigorous estimates. These viruses are both in the coronavirus family, but you can see they impact people very differently.
How about this year's flu? This year's flu has infected about 31 million people, or roughly 26 times the number of people with COVID-19. However, COVID-19 has killed more people than the flu and has a much higher mortality rate. The flu's mortality rate is only 0.1%, while estimates for COVID-19 range from 1% to 5%. Both of these diseases are serious health threats. Neither should be discounted in terms of the cost of human life.
The 1918 Influenza Pandemic infected an estimated 500 million people and killed about 50 million, or 10%. The COVID-19 numbers are lower than those of this previous pandemic, but we should be cautious, as the numbers are rising rapidly every day. The WHO has officially declared COVID-19 a pandemic, and we must do what we can to "flatten the curve" and lower the number of cases.
Lastly, how does COVID-19 compare to Ebola, one of the scariest viruses known to man? COVID-19 has definitely infected and killed more people than the Ebola outbreak of 2014, but the mortality rate is perhaps the most important comparison here. Ebola had a mortality rate of 62%, meaning that about 3 out of every 5 people infected with Ebola died. COVID-19 is far less fatal than Ebola, which is somewhat reassuring.
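As a rough check on the flu comparison, the following Python sketch derives the implied number of flu deaths from the figures quoted in this article. The COVID-19 counts are the rounded April 5th, 2020 numbers given earlier, and the variable names are illustrative only.

```python
# Figures quoted in the article (rounded).
flu_cases = 31_000_000
flu_mortality_rate = 0.001        # 0.1%
covid_cases = 1_200_000
covid_deaths = 70_000

# Deaths implied by the flu's case count and mortality rate.
flu_deaths = flu_cases * flu_mortality_rate

# How many times more flu cases than COVID-19 cases, and
# how many times more COVID-19 deaths than flu deaths.
case_ratio = flu_cases / covid_cases
death_ratio = covid_deaths / flu_deaths

print(f"Implied flu deaths: {flu_deaths:,.0f}")
print(f"Flu cases per COVID-19 case: {case_ratio:.1f}")
print(f"COVID-19 deaths per flu death: {death_ratio:.1f}")
```

With these inputs, the flu has far more cases but fewer deaths. That is what a much lower mortality rate looks like in absolute numbers.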
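To put the mortality rates from this section side by side, the short Python sketch below ranks them and expresses each as deaths per 1,000 infected people. The rates are the ones quoted in this article and are not independently sourced. The labels and the helper function are illustrative choices.

```python
# Mortality rates as quoted in this article (fraction of infected people who died).
mortality_rates = {
    "Seasonal flu": 0.001,
    "COVID-19 (epidemiologists' estimate)": 0.01,
    "COVID-19 (apparent)": 0.05,
    "SARS 2003": 0.10,
    "1918 influenza": 0.10,
    "Ebola 2014": 0.62,
}

def deaths_per_thousand(rate):
    """Convert a mortality rate (0 to 1) into deaths per 1,000 infected people."""
    return round(rate * 1000)

# Sort from least to most fatal.
ranked = sorted(mortality_rates.items(), key=lambda item: item[1])

for name, rate in ranked:
    print(f"{name:<40} {deaths_per_thousand(rate):>4} deaths per 1,000 infected")
```

Seen this way, the 62% figure for Ebola corresponds to 620 deaths per 1,000 infections, which is the "3 out of every 5" comparison in the text.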