Data Visualization of Cardiovascular Clinical Trials
The skills I demonstrate here can be learned by taking the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
Clinical research is medical research involving human subjects that tests the safety and efficacy of a drug, therapy, or treatment intended to alleviate or cure a disease or injury. It is an essential part of our society and integral to developing scientific and medical breakthroughs. Clinical research studies have helped healthcare progress and life expectancy increase, especially in combination with modern technology.
Many clinical trials focus on cardiovascular disease, the leading cause of death in the United States. To make the data on cardiovascular clinical trials easier to view, I developed an R Shiny application. Its purpose is to visualize and provide useful information about these clinical trials, and it may be useful for:
- Anyone who has this disease, or knows someone who does, and may be interested in participating in these clinical trials.
- Anyone who has a family history of this disease and wants more information on the prevention methods available.
- Medical researchers interested in the specific drug, therapy, or treatment utilized in the research study.
The dataset was acquired from clinicaltrials.gov, a clinical trials registry provided by the U.S. National Library of Medicine. It was filtered for cardiovascular-related trials for the purpose of developing the Shiny application.
Data Features & Insights
The Shiny application has five main tabs in the left sidebar of the main page. The first tab is the introduction, which gives the background and purpose of the application. It also shows short videos with more information about clinical trials and cardiovascular disease.
Locations of Trials
The second tab displays a map of the locations of cardiovascular-related clinical trials in the United States. The map includes a clustering feature that groups nearby trial locations into clusters. As the user zooms in, each cluster splits into smaller clusters that are more specific to a location; zooming out merges them again. This is a neat feature for anyone interested in a study who wants to check whether there are studies nearby. The map shows that New York, California, and Texas are the regions with the most cardiovascular-related clinical trials.
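The clustering behavior described above can be sketched with the `leaflet` package for R. This is a minimal sketch, not the app's actual code: it assumes `leaflet` is the mapping library, and the `sites` data frame with its column names and coordinates is made up for illustration.

```r
library(leaflet)

# Hypothetical trial sites; names, columns, and coordinates are
# illustrative assumptions, not rows from the real dataset.
sites <- data.frame(
  facility = c("Site A", "Site B", "Site C"),
  lat = c(40.7128, 34.0522, 29.7604),
  lng = c(-74.0060, -118.2437, -95.3698)
)

# markerClusterOptions() groups nearby markers into clusters that
# split apart as the user zooms in and merge as they zoom out.
trial_map <- leaflet(sites) %>%
  addTiles() %>%
  addMarkers(
    lng = ~lng, lat = ~lat, popup = ~facility,
    clusterOptions = markerClusterOptions()
  )
```

Printing `trial_map` in an interactive session renders the map; inside a Shiny app, the same object could be returned from `renderLeaflet()`.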
Different Studies of the Trials
The third tab, Information, displays bar charts for four categories of these studies (Sponsor Type, Intervention Type, Patient Status, and Clinical Phase). Some insights gathered from looking at the cardiovascular-related clinical trials data are:
- Most trials are sponsored either by Industry, such as pharmaceutical companies, or by Other, which covers academic institutions and non-profit organizations.
- Drugs are the most common intervention, followed by medical devices.
- Currently, 1,611 cardiovascular-related trials are recruiting patients.
- Most studies are currently in phase 2, which focuses on the effectiveness of a drug or medical device. The goal of this phase is to gather preliminary data on patients' progress and on how well the intervention works. It is also intended to gather safety data, such as short-term side effects.
The fourth tab, Exploration, contains interactive box plots of the enrollment number and duration of cardiovascular-related clinical trials. The plots let you see the median, interquartile range, and minimum and maximum values. The median patient enrollment for interventional studies is 535, while observational studies have a median of 516, which indicates that the two types of studies are similar in terms of patient enrollment. The median duration for interventional studies is 9 years, while observational studies tend to run longer, with a median duration of 11 years.
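The statistics that a box plot exposes can also be computed directly. The sketch below uses base R on made-up enrollment numbers; the `enrollment` data frame, its column names, and the `box_stats` helper are hypothetical and are not taken from the app or the real dataset.

```r
# Made-up enrollment counts for illustration only.
enrollment <- data.frame(
  study_type = c(rep("Interventional", 5), rep("Observational", 5)),
  enrolled = c(20, 45, 60, 150, 400, 30, 50, 80, 200, 1000)
)

# Hypothetical helper: the summary values shown by a box plot.
box_stats <- function(x) {
  c(
    min = min(x),
    q1 = unname(quantile(x, 0.25)),
    median = median(x),
    q3 = unname(quantile(x, 0.75)),
    max = max(x),
    iqr = IQR(x)
  )
}

# One column of statistics per study type.
stats_by_type <- sapply(
  split(enrollment$enrolled, enrollment$study_type),
  box_stats
)
```

Calling `boxplot(enrolled ~ study_type, data = enrollment)` would draw the corresponding static plot. Note that base R draws whiskers at up to 1.5 times the interquartile range by default, so extreme values may appear as separate points.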
The last tab, Data, presents a table of summary data by sponsor. The table includes the organization, total number of studies, average/minimum/maximum enrollment, and length of study. When the table is sorted in descending order, we can see that hospitals, government organizations, and academic institutions have the highest numbers of cardiovascular-related clinical trials, which is quite interesting to see. Among companies, the medical company Abbott has the highest number of cardiovascular-related clinical trials.
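A per-sponsor summary table of this kind can be built with a split-and-combine step. The following is a minimal base R sketch under stated assumptions: the `trials` data frame, its column names, and the sponsor names are invented for illustration and do not reflect the real data or the app's implementation.

```r
# Made-up trial records for illustration only.
trials <- data.frame(
  sponsor = c("Org A", "Org A", "Org A", "Org B", "Org B", "Org C"),
  enrolled = c(100, 250, 40, 500, 300, 75),
  duration_years = c(2, 5, 1, 8, 4, 3)
)

# Split the records by sponsor, summarize each group, and stack the rows.
by_sponsor <- split(trials, trials$sponsor)
summary_table <- do.call(rbind, lapply(by_sponsor, function(d) {
  data.frame(
    sponsor = d$sponsor[1],
    n_studies = nrow(d),
    mean_enrolled = mean(d$enrolled),
    min_enrolled = min(d$enrolled),
    max_enrolled = max(d$enrolled),
    mean_duration = mean(d$duration_years)
  )
}))

# Sort so that the sponsors with the most studies come first.
summary_table <- summary_table[order(-summary_table$n_studies), ]
rownames(summary_table) <- NULL
```

In a Shiny app, a table like `summary_table` could be passed to an interactive table widget so that users can sort by any column themselves.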
It is important for patients to have access to information about available clinical trials and the opportunity to enroll, which can ultimately improve their quality of life. This application is a useful tool for anyone who may be interested in participating in cardiovascular-related clinical trials. It gives a general overview of the kinds of trials currently available by organization and is a good place to get started.
For the purpose of creating this application, the project was limited to visualizing cardiovascular-related clinical trials data. However, given additional time and research, I would explore further by pursuing these avenues:
- Applying statistical inference and hypothesis testing to variables such as age and race demographics to uncover further insights
- Expanding the scope of the application to other leading causes of death, such as cancer, and exploring how the insights differ
- Conducting additional research and validation of the accuracy of source data
- Collecting global data points to further validate the trends and insights found in the United States data