Visualizing Clinical Trial Operations in Breast Cancer and Prostate Cancer

Posted on Feb 5, 2018


Clinical trials are experiments or observational studies conducted as part of clinical research. Participants take part in medical, observational, or behavioral interventions. Trials test new treatments, such as novel vaccines, drugs, dietary supplements, and medical devices, to find better ways to treat disease and to understand how it progresses. The primary purpose of a clinical trial is to learn more about the risks and effectiveness of an experimental treatment in humans. Different kinds of people participate: some are healthy and some have illnesses. Both groups play important roles.

People might ask: "Are clinical trials safe?" As with any medical care, therapy, or activity of daily living, clinical trials can carry risks. If you want to join a clinical trial, the staff will describe the risks the trial may pose to participants. You can then weigh those risks and decide whether or not to participate.

Business Motivations

Developing a new drug, from the original idea to the launch of a finished product, often takes 12–15 years. Drug discovery begins in the laboratory. When researchers create a new therapy or drug, it is first tested in the laboratory and on animals. If the initial lab research is successful, the researchers send the data to the Food and Drug Administration (FDA) to get approval to continue testing in humans. Depending on the product type and development stage, investigators first enroll volunteers in small studies and then conduct progressively larger ones. Because breast cancer has the highest rate among cancers in women and prostate cancer the highest among cancers in men, I chose to analyze clinical trials for these two cancers.


I got the data from The dataset contains:

  • The number of trials by recruitment status for breast cancer and prostate cancer
  • Locations, with latitude and longitude
  • The total number of trials in breast cancer and prostate cancer
  • Phases
  • Durations of the clinical trials
  • Sponsors and collaborators (Sponsor.Collaborators)
  • Study types
  • Rate of recruitment

Shiny App Analysis

The map below shows the concentration of clinical trials around the world. According to the legend, the darker the color, the more trials are recruiting or already completed. The map shows that the United States, Europe, and China have the most breast and prostate cancer trials compared with other countries.

The two maps below show the breast and prostate cancer clinical trials that are currently recruiting around the world. When you zoom in, you can see the trials at specific cities and locations.


I created histograms that show the number of trials by recruitment status (active but not recruiting, recruiting, completed, etc.), phase, study type, and the top 10 sponsors and collaborators. The histograms make the differences in counts between the two cancers easy to see.
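As a rough illustration of the counts behind histograms like these, here is a minimal Python sketch. It is not the code from the Shiny app: the records, the field names (`condition`, `phase`, `status`), and the helper `count_by` are all hypothetical, and a missing value is counted under the label "NA".

```python
from collections import Counter

# Hypothetical trial records; the field names and values are invented
# for illustration and do not come from the original dataset.
trials = [
    {"condition": "Breast Cancer", "phase": "Phase 2", "status": "Recruiting"},
    {"condition": "Breast Cancer", "phase": "Phase 2", "status": "Completed"},
    {"condition": "Breast Cancer", "phase": None, "status": "Completed"},
    {"condition": "Prostate Cancer", "phase": "Phase 3", "status": "Recruiting"},
    {"condition": "Prostate Cancer", "phase": "Phase 4", "status": "Completed"},
]


def count_by(records, condition, field):
    """Count the trials for one condition, grouped by the given field.

    Missing values (None) are grouped under the label "NA".
    """
    return Counter(
        record[field] if record[field] is not None else "NA"
        for record in records
        if record["condition"] == condition
    )


breast_phases = count_by(trials, "Breast Cancer", "phase")
prostate_status = count_by(trials, "Prostate Cancer", "status")

# Each Counter maps a category to its bar height in a histogram.
for category, count in breast_phases.items():
    print(category, count)
```

Each resulting `Counter` holds one bar height per category, so the two cancers can be compared by calling the helper once per condition.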


From the two histograms below, we can see that the three biggest cancer centers sponsor the most trials in both breast and prostate cancer. The pharmaceutical company Roche ranks 4th in breast cancer, and AstraZeneca ranks 6th in prostate cancer.

There are four phases in clinical trials:

Phase I assesses the safety of a drug or device.

Phase II tests the efficacy of a drug or device.

Phase III involves randomized and blinded testing in hundreds to thousands of patients.

Phase IV, often called Post Marketing Surveillance, covers trials conducted after a drug or device has been approved for consumer sale.

You can see a large number of NA values in the two histograms above. That is because some companies or research centers didn't submit the integrated report to the

Rate of Recruitment

The boxplot above shows how fast different sponsors and collaborators recruit people for trials in the two cancers. ROR stands for rate of recruitment.

ROR = total number of volunteers / total number of months

From the boxplot above, we can see that most pharmaceutical companies recruit faster than the academic institutions. In breast cancer, Bayer ranks 1st in ROR. In prostate cancer, Ferring ranks 1st.

The boxplot above shows how fast people are recruited in different phases. Breast cancer trials recruit fastest in Phase 2&3, and prostate cancer trials recruit fastest in Phase 4.
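The ROR formula in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the Shiny app: the function names, the choice to measure duration in whole calendar months, the one-month minimum, and the example trial are all hypothetical.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end.

    Assumption: durations shorter than one month are counted as one
    month so that the ROR division never divides by zero.
    """
    months = (end.year - start.year) * 12 + (end.month - start.month)
    return max(months, 1)


def rate_of_recruitment(enrolled: int, start: date, end: date) -> float:
    """ROR = total number of volunteers / total number of months."""
    return enrolled / months_between(start, end)


# Hypothetical trial: 240 volunteers enrolled from January 2015 to
# January 2017, which is 24 months.
ror = rate_of_recruitment(240, date(2015, 1, 1), date(2017, 1, 1))
print(ror)  # 10.0 volunteers per month
```

Computing ROR per trial and then grouping the values by sponsor or by phase would give the kind of distributions a boxplot summarizes.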


Clinical trials have become one of the hottest topics in the healthcare industry. Pharmaceutical, biotech, and medical device companies, and even some governmental organizations, spend millions on clinical trials every year. Analyzing big data on trial recruitment across different types of studies can help the industry reduce costs and improve recruitment efficiency. My Shiny app is available here. Source code is available on GitHub.


About Author

Xiao Jia

Xiao received a MS degree in Biomedical Informatics from Nova Southeastern University in Florida. She was working as a data analyst at a healthcare IT company in Fort Lauderdale, where she developed her passion and got to know...