Novel Drug Data Trends Approved by FDA Since 2015

Posted on Aug 4, 2021
The skills demonstrated here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.



The FDA's Center for Drug Evaluation and Research (CDER) ensures that safe and effective drugs are available to the people of the USA. Pharmaceutical companies spend millions of dollars on the discovery of a drug molecule and go through a rigorous approval process before they can sell the drug in the market. The application process for a new drug is divided into four phases.

Approval for sale is given only after the drug's effectiveness and safety have been assessed in volunteers at each of phases I-III, the labelling details have been reviewed, and CDER has inspected the manufacturer's site. Phase IV checks for side effects once the drug is being used by the general public. If the drug has an adverse effect on the general population, it is recalled from the market.


This web scraping project used Scrapy to collect all drugs approved from 2015 through early 2021 (March 10, 2021). Three more columns were added manually: the medical area, the more specific human body system the drug targets, and the gender for which the drug was designed.
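The manual annotation step could look something like the sketch below. This is not the author's code: the field names (`drug_name`, `medical_area`, `body_system`, `gender`), the placeholder drug names and the annotation values are all assumptions made for illustration, and the scraped rows are assumed to be available as plain dictionaries.

```python
# Hypothetical scraped rows; real rows would come from the Scrapy output.
scraped_rows = [
    {"drug_name": "ExampleDrugA", "company": "Example Pharma", "approval_year": 2018},
    {"drug_name": "ExampleDrugB", "company": "Sample Labs", "approval_year": 2020},
]

# Hand-curated annotations keyed by drug name (placeholder values).
manual_annotations = {
    "ExampleDrugA": {"medical_area": "Cancer", "body_system": "Blood", "gender": "All"},
}

# Used when a drug has not been annotated yet.
DEFAULTS = {"medical_area": "Unknown", "body_system": "Unknown", "gender": "Unknown"}


def annotate(rows, annotations):
    """Return new rows with the three manual columns added."""
    annotated = []
    for row in rows:
        extra = annotations.get(row["drug_name"], DEFAULTS)
        # Build a new dict so the scraped rows stay untouched.
        annotated.append({**row, **extra})
    return annotated


annotated_rows = annotate(scraped_rows, manual_annotations)
for row in annotated_rows:
    print(row["drug_name"], row["medical_area"], row["body_system"], row["gender"])
```

Keeping the hand-entered values in a separate lookup, rather than editing the scraped rows directly, makes it easy to re-run the scraper without losing the manual work.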

The objectives of the study were:

  • To see the trends in drugs approved by the FDA
  • To find which medical areas are most represented
  • To find the dosage forms of the drugs
  • To find the companies with the most drug approvals

Data Analysis

The 285 drugs approved from 2015 through March 10, 2021, along with the details submitted in their applications, were scraped from the FDA website. To see the trend in novel drugs, the number of drugs approved by the FDA per year since 2015 was studied. 2018 was the most successful year with 59 approvals, while 2016 had the fewest (23). As of March 10, 2021, 12 drugs had been approved in 2021.
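A per-year tally like this one can be produced with a simple counter. The sketch below uses a short, made-up list of approval years rather than the scraped data, and the variable names are illustrative assumptions.

```python
from collections import Counter

# Hypothetical approval years, one entry per approved drug; not the scraped data.
approval_years = [2015, 2015, 2016, 2018, 2018, 2018, 2018, 2019, 2019, 2019]

approvals_per_year = Counter(approval_years)

# Year with the most approvals and year with the fewest.
busiest_year, busiest_count = approvals_per_year.most_common(1)[0]
quietest_year = min(approvals_per_year, key=approvals_per_year.get)

for year in sorted(approvals_per_year):
    print(year, approvals_per_year[year])
print("most approvals:", busiest_year, "fewest approvals:", quietest_year)
```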

The next objective was to find the medical areas these drugs represent. The top three medical areas by number of approvals were: 1) cancer (88 drugs, or 31% of the total approved drugs), 2) cardiovascular (26 drugs, or 9.2%), and 3) neurology (23 drugs, or 8.2%).

To see the trend, the year-wise breakdown of approved drugs by medical area was studied. Cancer drugs were approved in every year studied. 2015 was a good year for mental health and cardiovascular drugs. 2016, which had the fewest approvals, mostly saw drugs in the cancer, muscular and hepatic areas. 2018, 2019 and 2020 showed the widest range of approved drugs. In 2020, drugs for two deadly outbreaks were approved: Ebola and COVID-19.

Data on Dosage Form

Drugs come in different dosage forms, which determine how the drug is administered and absorbed in the body. Common dosage forms include tablet, capsule, injectable, lotion, emulsion and cream. Most of the approved drugs are tablets, as this form has the best patient compliance. Injectables are the second most common dosage form. Solutions are very common for imaging, especially among cancer patients.

Review Priority for approved drugs

Companies seeking approval for a novel drug specify the category under which they seek approval, such as the orphan drug category for rare diseases or for cases where other treatments for the disease are not working. The other categories are priority and standard. Most of the drugs approved in the last five years fall under the orphan category.

Cancer Drugs Data

Cancer drugs form the largest group of approved drugs. According to the CDC, cancer is the second leading cause of death in the US after heart disease. A total of 285 drugs from 194 different companies were approved in the last five years. The most approvals were for blood cancer, followed by lung and thyroid cancer. Lung cancer is one of the most common cancers in the US. The exact location and stage of the tumor play an important role in cancer treatment. Six drugs for imaging to locate cancer were approved.

The dosage form for cancer drugs was mostly tablet, followed almost equally by capsule and injectable.

A closer look at dosage form by type of cancer drug gave more detailed insights. Lung, nephrology and blood cancer drugs covered all the dosage forms. Bone and gastro-pancreatic cancer drugs were in solution form only. Drugs for cancer imaging were also in solution form. Prostate cancer drugs were all in tablet form. Skin cancer drugs were mostly tablets, with only one each in injectable and capsule form.
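A breakdown of this kind amounts to a cross-tabulation of cancer type against dosage form. The following is a minimal sketch of the idea using only the standard library; the handful of records is invented for illustration and does not reproduce the scraped data.

```python
from collections import defaultdict

# Hypothetical (cancer_type, dosage_form) pairs; not the scraped data.
cancer_drugs = [
    ("Lung", "Tablet"),
    ("Lung", "Capsule"),
    ("Lung", "Injectable"),
    ("Bone", "Solution"),
    ("Prostate", "Tablet"),
    ("Prostate", "Tablet"),
]

# Nested counts: cancer type -> dosage form -> number of drugs.
crosstab = defaultdict(lambda: defaultdict(int))
for cancer_type, dosage_form in cancer_drugs:
    crosstab[cancer_type][dosage_form] += 1

# Cancer types whose drugs all share a single dosage form.
single_form = {
    cancer_type: next(iter(forms))
    for cancer_type, forms in crosstab.items()
    if len(forms) == 1
}
print(single_form)
```

With a real dataset, the same table could also be built with a dataframe library, but the nested dictionary keeps the logic visible.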

Cancer is a very complex disease and is in many cases controlled by more than one gene. Often a treatment stops working for some patients, or patients have a particular mutation or condition that makes standard cancer treatment ineffective. Hence, compared with drugs for other diseases, many of the approved cancer drugs fell under the orphan category.

The gender column for cancer drugs showed that most of them are designed for adults. Only 6 drugs were designed for children only. Although cancer is less common in children, it is estimated that in 2021, 10,500 children will be diagnosed with cancer and around 1,190 will die of the disease.

Top pharmaceutical companies

Drug manufacturers have to put huge investment and a great deal of research and trials into making a successful drug. Studying companies' research interests in drug discovery can give other pharmaceutical companies a competitive edge. Since 2015, 285 unique drugs from 193 companies have been approved. 43 of the 193 companies got approval for more than 1 drug, and 11 companies got approval for 3 drugs. Six companies got approval for five or more drugs in the last five years:

1. Novartis (10)
2. Genentech Inc. (7)
3. Eli Lilly and Co (6)
4. AbbVie Inc. (5)
5. Gilead Sciences Inc. (5)
6. Merck Sharp & Dohme (5)
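Figures such as the number of distinct companies, the number with more than one approval, and the ranking by approvals can all be derived from one tally of company names. The sketch below shows one way to do that; the company names are placeholders, not the scraped data.

```python
from collections import Counter

# Hypothetical company name for each approved drug; not the scraped data.
companies = ["Alpha", "Alpha", "Alpha", "Beta", "Beta", "Gamma", "Delta"]

approvals_per_company = Counter(companies)

# Number of distinct companies, and those with more than one approval.
n_companies = len(approvals_per_company)
multi_approval = [name for name, n in approvals_per_company.items() if n > 1]

# Top two companies ranked by number of approvals (highest first).
ranking = approvals_per_company.most_common(2)
print(n_companies, len(multi_approval), ranking)
```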

It was interesting to see the different research interests of the top companies in drug discovery over the past 5 years. Cancer is the preferred domain for Novartis and Genentech Inc. Gilead is more focused on HIV, Hepatitis C and COVID-19, and was the only company with an approved COVID-19 drug as of March 2021. Cancer, immunology and neurology drugs were dominant at Eli Lilly and Co.

Novartis is the top company, with 10 drug approvals in the past 5 years. This is a huge achievement for a single company and reflects its wide research resources and competitive team. 2019 was its most successful year, with 5 drugs approved.


The study aimed to see the trend in drugs approved in the last 5 years and to identify the research interests of the top companies in order to understand the market. To summarize the findings:

  • 285 novel drugs have been approved by the FDA since 2015
  • Cancer treatments were the largest group of approved drugs, with a significant number for blood cancer (the highest market)
  • Tablet is the most common dosage form used by companies (best patient compliance)
  • Most cancer drugs were approved under the orphan category, which covers drugs for rare diseases
  • Novartis had the greatest number of drug approvals in the last five years, and its drugs represent a wide range of medical areas

Interesting Facts

  • Smallpox has been eradicated worldwide, but Tecovirimat, sold under the brand name Tpoxx by SIGA Technologies, is manufactured in the millions for use in case of bioweapon warfare
  • Renal or nephrology drugs were also approved in significant numbers. However, the top 5 companies have no drug approvals for nephrology.
  • Drugs for infants or children were mostly for rare genetic diseases such as Progeria and Duchenne muscular dystrophy
  • Gilead was the first company to have an approved drug for COVID treatment

About Author

Tania Ghosh

Tania Ghosh holds a Ph.D. in Genetics and Forensic Science and has 5+ years of research experience in molecular biology and genetics. Currently shifting focus to the world of data science. Tania Ghosh could be reached at...