Data Science in Drug Discovery Biological Characteristics

Layal Hammad

Posted on Aug 21, 2022

The skills the author demonstrated here can be learned by taking the Data Science with Machine Learning bootcamp at NYC Data Science Academy.

Introduction

Drug Development Process Data

Thanks to a better understanding of the biological characteristics of various diseases and to technological advances in drug discovery, identifying biological targets and drug candidates is becoming less challenging. The drug development process nevertheless remains highly time-consuming: on average, it takes 12-15 years for a new medication to be approved for use by the FDA.

In the early stage of drug discovery, thousands of chemical compounds are tested against multiple biological targets through automated High-Throughput Screening. Hits, compounds that show activity against a certain target, are then studied further. Checking basic benchmarks of drug-likeness, such as Lipinski's Rule of Five, is essential for assessing a hit's potential. Lipinski's Rule of Five is a rule of thumb that indicates how a drug's properties, in terms of size, lipophilicity, and intermolecular attraction, affect its absorption, distribution, metabolism, and excretion in the human body.
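To make the rule of thumb concrete, here is a minimal sketch of a Rule of Five check. It uses the commonly cited thresholds (molecular weight at most 500, logP at most 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors). The `Descriptors` class, the function name, and the example values are assumptions made for illustration and are not taken from the project; in practice the descriptors would be computed from the structures, for example with RDKit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    """Physicochemical descriptors used by the Rule of Five (hypothetical container)."""
    mol_weight: float   # molecular weight in daltons (size)
    logp: float         # octanol-water partition coefficient (lipophilicity)
    h_donors: int       # number of hydrogen bond donors
    h_acceptors: int    # number of hydrogen bond acceptors


def lipinski_violations(d: Descriptors) -> List[str]:
    """Return the names of the Rule of Five criteria that the compound breaks."""
    checks = {
        "mol_weight <= 500": d.mol_weight <= 500,
        "logp <= 5": d.logp <= 5,
        "h_donors <= 5": d.h_donors <= 5,
        "h_acceptors <= 10": d.h_acceptors <= 10,
    }
    return [name for name, passed in checks.items() if not passed]


# Approximate, illustrative values for a small benzodiazepine-like compound.
small_compound = Descriptors(mol_weight=284.7, logp=3.0, h_donors=0, h_acceptors=2)
# Hypothetical large, lipophilic compound that breaks several criteria.
bulky_compound = Descriptors(mol_weight=712.3, logp=6.4, h_donors=3, h_acceptors=11)

print(lipinski_violations(small_compound))
print(lipinski_violations(bulky_compound))
```

A compound is often still treated as drug-like when it breaks no more than one criterion, so returning the list of violations rather than a single yes/no leaves that decision to the caller.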

In this project, I explored the bioactivity libraries of the benzodiazepine family in order to find compounds with similar biological activity. I also studied the parameters that contribute significantly to compounds sharing similar bioactivity.

Methodology

Data

All bioactivity profiles for compounds related to Diazepam and Alprazolam were downloaded from the PubChem library using the Selenium package in Python. Out of 2,800 compounds, only 389 had biological test results, and only 68 had been studied in more than 50 biological tests.
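The filtering step described above can be sketched as follows. This is only an illustration under assumed names and toy data: the record layout `(compound_id, assay_id, outcome)`, the function names, and the threshold used in the example are hypothetical, and the same logic could equally be written as a pandas group-by.

```python
from collections import defaultdict

# Hypothetical bioactivity records: (compound_id, assay_id, outcome).
records = [
    ("C1", "A1", "Active"),
    ("C1", "A2", "Inactive"),
    ("C1", "A3", "Inactive"),
    ("C2", "A1", "Inactive"),
    ("C2", "A2", "Active"),
    ("C3", "A1", "Inactive"),
]


def assays_per_compound(rows):
    """Count the distinct assays each compound was tested in."""
    tested = defaultdict(set)
    for compound_id, assay_id, _outcome in rows:
        tested[compound_id].add(assay_id)
    return {compound_id: len(assays) for compound_id, assays in tested.items()}


def well_tested(rows, min_assays):
    """Return compounds tested in more than `min_assays` distinct assays."""
    counts = assays_per_compound(rows)
    return sorted(cid for cid, n in counts.items() if n > min_assays)


# The post uses a cutoff of 50 tests; the toy data uses 2 to keep it small.
print(well_tested(records, min_assays=2))
```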

Using the pandas and RDKit packages, the data was then analyzed using two different approaches:

Compound Based Approach: Selecting compounds that were tested on similar bioassays only
Target Based Approach: Selecting a bioassay with the maximum number of hits
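The two approaches can be sketched on toy data as follows. This is a simplified illustration, not the project's actual code: the `profiles` layout (compound mapped to its assay outcomes) and both function names are assumptions, and the compound-based view is reduced here to the intersection of the assay sets.

```python
from collections import Counter

# Hypothetical data: compound_id -> {assay_id: outcome}.
profiles = {
    "C1": {"A1": "Active", "A2": "Inactive", "A3": "Inactive"},
    "C2": {"A1": "Active", "A2": "Active", "A3": "Inactive"},
    "C3": {"A1": "Inactive", "A2": "Inactive", "A4": "Active"},
}


def shared_assays(compound_profiles):
    """Compound-based view: assays in which every compound was tested."""
    assay_sets = [set(outcomes) for outcomes in compound_profiles.values()]
    return set.intersection(*assay_sets) if assay_sets else set()


def assay_with_most_hits(compound_profiles):
    """Target-based view: (assay_id, hit_count) for the assay with the most actives."""
    hits = Counter(
        assay_id
        for outcomes in compound_profiles.values()
        for assay_id, outcome in outcomes.items()
        if outcome == "Active"
    )
    return hits.most_common(1)[0] if hits else None


print(sorted(shared_assays(profiles)))
print(assay_with_most_hits(profiles))
```

Restricting the comparison to shared assays keeps the compound-based view fair, since an untested compound cannot be counted as inactive.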

Results

Compound Based Approach

Only 49 compounds were found to have been tested on the maximum number of shared bioassays (114 shared bioassays). Fifteen compounds showed activity in 13 bioassays, and only two of them were active in the same bioassay.

Target Based Approach

The bioassay “qHTS for Inhibitors of human tyrosyl-DNA phosphodiesterase 1 (TDP1): qHTS in cells in absence of CPT” (AID: 686978) showed the highest number of hits: 12 out of 50 compounds were active in this bioassay.

To study the drug-likeness of all the compounds that were tested in bioassay AID: 686978, the Lipinski Rule of Five parameters were assessed. Molecular weight and the number of hydrogen bond donors did not play a significant role in determining activity.
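The post does not state which statistical method was used to judge significance. As one possible way to compare a descriptor between active and inactive compounds, the sketch below uses an exact two-sided permutation test on the difference in group means. The function name and all descriptor values are hypothetical and chosen only to illustrate the idea; they are not the project's data.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(active, inactive):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(active) + list(inactive)
    n_active = len(active)
    n_rest = len(pooled) - n_active
    observed = abs(mean(active) - mean(inactive))
    total = sum(pooled)
    extreme = 0
    splits = 0
    # Enumerate every way of relabelling n_active compounds as "active".
    for idx in combinations(range(len(pooled)), n_active):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_active - (total - group_sum) / n_rest)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical descriptor values for active vs. inactive compounds.
logp_active, logp_inactive = [3.9, 4.1, 4.4, 4.6], [2.1, 2.4, 2.6, 2.9]
mw_active, mw_inactive = [284.7, 300.1, 310.2, 295.0], [290.3, 305.5, 288.0, 299.9]

p_logp = permutation_p_value(logp_active, logp_inactive)
p_mw = permutation_p_value(mw_active, mw_inactive)
print(f"logP p-value: {p_logp:.3f}")
print(f"MW p-value:   {p_mw:.3f}")
```

Enumerating every split is only feasible for small groups; with dozens of compounds per group, a random sample of relabellings or a standard two-sample test would be the practical choice.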

Conclusion

Although the data did not show similarities in biological activity, the results showed similar biological behavior, which makes benzodiazepine a high-quality core structure.

LogP and the number of hydrogen bond acceptors in the compounds played a significant role in determining hits against the target.

About Author

Layal Hammad

Having a 6-year experience at a medical distribution company, Dimensions Healthcare Company, gave me the opportunity to be exposed to all value chain elements in the medical industry and opened my eyes to various obstacles and challenges that...

