Drawing the Borders of Olfactory Space

Wendy Yu

Posted on Mar 15, 2016

Contributed by Wendy Yu. She is currently in the NYC Data Science Academy 12-week full-time Data Science Bootcamp program, which runs from January 11th to April 1st, 2016. This post is based on her previous project, R visualization.

Drawing the Borders of Olfactory Space

Chung Wen Yu, Katharine Prokop-Prigge, Lindsay Warrenburg, and Joel Mainland

37th Annual Conference of the Association for Chemoreception Sciences

A common refrain in the olfactory literature is that humans can detect 10,000 different odorants; however, both the source and the quality of this estimate are unclear. Here we set out to answer this question quantitatively. We developed machine-learning models that distinguish odorous from odorless compounds based on their physicochemical properties. The algorithms used include logistic regression, random forest, support vector machine, and gradient boosting. In cross-validation, our best-performing model had 94% accuracy and an AUC of 0.96. To further test this model, we asked 15 participants to distinguish test molecules from blank jars using five alternative forced-choice tests for each compound. In this external validation, our model distinguished between odorous and odorless molecules with 72% accuracy and an AUC of 0.82.
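For readers unfamiliar with the two metrics reported above, here is a minimal sketch of how accuracy and AUC can be computed from a classifier's predicted scores. The labels and scores are made-up toy values, not data from the study. The function names `accuracy` and `roc_auc` and the 0.5 decision threshold are assumptions chosen for illustration. The study's own models and evaluation code are not shown in this post.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of compounds whose thresholded score matches the true label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen odorous compound
    scores higher than a randomly chosen odorless one (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = odorous, 0 = odorless.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]

toy_accuracy = accuracy(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"accuracy = {toy_accuracy:.2f}, AUC = {toy_auc:.2f}")
```

Accuracy depends on the chosen threshold, while AUC summarizes how well the scores rank odorous above odorless compounds across all thresholds, which is why the two are often reported together.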

Next, we applied the model to the Chemical Universe Database, a collection of 166 billion molecules that are both chemically stable and synthetically feasible, with up to 17 atoms of carbon, hydrogen, nitrogen, oxygen, sulfur, or halogens. Since existing catalogs of odorous molecules rarely contain compounds with more than 21 heavy atoms, we then extrapolated the result to 21 heavy atoms. We estimate that there are approximately 2.7 trillion molecules with 21 or fewer heavy atoms, and we predict that over 27 billion of them will have an odor. Our findings define the borders of olfactory space and enable rational sampling of all volatile compounds. Such a set can be used to build desirable odor screening panels that will facilitate research in the field of olfaction.
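To put these figures in proportion, the following short sketch works through the arithmetic using only the numbers quoted above. It treats "over 27 billion" as a lower bound. The variable names are illustrative, and the comparison with the 10,000 figure is a back-of-the-envelope calculation rather than a result reported in the abstract.

```python
from fractions import Fraction

# Figures quoted in the abstract (rounded as stated there).
total_molecules = 2_700_000_000_000  # ~2.7 trillion with 21 or fewer heavy atoms
predicted_odorous = 27_000_000_000   # "over 27 billion", used as a lower bound
classic_estimate = 10_000            # the commonly repeated odorant count

# Share of the candidate molecules predicted to have an odor.
odorous_fraction = Fraction(predicted_odorous, total_molecules)

# How many times larger the prediction is than the classic estimate.
ratio_to_classic = predicted_odorous // classic_estimate

print(f"predicted odorous share: at least {float(odorous_fraction):.1%}")
print(f"multiple of the 10,000 figure: at least {ratio_to_classic:,}")
```

Under these rounded figures, roughly one molecule in a hundred is predicted to be odorous, and the predicted count is millions of times larger than the commonly repeated estimate.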

About Author

Wendy Yu

As a biologist, Wendy believes in evidence-based analysis and is passionate about data. Wendy graduated from the University of Pennsylvania in 2013 with a Master's in Biotechnology. While pursuing a career as a biologist, Wendy quickly realized that...
