Data Study on Most Common Drugs on WebMD
The skills the author demonstrated here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
Introduction
WebMD is a website that people use to read health information, including details on drugs they want to learn about. One of its features is a list of the drugs people search for most often. Each drug in the list links to user reviews, which record the condition the drug was taken for, ratings of the drug, and details about the reviewer such as age, sex, and how long they used the drug.
Extracting Data from WebMD
I used Scrapy to scrape all the reviews for the most common drugs listed on WebMD. I then created a Shiny app to visualize the scraped data.
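As a rough illustration of what one scraped review might look like once cleaned, here is a minimal sketch in plain Python. The field names (drug, condition, ease_of_use, and so on) and the helper functions are assumptions made for this example, based only on the review attributes described in the introduction. They are not taken from the actual spider in the repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Review:
    """One cleaned review. All field names are hypothetical."""
    drug: str
    condition: str
    ease_of_use: Optional[int]
    effectiveness: Optional[int]
    satisfaction: Optional[int]
    age: str
    sex: str
    duration: str


def to_rating(raw: str) -> Optional[int]:
    """Convert a scraped rating string to an int, or None if missing or malformed."""
    raw = raw.strip()
    return int(raw) if raw.isdigit() else None


def clean_review(raw: dict) -> Review:
    """Build a Review from a dict of raw strings, as a scraper might yield them."""
    def text(key: str) -> str:
        return raw.get(key, "").strip()

    return Review(
        drug=text("drug"),
        condition=text("condition"),
        ease_of_use=to_rating(raw.get("ease_of_use", "")),
        effectiveness=to_rating(raw.get("effectiveness", "")),
        satisfaction=to_rating(raw.get("satisfaction", "")),
        age=text("age"),
        sex=text("sex"),
        duration=text("duration"),
    )
```

Keeping missing ratings as None rather than 0 means they can be skipped later when counting or averaging, instead of silently dragging the averages down.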
Interactive Shiny web data application
The Shiny app has four tabs: an introduction, a preview of the scraped data, a tab called "By Drug," and a tab called "By Condition."
On the "By Drug" tab, one can select a drug from the dropdown menu. The app then displays a bar graph of the counts of ease of use, effectiveness, and satisfaction ratings for that drug. Below it, a second graph shows how long reviewers used the drug. The tab also reports the drug's average ease of use, effectiveness, and satisfaction ratings.
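The per-drug summary behind such a tab can be sketched in a few lines. This is not the app's code (the app is built with Shiny), only a Python illustration of the aggregation, assuming each review is a dict with hypothetical keys "drug", "ease_of_use", "effectiveness", and "satisfaction".

```python
from collections import Counter
from statistics import mean

# Hypothetical column names for the three rating types.
RATING_FIELDS = ("ease_of_use", "effectiveness", "satisfaction")


def summarize_drug(reviews, drug):
    """Return rating counts and the average rating per field for one drug."""
    selected = [r for r in reviews if r["drug"] == drug]
    summary = {}
    for field in RATING_FIELDS:
        # Skip reviews where this rating is missing.
        values = [r[field] for r in selected if r.get(field) is not None]
        summary[field] = {
            "counts": dict(Counter(values)),  # data for the bar graph
            "average": mean(values) if values else None,
        }
    return summary
```

The "counts" entry holds the bar heights for one rating type, and "average" is the figure shown alongside the graphs.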
The next tab is called "By Condition." On this tab, one can select a medical condition from the dropdown menu. The "Select Drug" dropdown menu then updates to list only the drugs that treat the selected condition. After a drug is selected, the app shows a bar graph of the counts of ease of use, effectiveness, and satisfaction ratings.
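The dependent dropdown amounts to a lookup from condition to drugs. Below is a minimal Python sketch of that lookup, assuming each review is a dict with hypothetical "condition" and "drug" keys. In this sketch a drug counts as treating a condition if at least one review lists that condition for it, which may differ from how the app decides.

```python
from collections import defaultdict


def drugs_by_condition(reviews):
    """Map each condition to the sorted list of drugs reviewed for it."""
    index = defaultdict(set)
    for review in reviews:
        index[review["condition"]].add(review["drug"])
    return {condition: sorted(drugs) for condition, drugs in index.items()}


def drug_choices(reviews, condition):
    """Options for the second dropdown once a condition is selected."""
    return drugs_by_condition(reviews).get(condition, [])
```

In a real app the index would be built once when the data is loaded, rather than on every selection as drug_choices does here.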
Future Work
In the future, I would like to use age and sex as factors when viewing the ratings. I could also add a word cloud of the most common words in the comments for each category selected. I could scrape other data for each drug as well, such as uses, side effects, how to use it, how long it takes for one to get the full benefit from the drug, interactions with other drugs, and allergies. Lastly, I could find the prices for the drugs from pharmacies in the area.
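A word cloud starts from word frequencies, so one possible first step is sketched below in Python. The function name, the small stopword list, and the rule of dropping words shorter than three letters are all illustrative assumptions, not part of the existing project.

```python
import re
from collections import Counter

# A deliberately tiny, illustrative stopword list; a real one would be longer.
STOPWORDS = frozenset({
    "the", "and", "for", "was", "this", "that", "with", "have", "had",
    "but", "not", "you", "has", "are", "after", "from", "been", "very",
})


def top_words(comments, n=20):
    """Return the n most frequent words across comments as (word, count) pairs."""
    counts = Counter()
    for comment in comments:
        words = re.findall(r"[a-z']+", comment.lower())
        counts.update(w for w in words if len(w) > 2 and w not in STOPWORDS)
    return counts.most_common(n)
```

The resulting pairs could then be passed to whichever word cloud tool is chosen, with the count setting each word's size.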
Link to GitHub repository and Shiny app:
https://github.com/anishaluthra/webmd
https://anishaluthra.shinyapps.io/WebmdShinyApp/