Anti-epileptic Drug Review Analysis
When people have a fever, a headache, or a bad cold, they often reach for Tylenol. They might ask which symptom Tylenol relieves best compared with other medicines. More generally, how do customers choose medications based on reviews? These questions can be answered by analyzing drug reviews. For this project, I am interested in analyzing drug reviews for epilepsy. The project presents exploratory data analysis, sentiment analysis, and an analysis of the relation between rating and polarity.
Epilepsy is a group of neurological disorders characterized by epileptic seizures, and it is more common in older people. Epileptic seizures are episodes that can vary from nearly undetectable to prolonged. The cause of most cases of epilepsy is unknown, although epilepsy can have both genetic and acquired causes. Seizures are often brought on by factors such as stress, alcohol abuse, or a lack of sleep. Nearly 80% of cases occur in the developing world. Seizures are controllable with medication in about 70% of cases. How effective are the anti-epileptic drugs? Do they work differently for people with different symptoms? Is a drug more effective for a particular symptom? Pharmaceutical scientists might be interested in these questions so that they can use the answers to improve the drugs further. Also, how do reviews help customers choose drugs? Answering these types of questions requires an analysis of drug reviews.
www.Drugs.com hosts customer drug reviews. For this project, we selected four anti-epileptic drugs (Gabapentin, Clonazepam, Topiramate, and Pregabalin) for the review analysis.
- Gabapentin: Used in adults to treat nerve pain
- Clonazepam: Used to treat certain seizure disorders (including absence seizures or Lennox-Gastaut syndrome) in adults and children
- Topiramate: Helps to reduce seizure activity and to prevent migraine headaches
- Pregabalin: Slows down impulses in the brain that cause seizures
Below is a snapshot of the data of interest from the site. Each customer review includes a condition, a review comment, and a rating. In the snapshot, the information to be collected is the condition (For Reflex Sympathetic Dystrophy Syndrome: ), the comment ("I was diagnosed with RSD on Tuesday. ... I like these pills so far!"), and the rating (10).
The review pages are scraped with BeautifulSoup.
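As a rough illustration of the scraping step, the sketch below pulls the three fields out of one review block. The HTML snippet, the tag names, and the class names (`review`, `condition`, `comment`, `rating`) are placeholders invented for this example; the real page markup is not shown in this write-up and would need to be inspected before adapting the selectors. `parse_reviews` is likewise a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the tags and class names are placeholders,
# not the actual structure of the review pages.
SAMPLE_HTML = """
<div class="review">
  <b class="condition">For Reflex Sympathetic Dystrophy Syndrome:</b>
  <p class="comment">"I was diagnosed with RSD on Tuesday. ... I like these pills so far!"</p>
  <span class="rating">10</span>
</div>
"""


def parse_reviews(html):
    """Return one dict (condition, comment, rating) per review block."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.select("div.review"):
        rows.append({
            "condition": block.select_one(".condition").get_text(strip=True),
            "comment": block.select_one(".comment").get_text(strip=True),
            "rating": int(block.select_one(".rating").get_text(strip=True)),
        })
    return rows


reviews = parse_reviews(SAMPLE_HTML)
```

Each dict can then become one row of the data frame, with the condition text still in its raw form.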
The main item to be cleaned is the condition column. For a condition such as "For Reflex Sympathetic Dystrophy Syndrome:", the leading word 'For' and the trailing ':' are deleted. Also, if the condition is "Neurontin (gabapentin) for Postherpetic Neuralgia", the prefix "Neurontin (gabapentin) for " is deleted.
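The two cleaning rules can be sketched with regular expressions. This is a minimal version that assumes the only prefixes are a leading "For" and a "Brand (generic) for " pattern like the examples given here; `clean_condition` is a hypothetical helper name, and other prefix shapes in the real data would need their own rules.

```python
import re


def clean_condition(raw):
    """Reduce a raw condition string to the bare condition name."""
    # Remove surrounding whitespace and the trailing colon.
    text = raw.strip().rstrip(":").strip()
    # Drop a "Brand (generic) for " prefix, e.g. "Neurontin (gabapentin) for ".
    text = re.sub(r"^[^()]*\([^)]*\)\s+for\s+", "", text, flags=re.IGNORECASE)
    # Drop a leading "For ".
    text = re.sub(r"^for\s+", "", text, flags=re.IGNORECASE)
    return text.strip()


cleaned = [
    clean_condition("For Reflex Sympathetic Dystrophy Syndrome:"),
    clean_condition("Neurontin (gabapentin) for Postherpetic Neuralgia"),
]
# cleaned == ["Reflex Sympathetic Dystrophy Syndrome", "Postherpetic Neuralgia"]
```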
Below is the head of the data frame.
Exploratory Data Analysis
Review Count By Condition
Gabapentin: most reviews are for Pain and Anxiety, followed by Fibromyalgia and Peripheral Neuropathy.
Clonazepam: most reviews are for Anxiety and Panic Disorder.
Topiramate: Migraine has more reviews than any other condition.
Pregabalin: Fibromyalgia and Generalized Anxiety Disorder have the most reviews.
This analysis shows which conditions each drug is most often reviewed for.
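Counting reviews per condition for one drug is a simple tally. The sketch below uses only the standard library; the rows are made up for illustration and do not reproduce the scraped data, and `count_by_condition` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical rows, invented for illustration only.
reviews = [
    {"drug": "Gabapentin", "condition": "Pain"},
    {"drug": "Gabapentin", "condition": "Pain"},
    {"drug": "Gabapentin", "condition": "Pain"},
    {"drug": "Gabapentin", "condition": "Anxiety"},
    {"drug": "Gabapentin", "condition": "Anxiety"},
    {"drug": "Gabapentin", "condition": "Fibromyalgia"},
    {"drug": "Topiramate", "condition": "Migraine"},
    {"drug": "Topiramate", "condition": "Migraine"},
]


def count_by_condition(rows, drug):
    """Tally how many reviews each condition has for the given drug."""
    return Counter(row["condition"] for row in rows if row["drug"] == drug)


gabapentin_counts = count_by_condition(reviews, "Gabapentin")
top_two = gabapentin_counts.most_common(2)
# top_two == [("Pain", 3), ("Anxiety", 2)]
```

Sorting the tally with `most_common` gives the same ordering a bar chart of review counts would show.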
Average Rating By Condition
Gabapentin: Cough has the highest average rating of all conditions.
Clonazepam: Temporomandibular Joint Disorder, Chronic Myofascial Pain, and Obsessive Compulsive Disorder have the highest ratings.
Topiramate: Seizures has the highest rating, followed by Diabetic Peripheral Neuropathy and Tourette's Syndrome.
Pregabalin: Occipital Neuralgia has the highest rating, followed by Restless Legs Syndrome and Dercum's Disease.
In this drug review analysis, Gabapentin is rated highest for Cough; Clonazepam for Temporomandibular Joint Disorder, Chronic Myofascial Pain, and Obsessive Compulsive Disorder; Topiramate for Seizures; and Pregabalin for Occipital Neuralgia. Pharmaceutical companies could look at the conditions with lower average ratings and do further analysis on the drug ingredients.
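The per-condition averages can be computed by grouping ratings and taking the mean of each group. The sketch below is one possible way to do it, with invented rows and a hypothetical helper name; it also returns the number of reviews behind each average, since a top-ranked condition may rest on very few reviews.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows, invented for illustration only.
rows = [
    {"condition": "Cough", "rating": 10},
    {"condition": "Pain", "rating": 8},
    {"condition": "Pain", "rating": 6},
    {"condition": "Anxiety", "rating": 9},
    {"condition": "Anxiety", "rating": 8},
    {"condition": "Anxiety", "rating": 7},
]


def average_rating_by_condition(rows):
    """Return (condition, mean rating, review count), highest mean first."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["condition"]].append(row["rating"])
    summary = [(cond, mean(vals), len(vals)) for cond, vals in grouped.items()]
    return sorted(summary, key=lambda item: item[1], reverse=True)


ranking = average_rating_by_condition(rows)
# ranking == [("Cough", 10, 1), ("Anxiety", 8, 3), ("Pain", 7, 2)]
```

In this toy data, "Cough" ranks first on the strength of a single review, which is the kind of case the count column is meant to flag.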
Average Rating Comparison
Clonazepam has the highest average rating of the four drugs, and Topiramate has the lowest.
Clonazepam has the best polarity (more positive and fewer negative comments) of the four drugs.
Review subjectivity scores range from 0 to 1, where 0 is most objective and 1 is most subjective. The distribution has the same shape for all four drugs: it is almost normal, with a small number of reviews at the more objective end.
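To make the idea of a polarity score concrete, here is a deliberately tiny lexicon-based scorer. It is not the scorer used for the results in this project; the word list and weights are invented for illustration, and real sentiment libraries use far larger lexicons and handle negation and intensifiers.

```python
import re

# Toy lexicon, invented for illustration; weights run from -1 to 1.
LEXICON = {
    "good": 0.7,
    "great": 0.8,
    "like": 0.5,
    "bad": -0.7,
    "worse": -0.5,
    "miserable": -1.0,
}


def polarity(text):
    """Average the weights of lexicon words in the text; 0.0 if none appear."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [LEXICON[word] for word in words if word in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


positive = polarity("I like these pills so far!")        # 0.5
negative = polarity("I was miserable before taking it")  # -1.0
neutral = polarity("I take it twice a day")              # 0.0
```

A scorer like this looks only at the words present, so a review that mostly describes past illness can score negative even when the reviewer is satisfied with the drug.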
Relation Between Rating and Polarity
Do the review comments match the ratings? If a rating is higher, will the comment be more positive? We look at the scatter plots to find out.
Surprisingly, we do not see a strong correlation between rating and review polarity, although higher ratings do correspond to higher positive scores. In the Gabapentin scatter plot, one review with a rating of 10 has a very negative polarity. The reason is that the review describes how ill the customer was before, using the word 'miserable', while its other words are neutral. For each rating, the majority of reviews fall between -0.5 and 0.5 polarity. This is quite normal, as reviewers describe their illness in negative terms while describing how the drugs treated them in positive terms.
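The visual impression from the scatter plots could be backed by a correlation coefficient. The sketch below computes Pearson's r by hand using only the standard library; the rating and polarity values are invented for illustration and are not the project's data, so no result is quoted for them.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values: the last pair mimics a rating of 10 with a
# very negative polarity, like the Gabapentin outlier described above.
ratings = [1, 3, 5, 8, 10, 10]
polarities = [-0.2, 0.1, 0.0, 0.3, 0.4, -0.6]

r = pearson(ratings, polarities)
```

A value of r near 0 would match the weak relationship described above, while a value near 1 would indicate that polarity tracks rating closely. The function divides by zero if either input is constant, so it assumes both columns vary.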
- The scatter plot of rating against polarity does not show a clear positive correlation between them. On average, the polarity is always slightly away from neutral, whether positive or negative. We do see that polarity scores are higher for ratings above 7.
- In terms of subjectivity, a small group of people write more objective reviews, which makes the overall set of reviews less biased.
- Based on the sentiment analysis, customers should choose Clonazepam out of the four drugs in this case study because it has the highest positive score and the lowest negative score. More generally, customers could compare more drugs by using this technique.