Web Scraping the Running Shoe Portal runrepeat.com

Posted on Dec 15, 2017

Motivation

Whether you run for fitness or you are a marathon runner, finding the best-fitting shoe among the many choices at a running store isn’t always easy.

Research Questions

What are popular shoe brands?

What are the popular shoes for specific needs?

Which features may have a critical influence on customer satisfaction?

Data Collection

For my web scraping project, I decided to scrape http://www.runrepeat.com, a running shoe discovery and review platform. It has over 134,867 expert reviews and over 1,000 shoes for users to choose from.

To narrow my research scope, I focused on the top women's running shoes across all categories. I was able to scrape 400+ shoes with top scores in terms of popularity and top reviews. For the product dataset, I scraped the brand name, shoe name, overall product rating, run score, rank, summary, and reviews. The review dataset also includes shoe details such as terrain, use, release date, score, reviews, and review summary. My web scraping code is available on GitHub.
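To illustrate the kind of extraction described above, here is a minimal sketch that pulls brand, name, and score out of a product listing using only Python's standard library. It is not the code from the GitHub repository: the sample HTML, the class names (`product`, `brand`, `name`, `score`), and the helper `parse_products` are all hypothetical, and the real markup on runrepeat.com will differ.

```python
from html.parser import HTMLParser

# Hypothetical markup: these class names are illustrative and are NOT
# taken from runrepeat.com's real pages.
SAMPLE_HTML = """
<ul>
  <li class="product">
    <span class="brand">Brand A</span>
    <span class="name">Shoe One</span>
    <span class="score">92</span>
  </li>
  <li class="product">
    <span class="brand">Brand B</span>
    <span class="name">Shoe Two</span>
    <span class="score">88</span>
  </li>
</ul>
"""

# Fields to capture from each product entry (assumed names).
FIELDS = {"brand", "name", "score"}


class ProductParser(HTMLParser):
    """Collect one dict per <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.products = []    # finished records
        self._current = None  # record currently being filled in
        self._field = None    # field whose text we are inside, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class", "")
        if tag == "li" and css_class == "product":
            self._current = {}
        elif tag == "span" and css_class in FIELDS and self._current is not None:
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside one of the tracked <span> fields.
        if self._field is not None:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current is not None:
            self.products.append(self._current)
            self._current = None


def parse_products(html):
    """Return a list of product dicts parsed from an HTML string."""
    parser = ProductParser()
    parser.feed(html)
    return parser.products


products = parse_products(SAMPLE_HTML)
for product in products:
    print(product)
```

In practice the HTML would come from an HTTP response rather than an inline string, and a dedicated scraping library would handle pagination and malformed markup more robustly than this hand-rolled parser.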

Exploratory Data Analysis

Price Distribution

Rating vs Number of Reviews

Word Cloud of good reviews

Word Cloud of bad reviews
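Word clouds are driven by word frequencies, so the comparison between good and bad reviews can be approximated with a simple count. The sketch below assumes the reviews have already been split into two lists; the review snippets, the stop-word list, and the helper `word_frequencies` are made up for illustration and are not the scraped data or the original analysis code.

```python
import re
from collections import Counter

# Made-up review snippets for illustration; not real scraped data.
good_reviews = [
    "Very comfortable and lightweight shoe",
    "Comfortable fit, great cushioning",
]
bad_reviews = [
    "Too narrow and the sole wore out quickly",
    "Narrow toe box, poor durability",
]

# A tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"and", "the", "too", "very", "a", "out"}


def word_frequencies(reviews):
    """Count lowercase words across reviews, skipping stop words."""
    counts = Counter()
    for text in reviews:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


good_counts = word_frequencies(good_reviews)
bad_counts = word_frequencies(bad_reviews)

# The most frequent words are the ones a word cloud would draw largest.
print(good_counts.most_common(3))
print(bad_counts.most_common(3))
```

The resulting counts could be passed to a plotting or word cloud library to produce the figures; that rendering step is left out here.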

Source

http://www.runrepeat.com

About Author

Lalith Sugavanam

Lalith holds a Master's degree in computer applications from Bharathidasan University, India. She loves to program and has more recently become fascinated with extracting meaning from data. She's currently pursuing a 12-week Data Science course at...