Web Scraping the Running Shoe Portal runrepeat.com

Posted on December 15, 2017

Motivation

Whether you run for fitness or you are a marathon runner, finding the best-fitting shoe among the many choices at a running store isn’t always easy.

Research Questions

What are popular shoe brands?

What are the popular shoes for specific needs?

Which features may have a critical influence on customer satisfaction?

Data Collection

For my web scraping project, I decided to scrape http://www.runrepeat.com, a running shoe discovery and review platform. It lists 134,867 expert reviews and more than 1,000 shoes for users to choose from.

To narrow the scope of my research, I focused on the top women's running shoes across all categories. I was able to scrape 400+ shoes with top scores in terms of popularity and reviews. For the product dataset, I scraped brand name, shoe name, overall product rating, run score, rank, summary, and reviews. The review dataset additionally includes shoe details such as terrain, use, release date, score, reviews, and review summary. My web scraping code is available on GitHub.
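To illustrate how fields such as brand, shoe name, and score can be pulled out of a listing page, here is a minimal sketch using only Python's standard library. It is not the code from my repository: the HTML snippet, the class names (`product`, `brand`, `name`, `score`), and the helper `parse_products` are all hypothetical placeholders, since the site's real markup is not reproduced in this post.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real page structure of runrepeat.com is not shown
# in this post, so these tags and class names are placeholders.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="brand">Brand A</span>
      <span class="name">Shoe One</span><span class="score">92</span></li>
  <li class="product"><span class="brand">Brand B</span>
      <span class="name">Shoe Two</span><span class="score">88</span></li>
</ul>
"""

# Fields we want to capture (assumed class names).
FIELDS = {"brand", "name", "score"}


class ProductParser(HTMLParser):
    """Collect one dict per <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.products = []
        self._current = None  # dict being filled for the open product
        self._field = None    # field whose text is currently being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class", "")
        if tag == "li" and css_class == "product":
            self._current = {}
        elif tag == "span" and css_class in FIELDS and self._current is not None:
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside one of the tracked spans.
        if self._field is not None:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current is not None:
            self.products.append(self._current)
            self._current = None


def parse_products(html):
    """Return a list of product dicts parsed from an HTML string."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return parser.products


products = parse_products(SAMPLE_HTML)
```

In a real scraper, the HTML string would come from an HTTP response rather than an inline sample, and the selectors would have to match the site's actual markup.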

Exploratory Data Analysis

Price Distribution

Rating vs Number of Reviews

Word Cloud of good reviews

Word Cloud of bad reviews
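A word cloud sizes each word by how often it appears, so the underlying step is a word frequency count. The sketch below shows that step only, with assumptions stated up front: the sample reviews are invented for illustration (they are not from the scraped dataset), the stopword list is a tiny hand-picked set, and `word_frequencies` is a hypothetical helper name.

```python
import re
from collections import Counter

# A deliberately small stopword list, for illustration only.
STOPWORDS = {"the", "a", "and", "is", "it", "my", "too", "for", "but", "are", "very"}


def word_frequencies(reviews):
    """Count non-stopword tokens across a list of review strings."""
    counts = Counter()
    for text in reviews:
        # Lowercase and keep runs of letters and apostrophes as words.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


# Invented example reviews, not taken from the scraped data.
good_reviews = ["Very comfortable and light", "Comfortable fit, great cushioning"]
bad_reviews = ["The sole is too narrow", "Narrow toe box and poor durability"]

good_counts = word_frequencies(good_reviews)
bad_counts = word_frequencies(bad_reviews)
```

The resulting counts can then be passed to a plotting or word cloud tool of your choice; words with higher counts would be drawn larger.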

Source

http://www.runrepeat.com


About Author

Lalith Sugavanam

Lalith holds a Master's degree in computer applications from Bharathidasan University, India. She loves to program and has more recently progressed into a fascination with extracting meaning from data. She's currently pursuing a 12-week Data Science course at...
