The Pains of Growing an e-Commerce Business: A Case Study on Etsy

Chris Valle
Posted on Nov 21, 2016

Etsy, A Snapshot

Looking for that special gift for a loved one, something handmade or vintage, might send you to Etsy. It's no surprise that this once-micro eCommerce website for hobbyists grew from 650,000 members in 2008 to 5 million in 2010 and 54 million in 2014. Rob Kalin, its founder, learned how to build websites by accident, as a way to pay his rent. He enjoyed creating things himself, which inspired him to develop a website where other artists like him could sell their work.

 

Data Gathering Through Web Scraping: Baby Carriers Category

To get a good sense of the kinds of customers and sellers on Etsy, I picked a specific product category to analyze: baby carriers. Using Scrapy, a Python web-scraping framework, I gathered the following information from the eCommerce site:

  1. Product Name
  2. Product Price
  3. Product Views
  4. Seller Name
  5. Seller Rating
  6. Seller Location
  7. Seller Items

This yielded 28,143 observations with 7 features available for analysis. After pre-processing the data, I performed graphical and numerical exploratory data analysis using R. What follows are my initial findings.
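The pre-processing step is not described in detail. As a rough sketch, assuming the scraper returns every field as raw text, cleaning one scraped row might look like the code below. The `Listing` class, its field names, and the `parse_number` and `clean_row` helpers are hypothetical names chosen for illustration, not the code used for this analysis.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Listing:
    """One scraped baby-carrier listing (hypothetical field names)."""
    product_name: str
    product_price: Optional[float]  # assumed to be in USD
    product_views: Optional[int]
    seller_name: str
    seller_rating: Optional[float]
    seller_location: str
    seller_items: Optional[int]


# First number in a string, allowing thousands separators and decimals.
_NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_number(text):
    """Return the first number found in text as a float, or None."""
    match = _NUMBER.search(text)
    if match is None:
        return None
    return float(match.group(0).replace(",", ""))


def _to_int(value):
    """Convert a parsed float to int, passing None through unchanged."""
    return int(value) if value is not None else None


def clean_row(raw):
    """Build a Listing from a dict of raw scraped strings."""
    return Listing(
        product_name=raw.get("product_name", "").strip(),
        product_price=parse_number(raw.get("product_price", "")),
        product_views=_to_int(parse_number(raw.get("product_views", ""))),
        seller_name=raw.get("seller_name", "").strip(),
        seller_rating=parse_number(raw.get("seller_rating", "")),
        seller_location=raw.get("seller_location", "").strip(),
        seller_items=_to_int(parse_number(raw.get("seller_items", ""))),
    )
```

Missing or unparseable numbers become `None` rather than zero, so they can be filtered out before any averages or counts are computed.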

 

 

Visitors of Etsy Drawn to Lower-Priced Products

For the baby carrier product category, it was interesting to find that the most viewed items were those priced at $25 - $50, something I did not expect from a handmade and vintage marketplace.

[Figure: product views by price range]

 

 

High Volume of Lower-Cost Product Inventory

The baby carrier inventory leans toward the lower end of the price range: items priced at $25 and below account for the bulk of listings, followed by the $25 - $50 range.

[Figure: number of listings by price range]
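Both price-range breakdowns described so far, views and listing counts, come down to grouping listings into price bands and aggregating. The analysis itself was done in R; the following is only a minimal Python sketch of the idea. The "$25 and below" and "$25 - $50" bands come from the text above, while the higher bands and the names `PRICE_BANDS`, `price_band`, and `summarize` are assumptions made for illustration.

```python
from collections import defaultdict

# (upper bound in USD, label). Only the first two bands appear in the
# article; the remaining bands are assumed for the sake of the example.
PRICE_BANDS = [
    (25, "$25 and below"),
    (50, "$25 - $50"),
    (100, "$50 - $100"),
]
TOP_BAND = "Over $100"


def price_band(price):
    """Return the label of the first band whose upper bound covers price."""
    for upper, label in PRICE_BANDS:
        if price <= upper:
            return label
    return TOP_BAND


def summarize(listings):
    """Aggregate (price, views) pairs into per-band counts and view totals."""
    summary = defaultdict(lambda: {"count": 0, "views": 0})
    for price, views in listings:
        band = summary[price_band(price)]
        band["count"] += 1
        band["views"] += views
    return dict(summary)
```

Called on a list of `(price, views)` pairs, `summarize` returns one dictionary per band, with `count` corresponding to the inventory chart and `views` to the views chart.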

 

Predictive Model To Forecast Product Demand: Linear Regression

I wanted to create a model that predicted product demand using product sales volume and views. However, the Etsy website publishes only total shop sales, with no breakdown per product. As a substitute, I used product views to estimate product demand.

After running the linear regression diagnostics, the results were as follows:

  • there was no significant correlation among the variables
  • the samples were not drawn from a normal distribution
  • the input variables were not independent of each other
  • the scatter plot showed no linear relationship
  • therefore, linear regression is not suitable for predicting views from feedback, price, and item count
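Two of the checks listed above, correlation and linearity, can be illustrated with a small calculation. The diagnostics in this post were run in R; the following is only a plain-Python sketch of the underlying arithmetic for a single predictor. The function names `pearson_r` and `fit_line` are hypothetical.

```python
import math


def _centered_sums(xs, ys):
    """Return (sxy, sxx, syy, mean_x, mean_y) for two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy, mean_x, mean_y


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    sxy, sxx, syy, _, _ = _centered_sums(xs, ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    sxy, sxx, syy, mean_x, mean_y = _centered_sums(xs, ys)
    if sxx == 0 or syy == 0:
        raise ValueError("fit is undefined for a constant sample")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

A correlation near zero and an R-squared near zero between a predictor such as price and the view count would be consistent with the conclusion above that a straight line explains little of the variation in views. These two numbers do not cover the normality or independence checks.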

 

Takeaways

  • the data shows interest in more affordable baby carriers
  • the proliferation of handmade baby carriers priced below $50 raises the question: are they truly handmade?
  • linear regression is not suitable for predicting product views in Etsy’s baby carrier category, based on the scraped data
  • more success can lead a business to move away from its original brand identity and values. Whether that is for better or worse has yet to be investigated against agreed KPIs.
  • Etsy seems to have shifted from niche to mass-market patterns

Next Steps

  • Scrape reviews and perform text analysis
  • Product name analysis
  • Shop Inventory analysis
  • Shop location analysis
  • Product pricing recommendation for sellers
  • Shiny app

About Author

Chris Valle


Chris is a Digital Strategy Manager and Marketer who, for 10 years, has been combining her data-driven insights and customer-centric marketing strategies to grow her clients' business. Her forte is monetizing digital and mobile channels to drive international...