The Pains of Growing an e-Commerce Business: A Case Study on Etsy
Etsy, A Snapshot
If you are looking for a special handmade or vintage gift for a loved one, the search might send you to Etsy. It is no surprise that this once-micro eCommerce website for hobbyists grew from 650,000 members in 2008 to 5 million in 2010 and 54 million in 2014. Rob Kalin, its founder, learned how to build websites almost by accident, as a way to pay his rent. He enjoyed creating things himself, which inspired him to develop a website where other artists like him could sell their work.
Data Gathering Through Web Scraping: Baby Carriers Category
To get a good sense of the kinds of customers and sellers on Etsy, I picked a specific product category to analyze: baby carriers. Using Python's Scrapy, I gathered the following information from the eCommerce site:
- Product Name
- Product Price
- Product Views
- Seller Name
- Seller Rating
- Seller Location
- Seller Items
This yielded 28,143 observations and 7 features available for analysis. After pre-processing the data, I performed graphical and numerical exploratory data analysis using R. What follows are my initial findings.
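The pre-processing step is not described in detail. As a rough sketch, assuming the scraper emits every field as a raw string (for example "$42.50" or "1,204 views"), the cleaning could look something like the following in Python. The field names, the sample row, and the `to_number` and `clean` helpers are all hypothetical and chosen only to mirror the seven features listed above.

```python
import re
from dataclasses import dataclass


@dataclass
class Listing:
    """One scraped listing; field names are illustrative assumptions."""
    product_name: str
    price: float
    views: int
    seller_name: str
    seller_rating: float
    seller_location: str
    seller_items: int


def to_number(raw: str) -> float:
    """Pull the first number out of a scraped string, dropping symbols and labels."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        raise ValueError(f"no number found in {raw!r}")
    return float(match.group().replace(",", ""))


def clean(row: dict) -> Listing:
    """Convert a dict of raw scraped strings into a typed Listing."""
    return Listing(
        product_name=row["product_name"].strip(),
        price=to_number(row["price"]),
        views=int(to_number(row["views"])),
        seller_name=row["seller_name"].strip(),
        seller_rating=to_number(row["seller_rating"]),
        seller_location=row["seller_location"].strip(),
        seller_items=int(to_number(row["seller_items"])),
    )


# Made-up raw row for illustration; not taken from the scraped data.
raw_row = {
    "product_name": "  Linen ring sling ",
    "price": "$42.50",
    "views": "1,204 views",
    "seller_name": "ExampleShop",
    "seller_rating": "4.8",
    "seller_location": "Portland, Oregon",
    "seller_items": "37 items",
}
listing = clean(raw_row)
```

Typing the fields up front makes later numeric summaries straightforward, and rows that fail to parse raise an error instead of silently becoming missing values.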
Visitors of Etsy Drawn to Lower-Priced Products
For the baby carrier category, the most viewed products were those priced $25 - $50, something I did not expect from a handmade and vintage marketplace.
High Volume of Lower-Cost Product Inventory
The baby carrier inventory leans toward the lower end of the price range: items priced $25 and below account for the bulk, followed by the $25-$50 range.
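The two findings above come from grouping listings into price bands and comparing listing counts and views per band. A minimal Python sketch of that grouping is shown below. It assumes each listing has been reduced to a (price, views) pair; the bands above $50, the treatment of the $25 boundary, and the sample values are assumptions for illustration, not the scraped data or the original R code.

```python
from collections import defaultdict

# Inclusive upper bounds of each price band. Only the first two bands are
# named in the article; the others are assumed for illustration.
BANDS = [(25, "$25 and below"), (50, "$25-$50"), (100, "$50-$100")]
TOP_BAND = "over $100"


def price_band(price: float) -> str:
    """Return the label of the first band whose upper bound covers the price."""
    for upper, label in BANDS:
        if price <= upper:
            return label
    return TOP_BAND


def summarize(listings):
    """Return {band: [listing_count, total_views]} for (price, views) pairs."""
    summary = defaultdict(lambda: [0, 0])
    for price, views in listings:
        entry = summary[price_band(price)]
        entry[0] += 1       # inventory: number of listings in the band
        entry[1] += views   # interest: total views in the band
    return dict(summary)


# Made-up (price, views) pairs for illustration only.
sample = [(12.0, 40), (19.99, 55), (25.0, 10), (35.0, 300), (48.0, 220), (120.0, 15)]
summary = summarize(sample)
for band, (count, views) in summary.items():
    print(f"{band}: {count} listings, {views} views")
```

Keeping count and views side by side makes it easy to see when a band holds most of the inventory but a different band draws most of the attention.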
Predictive Model To Forecast Product Demand: Linear Regression
I wanted to create a model that predicted product demand using product sales volume and views. However, Etsy publishes only total shop sales, with no breakdown per product. As a substitute, I used product views to estimate product demand.
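Before trusting a model of this kind, it helps to check how strongly views relate to each candidate predictor. The sketch below computes the Pearson correlation and a one-predictor least-squares line in plain Python. It is not the R analysis used here; the function names, predictor names, and sample values are hypothetical, and a full diagnostic would also examine residuals.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("cannot fit a line to a constant predictor")
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Made-up values for illustration; not the scraped data.
views = [120, 340, 90, 410, 60]
predictors = {
    "price": [18.0, 32.0, 75.0, 45.0, 110.0],
    "seller_rating": [4.9, 4.7, 5.0, 4.8, 4.6],
    "seller_items": [12, 85, 7, 40, 230],
}
for name, values in predictors.items():
    slope, intercept = fit_line(values, views)
    print(f"{name}: r={pearson(values, views):.2f}, slope={slope:.2f}")
```

A correlation near zero for every predictor is an early sign that a straight-line model of views will explain little of the variation.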
After running the linear regression diagnostics, the results were as follows:
- there was no significant correlation among the variables
- the samples were not drawn from a normal distribution
- the input variables were not independent of each other
- the scatter plot showed no linear relationship
- therefore, linear regression is not suitable for predicting views from feedback, price, and item count
Conclusions
- the data shows interest in more affordable baby carriers
- the proliferation of handmade baby carriers priced below $50 raises the question: are they truly handmade?
- linear regression is not suitable for predicting product views in Etsy's baby carrier category, based on the data scraped
- more success can lead a business to move away from its original brand identity and values. Whether that is for better or worse has yet to be investigated against agreed KPIs.
- Etsy seems to have shifted from niche to mass-market patterns
Next Steps
- Scrape reviews and perform text analysis
- Product name analysis
- Shop Inventory analysis
- Shop location analysis
- Product pricing recommendation for sellers
- Shiny app