Web Scraping to Visualize Trends in Deals Using Data
Motivation:
In a world full of deals and coupons, have you ever wondered which deals are actually good deals?
Anyone familiar with consumer psychology can tell you that people love deals. Those huge red signs saying "30% off" or "Buy 1 Get 1 Free" are very attractive to consumers, so much so that many companies keep items on sale all year round.
This raises a question about the quality of these deals. Do they exist because the items are of poorer quality (e.g. a jacket with a scratch on the back)? Do they exist because the product is obsolete (e.g. floppy disks)? Do they exist because the inventory is low or the item is out of season? Whatever it may be, there is always a reason. The interesting question is: is this deal a good deal, and will it save me money?
About dealmoon.com:
Dealmoon.com is very similar to groupon.com: it gathers information on deals and coupons from merchants in the U.S. and groups them into different categories (e.g. Clothing, Electronics, Baby). All of this information is available on their website for free.
Web Scraping:
I used the Selenium package in Python to scrape all the data.
Some logistics about the data I scraped:
- Total of ~45,000 deals from 8 categories (i.e. Clothing, Beauty, Nutrition, Baby, Home, Electronics, Travel, Finance)
- Total of 6 attributes (i.e. category of deal, deal title, deal description, posted time, number of comments, number of bookmarks)
- The entire crawling process took ~6 hours.
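To make the scraping step more concrete, here is a minimal sketch of how one category page might be collected with Selenium. It is not the code used for this post: the `Deal` record simply mirrors the six attributes listed above, and the function names, the CSS selectors (`.deal-card`, `.title`, and so on) and the choice of Chrome are all hypothetical placeholders. The counter parsing also assumes that comment and bookmark counts appear as plain integers, possibly with thousands separators.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Deal:
    """One scraped deal, holding the six attributes listed above."""
    category: str
    title: str
    description: str
    posted_time: str
    n_comments: int
    n_bookmarks: int


def parse_count(text: str) -> int:
    """Convert a counter label such as '1,204' to an int; no digits means 0."""
    digits = "".join(ch for ch in text if ch.isdigit())
    return int(digits) if digits else 0


def scrape_category(url: str, category: str) -> List[Deal]:
    """Load one category page and collect the deals currently shown on it."""
    # Imported lazily so the helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    def text_of(element, selector: str) -> str:
        return element.find_element(By.CSS_SELECTOR, selector).text

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        deals = []
        # Every selector below is an assumption; inspect the real page first.
        for card in driver.find_elements(By.CSS_SELECTOR, ".deal-card"):
            deals.append(Deal(
                category=category,
                title=text_of(card, ".title"),
                description=text_of(card, ".description"),
                posted_time=text_of(card, ".posted-time"),
                n_comments=parse_count(text_of(card, ".comments")),
                n_bookmarks=parse_count(text_of(card, ".bookmarks")),
            ))
        return deals
    finally:
        # Always close the browser, even if a selector fails.
        driver.quit()
```

Keeping the parsing helper separate from the browser code means it can be checked without launching a browser, which helps when a crawl takes hours.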
Visualisations:
- What are the popular deals?
For me, when I try to find good deals I always check the popular deals: those with a lot of bookmarks and comments. My rationale is that if a deal has high popularity, it must be good; the chance of a group of people bookmarking a bad deal is low. Under this assumption, I first explored the popular deals.
Because not everyone defines popularity the same way, the app lets users define "popularity" by whichever metric they like: the number of bookmarks, the number of comments, or both.
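The metric choice can be sketched in a few lines of Python. This is only an illustration, not the app's actual code: the function names and the dictionary keys are hypothetical, and treating "both" as the sum of bookmarks and comments is an assumption, since the post does not say how the two counts are combined.

```python
def popularity(deal, metric="both"):
    """Score a deal by bookmarks, comments, or both (assumed here to be their sum)."""
    if metric == "bookmarks":
        return deal["bookmarks"]
    if metric == "comments":
        return deal["comments"]
    if metric == "both":
        return deal["bookmarks"] + deal["comments"]
    raise ValueError(f"unknown metric: {metric!r}")


def top_deals(deals, metric="both", n=10):
    """Return the n highest-scoring deals under the chosen metric."""
    return sorted(deals, key=lambda d: popularity(d, metric), reverse=True)[:n]


# Made-up records, purely for illustration.
sample = [
    {"title": "Deal A", "bookmarks": 120, "comments": 4},
    {"title": "Deal B", "bookmarks": 30, "comments": 60},
    {"title": "Deal C", "bookmarks": 80, "comments": 50},
]

for deal in top_deals(sample, metric="comments", n=2):
    print(deal["title"], popularity(deal, "comments"))
```

Switching the `metric` argument changes the ranking without touching the data, which is the behaviour the app exposes to its users.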
- Which stores always have good deals?
By now, you know enough about the functionality of this app to explore this topic on your own, my dear reader. Find the link to the app at the end of this post, and find out which stores always have good deals. You may be surprised!
- When are there the most deals?
Future Directions:
All the above-mentioned visualisations will help us understand which deals are good deals and which stores always have good deals. However, one drawback is that they are all post-hoc analyses: they only tell users which deals to take advantage of AFTER other users have used the deal. By then, it may be too late: the deal is no longer valid or the item has sold out. Therefore, to take full advantage of past deals, one approach is to use Natural Language Processing to extract patterns from previous good deals and use them to classify new deals as good or bad in real time.
The patterns may be the type of deal (e.g. 'Buy 1 get 1 free', 'Free shipping for orders over $100'), the duration of the deal (e.g. 'Today only', 'Valid for this weekend'), or the percentage discount (e.g. '$50, originally $100', '$100, originally $250').
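As a rough starting point, the patterns above could be pulled out of a deal title with regular expressions before any heavier NLP is applied. The sketch below is only an assumption about how this might look: the regular expressions, the function name `extract_features` and the feature names are all hypothetical, and they cover only the example phrasings quoted in this post. The discount is computed as 100 × (1 − sale price / original price).

```python
import re

# Hypothetical patterns covering only the example phrasings in the text.
PRICE_RE = re.compile(
    r"\$(\d+(?:\.\d+)?),?\s*originally\s*\$(\d+(?:\.\d+)?)", re.IGNORECASE
)
BOGO_RE = re.compile(r"buy\s+\d+\s+get\s+\d+\s+free", re.IGNORECASE)
FREE_SHIPPING_RE = re.compile(r"free\s+shipping", re.IGNORECASE)
LIMITED_TIME_RE = re.compile(r"today\s+only|this\s+weekend", re.IGNORECASE)


def extract_features(text):
    """Extract simple deal-type, duration, and discount features from a title."""
    discount_pct = None
    match = PRICE_RE.search(text)
    if match:
        sale, original = float(match.group(1)), float(match.group(2))
        if original > 0:
            discount_pct = round(100 * (1 - sale / original), 1)
    return {
        "bogo": bool(BOGO_RE.search(text)),
        "free_shipping": bool(FREE_SHIPPING_RE.search(text)),
        "limited_time": bool(LIMITED_TIME_RE.search(text)),
        "discount_pct": discount_pct,
    }


print(extract_features("Today only: $50, originally $100"))
```

Features like these could then be fed, together with the bookmark and comment counts of past deals, into a classifier that scores new deals as they are posted.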