The Science of Binge-watching: Netflix
Contributed by Wanda Wang. She is currently in the NYC Data Science Academy 12-week full-time Data Science Bootcamp program, taking place from April 11th to July 1st, 2016. This post is based on her third class project - Python Web Scraping (due in the 6th week of the program).
When were you "hooked" on House of Cards? The 3rd episode? When did you stop? (Was it when the pet dog passed?) In the current era of peak TV, shows on Netflix have permeated our everyday lives, taking up an ever-increasing share of our attention. When evaluating new content deals or show renewals, Netflix looks for high engagement - namely, the expected hours of viewing for each piece of content. The recommendation algorithm behind Netflix also uses Collaborative Filtering, which finds content suited to your tastes based on the tastes of a group of similar people. Since I believe my own tastes are quite eclectic, I hope to gather insights from my own streaming history.
Working in Python, I used Selenium for the web-scraping task. Selenium simulates user clicks, which let me move quickly through several web pages. The steps I followed are outlined below:
1) Login Screen - User Profile
2) Site Navigation - Viewing History
3) URL generation - per unique view
4) Retrieving unique data points
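As a rough illustration of steps 2 and 3, here is a minimal sketch, not the code used in the project. It assumes a Selenium-style `driver` that is already logged in (step 1 is omitted). The history path, the CSS selector, and the function name `unique_title_urls` are all hypothetical, since the post does not give the site's actual markup.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.netflix.com"
CSS = "css selector"  # the string value of Selenium's By.CSS_SELECTOR

# Assumption: the history path and selector below are placeholders;
# the real page structure is not described in the post.
HISTORY_URL = BASE_URL + "/viewingactivity"
HISTORY_LINK_SELECTOR = "li.retableRow a"


def unique_title_urls(driver, history_url=HISTORY_URL):
    """Open the viewing history and build one URL per unique view."""
    driver.get(history_url)  # step 2: navigate to the viewing history
    seen, urls = set(), []
    for link in driver.find_elements(CSS, HISTORY_LINK_SELECTOR):
        href = link.get_attribute("href")
        if not href:
            continue  # skip anchors without a target
        url = urljoin(BASE_URL, href)  # step 3: turn the link into a full URL
        if url not in seen:  # keep first-seen order, drop repeats
            seen.add(url)
            urls.append(url)
    return urls
```

With a real browser, `driver` would come from something like `selenium.webdriver.Chrome()`. Each returned URL could then be visited in turn to retrieve the data points in step 4.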
The questions I set out to answer:
How many shows did I binge-watch, and how often?
Which shows did I binge-watch the most?
Is there a particular genre or actor I stuck to?
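The first two questions can be approached with a simple count once the history is scraped. The sketch below is only one way to do it, under stated assumptions: a "binge" is defined here as three or more episodes of the same show on the same day (the post does not give a threshold), and the function names and sample records are made up for illustration.

```python
from collections import Counter
from datetime import date

BINGE_THRESHOLD = 3  # assumption: episodes of one show in one day


def binge_sessions(views, threshold=BINGE_THRESHOLD):
    """Map (day, show) to episode count for days that meet the threshold."""
    per_day = Counter((day, show) for day, show in views)
    return {key: n for key, n in per_day.items() if n >= threshold}


def binges_per_show(sessions):
    """Count how many binge days each show accounts for."""
    return Counter(show for (_day, show) in sessions)


# Made-up records in the form (viewing date, show title).
sample_views = (
    [(date(2016, 1, 2), "Show A")] * 4
    + [(date(2016, 1, 3), "Show A")] * 3
    + [(date(2016, 1, 3), "Show B")] * 2
    + [(date(2016, 2, 10), "Show B")] * 5
)

sessions = binge_sessions(sample_views)
print(len(sessions))                               # number of binge days
print(binges_per_show(sessions).most_common(1))    # most-binged show
```

The genre and actor question would need the extra data points gathered per title, which could be grouped the same way.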
Challenges included deciphering the site's modern markup - working out which HTML tags and attributes held each data point I wanted.
Over the course of the year, I initially indulged in many episodes of Friends. The Walking Dead was next. The difference in pace and genre between the two shows suggests that binge-watching does not depend on either, at least for me.
As cord-cutters become more reliant on streaming for their TV fix, we'll likely see more research studies examining the exact moment we become "hooked". Was it a particular genre, the number of episodes, or something unique to the content itself - an action scene, a storyline that held our attention?