Implementing Business Strategies from YouTube Videos

Posted on Nov 24, 2022

The global influence of YouTube

As a society, we're emotionally attached to YouTube and have allowed it to play a significant role in our lives. With 2+ billion monthly active users spanning a variety of different genres, YouTube has revolutionized the information age. Whether or not this platform is influential to you, it can't be denied that its global influence continuously challenges the way people think, act, and navigate throughout the world.

The goal of this project is to demonstrate YouTube's global influence and how it can be applied to different marketing strategies that can help elevate product sales.

About the data: What's trending on YouTube?

  • Kaggle
  • 11 datasets
  • 10 countries
  • 8 months
  • ~400K observations


Here are the various questions that will be answered throughout the post:

  • Which regions will provide the highest ROI for online advertisements?
  • What marketing tactics are most effective for consumers with different interests?
  • When are consumers more likely to shop online for products?
  • What infrastructure requisites are necessary for data center upkeep?
  • How do we use viral data to predict consumer buying trends?
  • Which products are consumers more likely to buy based on their interests?

Throughout this post, I'll be using the word consumer a lot.

  • Viewers are consumers.
  • Content is product.
  • Digital currency is time.

I. Effective Marketing

When we dine at the new restaurant in town or max out our credit cards during the Cyber Monday sale, we often form some sort of opinion about the product we just purchased. Despite having opinions, we're more likely to buy and forget than to share our thoughts with the community. In simpler terms, consumers are unlikely to leave feedback and reviews. This unfortunate tragedy of the commons helps neither the consumer nor the company, as valuable feedback can lead to crucial product improvements that enhance the consumer experience.

Therefore, I decided to analyze how likely a consumer is to engage with the product they purchase.

The graph above portrays a consumer's interaction with a product based on impression. By taking the median values of both likes and dislikes, we can see that consumers are about 20x more vocal about products they like. However, it isn't enough for us to just analyze how a consumer will engage with a product - we need to know how often a consumer will engage with the product.
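
A minimal sketch of the median comparison described above. The field names `likes` and `dislikes` and the sample rows are assumptions made for illustration; they are not taken from the project's code or data.

```python
from statistics import median

# Hypothetical sample rows; the field names `likes` and `dislikes` are assumed.
videos = [
    {"likes": 2000, "dislikes": 100},
    {"likes": 500, "dislikes": 20},
    {"likes": 12000, "dislikes": 700},
]


def like_dislike_ratio(rows):
    """Return median(likes) / median(dislikes) for the given rows."""
    median_likes = median(r["likes"] for r in rows)
    median_dislikes = median(r["dislikes"] for r in rows)
    return median_likes / median_dislikes


ratio = like_dislike_ratio(videos)

# Share of all like/dislike reactions that are dislikes, implied by the ratio.
dislike_share = 1 / (1 + ratio)
```

With a ratio near 20, the implied share of dislikes is 1 / 21, or roughly 5% of all like/dislike reactions.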

This graph explores total engagement (comments) vs. products purchased (views). The graph is also divided into different consumer interests (categories), which gives a more accurate picture of how a consumer will engage with the product they just purchased. Using the same methodology as before, we can see that the total consumer engagement ratio is around 0.05% per category.
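
One way to compute a per-category engagement ratio is sketched below. It assumes "the same methodology" means dividing the median comment count by the median view count within each category; the field names and sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows; `category`, `views`, and `comment_count` are assumed names.
rows = [
    {"category": "Music", "views": 100000, "comment_count": 40},
    {"category": "Music", "views": 200000, "comment_count": 100},
    {"category": "Music", "views": 300000, "comment_count": 210},
    {"category": "Sports", "views": 50000, "comment_count": 30},
]


def engagement_by_category(rows):
    """Return median comments as a percentage of median views, per category."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["category"]].append(r)

    result = {}
    for category, items in grouped.items():
        median_views = median(i["views"] for i in items)
        median_comments = median(i["comment_count"] for i in items)
        result[category] = 100 * median_comments / median_views
    return result


engagement = engagement_by_category(rows)
```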

What this essentially means is that every piece of feedback and every interaction with the product carries significant weight - especially when 5% of that weight is dislikes (based on the previous analysis). If you're thinking, "that ratio sounds ridiculously low", you might be right. However, a study shown here shows that a good engagement ratio is typically 0.04 - 16%, depending on the platform and product being sold.

So if consumer engagement is both valuable and scarce, how do we obtain more of it? By raking in more consumers, of course!

What you see above are the varying levels of consumer attention spans based on their different interests - interest spans, if you will. By analyzing the video title lengths against different consumer interests, we can analyze what it takes for us to steal the consumer's attention. For instance, from this data we can conclude that consumers have a higher attention span for movies than TV shows; thus, budgets for movies should be significantly higher than TV shows to ensure that their production quality is consistent throughout.

II. Energy Consumption

Now that we know the how, let's talk about the when. In this segment, I'll be correlating energy consumption to video usage. Analyzing energy usage can provide some insights into price optimization, hardware resiliency, and marketing tactics.

If we track weekly consumer energy usage, we see an expected trend: energy usage starts to increase when people are winding down from their busy day. A closer look shows that there's a drastic spike on Friday mornings as well. This will be relevant soon.
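
A weekly profile like this can be built by counting events per weekday and hour. The sketch below assumes each video carries an ISO-formatted timestamp and that activity counts stand in for "energy usage"; which timestamp the project actually used is not stated, so treat the input as hypothetical.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical timestamps standing in for video activity.
timestamps = [
    "2018-03-02T05:10:00",  # Friday
    "2018-03-02T05:45:00",  # Friday
    "2018-03-05T22:30:00",  # Monday
]


def weekday_hour_counts(stamps):
    """Count events per (weekday name, hour of day)."""
    counts = Counter()
    for stamp in stamps:
        dt = datetime.fromisoformat(stamp)
        counts[(WEEKDAYS[dt.weekday()], dt.hour)] += 1
    return counts


usage = weekday_hour_counts(timestamps)
```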

One key point to remember is that although this analysis tracks energy usage, it does so through trending videos, and consumers watching trending videos are more likely to be in a relaxed state of mind. This report shows that relaxed consumers are 12% more likely to buy products. Put simply, an ad is only as effective as its impact: calm the consumer, boost the sales.

Using the same type of time analysis, we can also track how consumers interact with products year-over-year.

As the data only spanned 8 months, I first had to normalize by the number of months of data available in each year. From this chart, one thing immediately sticks out: energy usage at night (10PM - 2AM) was consistently higher in 2018 than in 2017. This is significant because consumers during this time are in a heightened relaxed state.
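
The normalization step can be sketched as dividing each year's hourly counts by the number of distinct months observed in that year. This is one plausible reading of the step, not the project's confirmed method, and the timestamps are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical timestamps: two months of 2017 data, three months of 2018 data.
timestamps = [
    "2017-11-20T22:00:00", "2017-12-01T22:00:00",
    "2018-01-05T22:00:00", "2018-01-09T22:00:00",
    "2018-02-11T22:00:00",
    "2018-03-02T22:00:00", "2018-03-15T22:00:00", "2018-03-28T22:00:00",
]


def monthly_average_by_hour(stamps):
    """Return {(year, hour): events per observed month in that year}."""
    counts = defaultdict(int)    # (year, hour) -> number of events
    months = defaultdict(set)    # year -> distinct months with data
    for stamp in stamps:
        dt = datetime.fromisoformat(stamp)
        counts[(dt.year, dt.hour)] += 1
        months[dt.year].add(dt.month)
    return {key: n / len(months[key[0]]) for key, n in counts.items()}


normalized = monthly_average_by_hour(timestamps)
```

Dividing by observed months keeps a year with only two months of data comparable to a year with more.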

As consumers wind down to end their night, they face fewer distractions and are more likely to be engaged with their phones. A previous study has also shown that consumer engagement is more influential at night. Since screen time is positively correlated with online product sales, this time slot is optimal for influencing consumer buying decisions.

If we look at the two charts above, we can see a fairly similar trend: energy usage tends to increase around 3PM and again at 10PM. This could lead us to prematurely conclude that online advertisements should be increased during these time windows. However, we'd be foolish to develop and apply the same business strategy globally.

The music is nothing if the audience is deaf. What I mean by this is that marketing strategies applied to companies in the US should be reversed for Asian countries like Japan and Korea, where the energy peaks are flipped. We can also see additional energy usage trends: India is more consistent, Mexico is more significant, and France is more emphasized.

Another thing we can gather from this chart is additional insight into the higher peak on Friday mornings at ~5AM. Asian countries, like Japan and Korea, see a significant boost in nightlife on Friday nights, which contributes to the ~5AM peak as consumers get up to travel, spend, and enjoy the weekend.

From this data, we can explore how to effectively reach our target audience. For instance, B2B companies can potentially charge more for ads that run during each country's peak times, when ads are likely to drive more sales than during the rest of the day.

We can also capture infrastructure requisites from this data. I mentioned earlier that energy usage in India is more consistent than in other countries. What this can teach us is that hardware sold and built in India - phones, TVs, computers, data centers - should be much more resilient than in other countries. This is mainly because energy usage is higher and more consistent, so it carries a greater risk of hardware overheating. Runner-up countries for hardware resiliency should be France and Japan, whose wind-down peaks are much higher than those of other countries.

But what type of product is most effective?

III. Exploratory Analysis

Understanding how and when to reel in a consumer is only half the battle; the other half is the search for the perfect product.

The quest for the perfect product begins by analyzing the top consumer interests, by country. For instance, this data tells us that the Entertainment Industry dominates throughout the world, except in Great Britain. As consumers have a higher demand for music products in GB, we can potentially see opportunities to increase record sales, concert frequencies, and product margins for music platforms like Spotify, SoundCloud, and YouTube.

We can also see that consumer demand for sports is much higher in France, Japan, and Mexico. Therefore, we can expect higher margins for sports merchandise, such as FIFA jerseys, sports equipment, and tournament stadiums.

To effectively boost our sales, however, we need to take advantage of the most powerful tool in product marketing: the power of word-of-mouth.

The graph above shows us the different reaction times of a trending video product. We can see here that information travels differently based on a consumer's interest. For instance, we can expect that music labels will take longer to reach the Billboard charts than movies take to reach the box office. Therefore, with the same advertising budget, we can expect Netflix to increase advertising frequency in the beginning, whereas music labels should conserve budget for a longer advertising duration.
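
The post does not define "reaction time" precisely. The sketch below assumes it means the number of days between a video being published and it appearing on the trending list, summarized by the median per category; the field names and dates are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical rows; `published` and `trending` are assumed ISO date fields.
rows = [
    {"category": "Music", "published": "2018-01-01", "trending": "2018-01-06"},
    {"category": "Music", "published": "2018-01-10", "trending": "2018-01-13"},
    {"category": "Movies", "published": "2018-02-01", "trending": "2018-02-02"},
]


def median_days_to_trend(rows):
    """Return the median publish-to-trending lag in days, per category."""
    lags = defaultdict(list)
    for r in rows:
        delta = date.fromisoformat(r["trending"]) - date.fromisoformat(r["published"])
        lags[r["category"]].append(delta.days)
    return {category: median(days) for category, days in lags.items()}


lag_by_category = median_days_to_trend(rows)
```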

So now that we're all experts at the speed of information, how do we increase the velocity of that information? As it turns out, consumers are more likely to buy a product when it's recommended through word-of-mouth, even if it is from a stranger (cite here). This leads me to the next phase: deconstructing consumer spending habits.

The three-fold approach is this:

  1. Identify which trending videos contain Amazon links in their descriptions.
  2. Aggregate count on the categories containing the Amazon links.
  3. Scrape the Amazon links to extract the product category.
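
The first two steps above can be sketched with a regular expression and a counter. The URL pattern (covering `amazon.<tld>` and the `amzn.to` short domain), the field names, and the sample rows are all assumptions; the project's actual matching rule may differ.

```python
import re
from collections import Counter

# Assumed pattern: full Amazon domains plus the amzn.to short-link domain.
AMAZON_LINK = re.compile(
    r"https?://(?:www\.)?(?:amazon\.[a-z.]{2,6}|amzn\.to)/\S+",
    re.IGNORECASE,
)

# Hypothetical rows; `category` and `description` are assumed field names.
videos = [
    {"category": "Entertainment", "description": "Gear: https://amzn.to/2abcXYZ"},
    {"category": "Entertainment", "description": "Buy at https://www.amazon.com/dp/B000000000 today"},
    {"category": "Music", "description": "No links here"},
    {"category": "Sports", "description": "Shoes https://www.amazon.co.uk/dp/B000000001"},
]


def amazon_links(text):
    """Return every Amazon-looking URL found in a description."""
    return AMAZON_LINK.findall(text)


def count_by_category(rows):
    """Count videos per category whose description has at least one Amazon link."""
    counts = Counter()
    for r in rows:
        if amazon_links(r["description"]):
            counts[r["category"]] += 1
    return counts


link_counts = count_by_category(videos)
```

The links returned by `amazon_links` are what the third step would then fetch and parse.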


From the graph above, we can see that word-of-mouth travels most dominantly within the entertainment sector. Because of this, we can conclude that companies can boost their product sales by partnering with influencers within the entertainment industry. However, we'd need to take a closer look at what type of products influencers sell, as they correlate quite heavily with the influencer's audience.

This can be done by scraping the ~10,000 unique Amazon links using BeautifulSoup. With this method, I was able to get a good grasp of which product categories are most dominant in affiliate marketing:

  • 50% Electronics
  • 17% Home Goods
  • 8% Beauty
  • 8% DIY
  • 17% Other
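
Once each link has been mapped to a product category, a breakdown like the one above is a simple tally. The sketch below assumes the category labels have already been extracted; the label list is purely illustrative and is not the project's scraped data.

```python
from collections import Counter

# Illustrative labels only; NOT the project's scraped results.
product_categories = (
    ["Electronics"] * 6
    + ["Home Goods"] * 2
    + ["Beauty", "DIY"]
    + ["Other"] * 2
)


def category_shares(labels):
    """Return each category's share of all labels, as a rounded percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {category: round(100 * n / total) for category, n in counts.items()}


shares = category_shares(product_categories)
```

Rounding each share independently means the percentages are not guaranteed to sum to exactly 100 in general.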

From this data, we can see that:

  1. Electronics is a top performer in affiliate marketing.
  2. Entertainment is a top performer in affiliate marketing partners.


Using trending videos on YouTube to implement business strategies was no easy feat, but it all boils down to understanding the products that drive consumers. To summarize, here's what we covered, but with a slightly different take:

  • Effective Marketing: Analyzing consumer interactions to understand what they want.
  • Energy Consumption: Evaluating when consumers are more likely to purchase products.
  • Exploratory Analysis: Optimizing how the ROI and product margin can increase YOY.

Future work

While the dataset was fascinating, its analysis was limited by time. Additional questions I would like to answer are:

  • What impact did COVID have on consumer demand? Find additional datasets/observations to support the trend hypothesis.
  • What's the dominating marketplace platform for each category? Scrape more than just Amazon links: eBay, StockX, Etsy, etc.
  • How do interests vary across platforms and age groups? Compare trends from parallel social media platforms, including TikTok and Instagram.


Check out the data for yourself

Find the project's GitHub Link here.

About Author

Daniel Setiawan

Hi, I'm Daniel! I currently work as an R&D Engineer at a startup in Berkeley, CA. In my role, I interface with various electronics (RF/DC) and code (Python) to analyze and improve product performance. I’m intrigued by science...