Data Analysis of the TV Market with API Data from BestBuy
The skills I demonstrated here can be learned by taking the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
Today lots of people are holed up in their own rooms, streaming Netflix or Disney+ on their phones. But there are still many of us who watch TV shows on an actual television. The TV market, despite existing for nearly a century, is thriving. Last year (2020) was a record year in terms of total market revenue, and the market is forecast to maintain a 10% annual growth rate through at least 2024. However, the market is fiercely competitive, with a few brands from different countries vying for market-share dominance. New entrants from China (e.g. TCL, Hisense, Xiaomi) are also quickly gaining ground.
BestBuy is a specialty retailer focused on consumer electronics, and its online shop is considered one of the best platforms for buying TVs (behind Amazon and Walmart, neither of which specializes in electronics). Using BestBuy.com's online retail data as a proxy for today’s TV marketplace, I sought insight into the competitive landscape and customer purchasing behavior. The insight should prove useful to TV manufacturers in their quest for market share.
Raw data was obtained through BestBuy.com's API and covered 465 TV models with 34 attributes each. After cleaning the data, i.e., eliminating the models that were available neither in-store nor online, the resulting dataset contained 387 rows. For additional detail on data munging (such as determining outliers, binning, and calculating the TV “bezel”) and data harmonization (extracting warranty length and converting it to years), please refer to my GitHub.
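As an illustration of the warranty harmonization step mentioned above, here is a minimal sketch using only the Python standard library. The raw warranty strings, the helper name `warranty_years`, and the unit conversions (12 months or 365 days per year) are assumptions for illustration; the actual field format returned by the API and the cleaning code in the repository may differ.

```python
import re

# Assumed conversion factors; not taken from the original analysis.
_UNIT_IN_YEARS = {"year": 1.0, "month": 1.0 / 12, "day": 1.0 / 365}


def warranty_years(text):
    """Extract a warranty length from free text and express it in years.

    Returns None when no "<number> <unit>" pattern is found.
    """
    match = re.search(r"(\d+(?:\.\d+)?)\s*-?\s*(year|month|day)", text.lower())
    if match is None:
        return None
    value, unit = match.groups()
    return float(value) * _UNIT_IN_YEARS[unit]


# Hypothetical raw values, only to show the behavior.
for raw in ["1 Year", "2-year limited", "90 days", "None"]:
    print(raw, "->", warranty_years(raw))
```

Returning `None` for unparseable strings keeps missing warranties distinguishable from zero-length ones.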
Televisions are categorized into four size classes: extra-large, large, mid, and small. Today’s TVs all utilize some form of light-emitting diode (LED) display, with about a third using ultra-high definition output technology, such as OLED or QLED. Also, 91% of all TVs on the market are smart capable (meaning they have internet connectivity and support for a range of apps, such as streaming services and even internet browsers).
| TV Size Class | XLARGE | LARGE | MID | SMALL |
|---|---|---|---|---|
| Measurement (inches) | > 60" | 45"-60" | 32"-45" | < 31" |
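The size classes in the table can be expressed as a small binning function. This is a sketch, not the code used in the analysis: the function name `size_class` is hypothetical, and the handling of boundary values (exactly 45" or 60", and anything between 31" and 32") is an assumption, since the table leaves those edges ambiguous.

```python
def size_class(diagonal_inches):
    """Map a screen diagonal in inches to one of the four size classes.

    Boundary handling is an assumption: values of exactly 60 go to LARGE,
    exactly 45 to LARGE, and everything below 32 to SMALL.
    """
    if diagonal_inches > 60:
        return "XLARGE"
    if diagonal_inches >= 45:
        return "LARGE"
    if diagonal_inches >= 32:
        return "MID"
    return "SMALL"


# Made-up screen sizes, only to show the mapping.
sizes = [24, 32, 43, 55, 65, 75]
classes = [size_class(s) for s in sizes]
print(list(zip(sizes, classes)))
```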
Signs of Market Segmentation
The stacked bar charts below show, from left to right, each brand's share of the TV models on offer, each brand's share of total review counts, and review counts by brand separated by size class. Actual sales figures are not available, so I use review counts as a proxy for sales, especially since BestBuy boasts that they are “verified purchase reviews.” The three leading brands, Samsung, LG, and Sony, account for over 60% of the TVs available on the market, and they also account for 60% of all sales as measured by this proxy. In the size-segmented figures, however, there are stark differences. The same three brands have over 80% of all extra-large TV sales, but less than 25% of small TV sales.
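The per-size-class share of the three leading brands, with review counts standing in for sales, could be computed along these lines. The rows below are made-up numbers for illustration only, not data from the analysis, and `big_three_share` is a hypothetical helper name.

```python
from collections import defaultdict

BIG_THREE = {"Samsung", "LG", "Sony"}

# Made-up rows: (brand, size_class, review_count).
rows = [
    ("Samsung", "XLARGE", 500),
    ("LG", "XLARGE", 300),
    ("TCL", "XLARGE", 200),
    ("Samsung", "SMALL", 100),
    ("Insignia", "SMALL", 600),
    ("TCL", "SMALL", 300),
]


def big_three_share(rows, leading_brands=BIG_THREE):
    """Return {size_class: share of review counts held by leading_brands}."""
    total = defaultdict(int)
    leading = defaultdict(int)
    for brand, size, reviews in rows:
        total[size] += reviews
        if brand in leading_brands:
            leading[size] += reviews
    # Size classes with zero reviews are skipped to avoid dividing by zero.
    return {size: leading[size] / total[size] for size in total if total[size]}


shares = big_three_share(rows)
for size, share in shares.items():
    print(f"{size}: {share:.0%}")
```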
TVs today come in a wide range of prices, and we can see additional signs of market segmentation in a strip plot of price categorized by brand. The three leading brands mentioned previously price their products on the high end, with their median prices and a large portion of their interquartile ranges falling well above the median price of all TVs. The remaining brands, on the other hand, price most of their products below the median line.
From both the differences in the distribution of sales in each size class and the divergence in pricing strategy among brands, we can discern a market that is segmented into two halves: (1) more expensive premium TVs of a larger size and better LED tech, and (2) non-premium TVs of mid/small size and perhaps basic LED tech.
Finding Correlates with Review Score and Review Count
Moving forward with the market analysis, we plot a correlation heatmap with review score and review count as target attributes. Unfortunately, no attribute showed a notable positive or negative correlation with either target, with the exception of a -0.51 coefficient between price and review score for mid-size TVs. This suggests buyers of those TVs may be price sensitive, something we will investigate further in the next section.
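For readers who want to reproduce a single cell of such a heatmap, the sketch below computes a correlation coefficient between price and review score within one size class. It assumes the coefficient is Pearson's r (the post does not say which coefficient was used), and the price and score pairs are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up (price, review_score) pairs for one size class.
mid_size = [(200, 4.7), (300, 4.6), (400, 4.4), (500, 4.5), (600, 4.2)]
prices, scores = zip(*mid_size)
r = pearson(prices, scores)
print(f"price vs. review score: r = {r:.2f}")
```

A negative value of `r` means higher-priced models in the sample tend to receive lower review scores.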
Investigating Consumer Purchasing Behavior
For each product webpage, BestBuy.com recommends other products in a list called “Ultimately Bought” (UB). It is a list of the top 10 products purchased by other shoppers who also viewed the original product. Fortunately, this is offered as an endpoint on the API, so a second round of requests was made with the SKU code for each TV, returning 10 additional “UB” product SKUs per TV.
With the additional data, two aspects of consumer buying behavior could be examined, namely price sensitivity and brand loyalty.
Price Sensitivity or Shopper Price “Stickiness”
The difference between the price of the original TV and the average price of its UB TVs was calculated for each model, and the density histogram of these differences is displayed below.
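The price-difference calculation can be sketched as follows. The SKUs and prices are made up, `ub_price_difference` is a hypothetical helper, and the sign convention (average UB price minus viewed price, so a positive value means shoppers ultimately bought something more expensive) is an assumption. Skipping UB SKUs that have no price in the dataset is also an assumption about how gaps might be handled.

```python
from statistics import mean

# Made-up data: price per SKU, and UB SKUs per viewed SKU.
price_of = {"A": 500.0, "B": 550.0, "C": 650.0, "D": 300.0}
ultimately_bought = {"A": ["B", "C"], "D": ["D", "X"]}  # "X" has no known price


def ub_price_difference(viewed_sku, ub_skus, price_of):
    """Average UB price minus the viewed TV's price, or None if no UB price is known."""
    known = [price_of[sku] for sku in ub_skus if sku in price_of]
    if not known:
        return None
    return mean(known) - price_of[viewed_sku]


diffs = {
    sku: ub_price_difference(sku, ubs, price_of)
    for sku, ubs in ultimately_bought.items()
}
print(diffs)
```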
Most shoppers did not stray far from the original price point. However, when the market is segmented into premium and budget (non-premium) TVs, stark differences between customers can be observed. Premium TVs are represented in the figure below in yellow, and their price differences did not cluster near 0. Premium TV buyers often ultimately bought a higher priced TV.
Budget TV buyers, on the other hand, exhibited even more price “stickiness” between the viewed and UB products, and the median viewed TV price was almost equal to the median UB TV price. This price sensitivity of budget buyers may explain the negative correlation between review score and price for mid-sized televisions seen previously.
Customer Brand Loyalty
For each originating TV, the number of ultimately bought TVs of the same brand was counted, and the counts were graphed as a boxplot by brand to get a sense of the degree of customer loyalty. The three dominant brands exhibited relatively high customer loyalty. The interquartile ranges for both Sony and Samsung sit above the average of all brands, while LG sits on the average and all remaining brands fall short.
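The same-brand tally described above might look like the sketch below. The SKUs and brand assignments are made up, and `same_brand_count` is a hypothetical helper name. UB SKUs with no known brand are simply not counted as matches, which is an assumption.

```python
# Made-up data: brand per SKU, and UB SKUs per viewed SKU.
brand_of = {"S1": "Sony", "S2": "Sony", "L1": "LG", "T1": "TCL", "T2": "TCL"}
ultimately_bought = {
    "S1": ["S1", "S2", "L1"],
    "T1": ["S2", "L1", "T2"],
}


def same_brand_count(viewed_sku, ub_skus, brand_of):
    """Count UB products that share the viewed product's brand."""
    viewed_brand = brand_of[viewed_sku]
    return sum(1 for sku in ub_skus if brand_of.get(sku) == viewed_brand)


loyalty = {
    sku: same_brand_count(sku, ubs, brand_of)
    for sku, ubs in ultimately_bought.items()
}
print(loyalty)
```

Grouping these per-SKU counts by the viewed product's brand gives the distributions that a boxplot would summarize.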
Sony, Samsung, and LG price a large portion of their TV models in the premium TV range. Accordingly, the data suggests premium TV customers favor a particular brand while budget TV customers are more willing to switch during browsing.
The data showed clear signs of market segmentation into premium and budget TVs. Premium TV customers are not sensitive to price, often buying at a higher price than that of the TV originally viewed, and they tend to stick with the brand. Manufacturers of premium televisions should, therefore, focus on marketing to boost brand recognition and improve customer perception.
Budget TV customers exhibited price “stickiness,” and review scores of mid-sized televisions were negatively correlated with price. Consequently, budget TV sellers should focus on cutting costs and pricing competitively.
BestBuy’s online retail shop, although comprehensive, may not represent the market as a whole. Xiaomi’s televisions are not currently sold on BestBuy.com. BestBuy also sells its own Insignia brand of TVs, which could have benefited from preferential promotion on the website. The count of reviews for a product was used as a proxy for sales, which may be inaccurate and biased due to possible review filtering practices by the site.
As next steps, pricing and customer purchasing data can be cross-referenced with data from Amazon, Walmart, B&H, and other retailers to increase the robustness of the study and reduce any bias toward BestBuy’s own “Insignia” TV brand. And although customer review scores did not exhibit much correlation with most attributes in this analysis, that does not mean they fail to represent customer satisfaction. NLP sentiment analysis can be performed on the actual text of the reviews.