Data Analysis of Newegg Computer Monitors
I love computers. I have no problem sitting at my computer all day long working on various projects with R and Python. I love it. I like experimenting and learning new things. Since I use my computer a lot, I knew I would need a good one, so I bought a gaming laptop. But what is a gaming laptop without a great monitor? I picked the monitor I have because of features like refresh rate, price, and review rating. My interest in computers led me to do some data analysis on computer monitors on Newegg.
The goal of this project was to figure out which features of a computer monitor have the most positive effect on its price and review rating. In addition, I wanted to draw some business insights from my discoveries and to give some recommendations to businesses based on those insights.
In order to achieve these goals, I needed data. I used Scrapy, a free and open-source web-crawling framework, to scrape dozens of pages of computer monitor listings. It took approximately 70 minutes to get 1052 observations. For each monitor I collected the following features:
- Product Name
- Price (US $)
- Review Count
- Review Rating
- Refresh Rate (Hz)
- Response Time (ms)
- Screen Curvature
- Mount Compatibility
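Scraped fields usually arrive as text, so they need to be converted to numbers before any statistics can be computed. Here is a minimal sketch of that step in Python. The raw formats (such as "$1,249.99" or "144 Hz"), the dictionary keys, and the helpers `parse_number` and `clean_record` are my assumptions for illustration; they are not the actual scraper output or code from this project.

```python
import re
from typing import Optional


def parse_number(text: Optional[str]) -> Optional[float]:
    """Return the first number found in text, or None if there is none."""
    if not text:
        return None
    # Drop thousands separators so "1,249.99" is read as one number.
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None


def clean_record(raw: dict) -> dict:
    """Convert one raw scraped record (all strings) into typed values.

    The keys used here are hypothetical names for the features listed above.
    """
    return {
        "product_name": (raw.get("product_name") or "").strip(),
        "price_usd": parse_number(raw.get("price")),
        "review_count": parse_number(raw.get("review_count")),
        "review_rating": parse_number(raw.get("review_rating")),
        "refresh_rate_hz": parse_number(raw.get("refresh_rate")),
        "response_time_ms": parse_number(raw.get("response_time")),
    }


# A made-up record, only to show the shape of the input and output.
example = clean_record({
    "product_name": "  Example 27-inch Monitor ",
    "price": "$1,249.99",
    "review_count": "(12)",
    "review_rating": "4 out of 5",
    "refresh_rate": "144 Hz",
    "response_time": None,
})
```

Missing values are kept as `None` instead of being replaced with zero, so they do not distort averages later on.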
Let's take a closer look at some of these features. First, let's discuss some summary statistics about the price.
As you can see, the range of monitor prices is pretty large. Because of this, the distribution is skewed to the right, and the most expensive monitors pull the average price upward. Some of the most expensive monitors were 3K, 4K, or 5K monitors; some of them even had curved screens, which are a fairly new development. This is one of the reasons why the range of prices was so great.
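To make the effect of a few expensive monitors concrete, here is a small sketch using Python's standard `statistics` module. The prices are hypothetical numbers I made up for illustration, not values from the scraped dataset, and `summarize` is just a helper name chosen for this example.

```python
import statistics

# Hypothetical prices in US dollars, NOT taken from the Newegg data.
prices = [120, 140, 150, 170, 180, 200, 230, 1500]


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


summary = summarize(prices)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A single $1500 monitor is enough to push the mean (336.25) far above the median (175), which is why the median is often the better description of a "typical" price when the data has outliers on the right.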
Now let's look at some summary statistics for Review Count.
Once again, the range of the data is so great because of the outliers on the right. There were a handful of monitors that had almost 1000 reviews, but, as you can see from the median, most monitors did not get very many reviews. This was a very interesting finding to me because it shows that people do not leave reviews very often for Newegg monitors. One would think that companies would want people to leave reviews for their products. If the review is good, it serves as a kind of "social proof" for the quality of the product. If the review is bad, it can still be helpful to the business because it alerts them that they need to improve their product. Therefore, my first business insight is that manufacturers need to give customers some incentive to leave a review on their products. Is a 1% discount a worthy incentive?
We have just seen that not many people leave reviews on the monitors they buy, but when people do leave reviews, how do the monitors fare?
Wow. This surprised me when I first saw it. Most people don't leave reviews on their monitors, but when they do, they usually leave fairly high ratings! At first glance, this appears to be true, but if you think about it for a bit, you might realize that this graph could be misleading. How? Well, imagine you had a technology company that built computer monitors and sold them on Newegg. If you had a monitor that was getting a large number of negative reviews and a small number of positive reviews, what would you do? Would you let the monitor stay on the site and make your company look bad? Or would you take the monitor off of Newegg and go back to the drawing board?
I would expect most manufacturers to take a badly performing monitor off of the site to protect their reputation, while the monitors that are performing well are left on the site. This is the main reason why I said this graphic is potentially misleading.
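This kind of effect is easy to demonstrate with a toy example. The sketch below, in Python, uses ratings I invented for illustration and assumes a simple rule where any monitor rated below 3 stars gets delisted. Neither the numbers nor the rule come from the Newegg data.

```python
import statistics

# Hypothetical average ratings for nine monitors, NOT real data.
all_ratings = [1, 2, 2, 3, 4, 4, 5, 5, 5]

# Assumed delisting rule: monitors rated below 3 stars are pulled from the site.
DELIST_BELOW = 3
visible_ratings = [r for r in all_ratings if r >= DELIST_BELOW]

mean_all = statistics.mean(all_ratings)
mean_visible = statistics.mean(visible_ratings)

print(f"Mean rating of every monitor ever listed: {mean_all:.2f}")
print(f"Mean rating of monitors still on the site: {mean_visible:.2f}")
```

Someone who only scrapes the site sees the second number, which is higher than the first simply because the worst products are no longer there to be counted.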
We are trying to find out which of the aforementioned features has the greatest impact on the price of a monitor and its review rating. One basic way of discovering how strongly features are related to each other is to examine a correlation matrix like the one below.
That might be a lot to take in, so let's zoom in on the price and review rating to see what we can discover.
Strong correlations are generally less than -0.5 or greater than 0.5. As you can see, there don't appear to be any strong correlations. Hmmm... let's take a look at price and review rating in a different way and see what we can find out...
You might look at this boxplot and wonder if there is a positive correlation between price and review rating. According to the correlation coefficient, you would be right, but the coefficient is very close to zero, so the relationship is too weak to tell us much.
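For readers who want to see what goes into a correlation coefficient, here is a minimal sketch of the Pearson correlation written in plain Python, along with the 0.5 rule of thumb mentioned above. The function names `pearson` and `is_strong` are my own, and the sample numbers are made up; this is not the code that produced the matrix in this article.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def is_strong(r, threshold=0.5):
    """Rule of thumb from the text: strong if r < -0.5 or r > 0.5."""
    return abs(r) > threshold


# Hypothetical prices and ratings, NOT taken from the Newegg data.
prices = [150, 200, 250, 300, 900]
ratings = [4, 5, 3, 4, 4]
r = pearson(prices, ratings)
print(f"r = {r:.3f}, strong: {is_strong(r)}")
```

A correlation matrix is just this calculation repeated for every pair of numeric features.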
Remember when we looked at the distribution of review count and we saw that there were a few outliers with close to 1000 reviews? Let's take a look at them and see what kind of information we can glean.
Here we have a chart that shows us the top 5 monitors in order of review count. Notice anything in common among these 5 monitors? They are all made by ASUS! This is quite an accomplishment because one would expect the ratings to move toward the middle of the spectrum as the review count increases (since there would presumably be both satisfied and dissatisfied customers, and one would expect the two to even out), but this is not the case with ASUS. They have 4 out of 5 stars for each of their top 5 monitors by review count. Does ASUS have very many 5-star monitors?
The above chart gives the top four monitors with 5-star ratings and orders them by review count. As you can see, ASUS has two of the top four! And look at the prices of the ASUS monitors versus the others. ASUS charges way less than its competitors and produces the same quality. This is why I say that ASUS is doing more with less; they are charging lower prices for comparable quality. Their low prices and high quality may have something to do with why so many of their customers leave reviews. These reviews get read and encourage others to buy. Then these customers leave reviews and the cycle keeps going on and on.
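Rankings like the two charts above come down to a filter followed by a sort. Here is a short sketch in Python; the function name `top_by_review_count`, the dictionary keys, and the rows (with placeholder product names) are all hypothetical and only illustrate the idea.

```python
def top_by_review_count(monitors, n, min_rating=0):
    """Return the n monitors with the most reviews among those rated at least min_rating."""
    eligible = [m for m in monitors if m["rating"] >= min_rating]
    return sorted(eligible, key=lambda m: m["review_count"], reverse=True)[:n]


# Made-up rows with placeholder names, NOT real Newegg listings.
rows = [
    {"name": "Monitor A", "rating": 5, "review_count": 40},
    {"name": "Monitor B", "rating": 4, "review_count": 900},
    {"name": "Monitor C", "rating": 5, "review_count": 120},
    {"name": "Monitor D", "rating": 3, "review_count": 10},
    {"name": "Monitor E", "rating": 5, "review_count": 75},
]

most_reviewed = top_by_review_count(rows, n=2)
most_reviewed_five_star = top_by_review_count(rows, n=2, min_rating=5)

print([m["name"] for m in most_reviewed])
print([m["name"] for m in most_reviewed_five_star])
```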
Here are some business insights that I gathered from my analysis:
- Do more with less: ASUS's best monitors were also its cheapest monitors. This means that a higher price does not always imply higher quality.
- Give some incentive to customers for writing reviews: Review counts were low, but scores were high. Good reviews are "social proof" of the quality of your product. Bad reviews are still useful to the manufacturer because they let it know that it needs to go back to the drawing board.
Thank you for reading this article about my data analysis on Newegg computer monitors. I really enjoyed doing this research and I hope you enjoyed reading about it!
The skills I demoed here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.