Data Analysis of Newegg Computer Monitors

Posted on Nov 9, 2020

I love computers. I have no problem sitting at my computer all day long working on various projects with R and Python. I like experimenting and learning new things. Since I use my computer a lot, I knew I would need a good one, so I bought a gaming laptop. But what is a gaming laptop without a great monitor? I picked the monitor I have because of various specifications like refresh rate, price, review rating, etc. My interest in computers led me to do some data analysis on computer monitors on Newegg.

The goal of this project was to figure out which features of a computer monitor have the most positive effect on its price and review rating. In addition, I wanted to draw some business insights from my discoveries and give some recommendations to businesses based on those insights.

In order to achieve these goals, I needed data. I used a free and open-source web-crawling framework called Scrapy to scrape dozens of pages of computer monitors. It took approximately 70 minutes to collect 1052 observations. I gathered the following features for each monitor:

  1. Product Name
  2. Price (US $)
  3. Review Count
  4. Review Rating
  5. Resolution
  6. Refresh Rate (Hz)
  7. Response Time (ms)
  8. Flicker-Free
  9. Screen Curvature
  10. Mount Compatibility
  11. Adjustable
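
Scraped values usually arrive as text, so numeric features such as price, refresh rate, and response time need to be converted before any statistics can be computed. The following is a minimal sketch of that cleaning step, not the code used in this project: the helper `parse_number`, the field names, and the raw strings are all hypothetical and only illustrate the idea.

```python
import re
from typing import Optional


def parse_number(raw: Optional[str]) -> Optional[float]:
    """Return the first number found in a scraped string, or None if there is none."""
    if raw is None:
        return None
    # Match digits with optional thousands separators and an optional decimal part.
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        return None
    return float(match.group().replace(",", ""))


# Hypothetical raw record (made-up values, not taken from the scraped dataset).
raw_item = {
    "price": "$1,299.99",
    "review_count": "(87)",
    "refresh_rate": "144 Hz",
    "response_time": "1 ms",
}

clean_item = {field: parse_number(value) for field, value in raw_item.items()}
print(clean_item)
```

Returning `None` for missing or non-numeric values keeps incomplete listings in the dataset so they can be filtered out later, rather than failing during the crawl.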

Let's take a closer look at some of these features. First, let's discuss some summary statistics about the price.


As you can see, the range of monitor prices is quite wide. Because of this, the distribution is skewed to the right, which pulls the average price upward. Some of the most expensive monitors were 3K, 4K, or 5K monitors; some of them even had curved screens, which is a fairly new development. This is one of the reasons why the range of prices was so great.
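
To see how a few expensive items pull the average to the right, here is a small sketch using Python's standard `statistics` module. The prices are made up for illustration and are not the scraped Newegg data.

```python
from statistics import mean, median

# Made-up prices in US dollars, for illustration only (not the scraped data).
# One expensive monitor sits far to the right of the rest.
prices = [120, 150, 180, 200, 250, 300, 1500]

avg_price = mean(prices)
mid_price = median(prices)

print(f"mean = {avg_price:.2f}, median = {mid_price:.2f}")
```

With a single expensive outlier the mean lands well above the median, which is why the median is often the better description of a "typical" price in skewed data.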

Now let's look at some summary statistics for Review Count.


Once again, the range of the data is so great because of the outliers on the right. There were a handful of monitors that had almost 1000 reviews, but, as you can see from the median, most monitors did not get very many reviews. This was a very interesting finding to me because it shows that people do not leave reviews very often for Newegg monitors. One would think that companies would want people to leave reviews for their products. If the review is good, it acts as a kind of "social proof" of the quality of the product. If the review is bad, it could still be helpful to the business because it would alert them that they need to improve their product. Therefore, my first business insight is that manufacturers need to give customers some incentive to leave a review on their products. Is a 1% discount a worthy incentive?

We have just seen that not many people leave reviews on the monitors they buy, but for those who do leave reviews, how do the monitors fare?


Wow. This surprised me when I first saw it. Most people don't leave reviews on their monitors, but when they do, they usually leave fairly high ratings! At first glance, this appears to be true, but if you think about it for a bit, you might realize that this graph could be misleading. How? Well, imagine you had a technology company that built computer monitors and sold them on Newegg. If you had a monitor that was getting a large number of negative reviews and a small number of positive reviews, what would you do? Would you let the monitor stay on the site and make your company look bad? Or would you take the monitor off of Newegg and go back to the drawing board?

Most manufacturers would probably take a poorly performing monitor off of the site to protect their reputation, so the monitors that are performing well are the ones left on the site. This is the main reason why I said this graphic is potentially misleading.

We are trying to find out which of the aforementioned features has the greatest impact on the price of a monitor and its review rating. One of the basic ways of discovering how strongly features are related to each other is to examine a correlation matrix like the one below.


Each number represents the strength and direction of the linear relationship between a pair of features, on a scale from -1 to 1. The most important features here are the price and review rating.

That might be a lot to take in, so let's zoom in on the price and review rating to see what we can discover.


There don't seem to be any strong correlations here.

Strong correlations are generally those less than -0.5 or greater than 0.5. As you can see, there don't appear to be any strong correlations here. Hmmm... let's take a look at price and review rating in a different way and see what we can find out...
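
For readers who want to see what sits behind each cell of a correlation matrix, here is a minimal sketch of the Pearson correlation coefficient together with the 0.5 rule of thumb mentioned above. The function names `pearson` and `is_strong` and the sample values are assumptions made for illustration; they are not the scraped data, and the result says nothing about the real monitors.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def is_strong(r, threshold=0.5):
    """Rule of thumb from the text: strong if r is below -0.5 or above 0.5."""
    return abs(r) > threshold


# Made-up values for two features, for illustration only.
feature_a = [60, 75, 144, 165, 240]
feature_b = [100, 130, 250, 300, 450]

r = pearson(feature_a, feature_b)
print(f"r = {r:.3f}, strong: {is_strong(r)}")
```

A correlation matrix is this same calculation repeated for every pair of numeric features.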


Is there a positive correlation here?

You might look at this boxplot and wonder if there is a positive correlation between price and review rating. According to the correlation coefficient, you would be right, but the coefficient is very close to zero, so it does not tell us much.

Remember when we looked at the distribution of review count and we saw that there were a few outliers with close to 1000 reviews? Let's take a look at them and see what kind of information we can glean.


ASUS has the top 5 locked down.

Here we have a chart that shows us the top 5 monitors in order of review count. Notice anything common among these 5 monitors? They are all made by ASUS! This is quite an accomplishment because one would expect the reviews to move toward the middle of the spectrum as the review count increases (since there would presumably be both satisfied customers and dissatisfied customers and one would expect the two to even out), but this is not the case with ASUS. They have 4/5 stars for each of their top 5 monitors by review count. Does ASUS have very many 5-star monitors?


ASUS has 2 of the top 4 in this chart!

The above chart gives the top four monitors with 5-star ratings and orders them by review count. As you can see, ASUS has two of the top four! And look at the prices of the ASUS monitors versus the others. ASUS charges far less than its competitors for monitors with the same 5-star rating. This is why I say that ASUS is doing more with less; they are charging lower prices for comparable quality. Their low prices and high quality may have something to do with why so many of their customers leave reviews. These reviews get read and encourage others to buy. Then these customers leave reviews and the cycle keeps going on and on.
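
A ranking like the one described above comes down to a filter followed by a sort. The sketch below shows one way to do it in plain Python; the records, the field names, and the helper `top_five_star` are hypothetical placeholders rather than the actual dataset or code.

```python
# Placeholder records for illustration only (not real Newegg listings).
monitors = [
    {"name": "Monitor A", "rating": 5, "review_count": 40},
    {"name": "Monitor B", "rating": 4, "review_count": 900},
    {"name": "Monitor C", "rating": 5, "review_count": 120},
    {"name": "Monitor D", "rating": 5, "review_count": 15},
]


def top_five_star(records, n=4):
    """Keep only 5-star monitors and return the n with the most reviews."""
    five_star = [m for m in records if m["rating"] == 5]
    return sorted(five_star, key=lambda m: m["review_count"], reverse=True)[:n]


for monitor in top_five_star(monitors):
    print(monitor["name"], monitor["review_count"])
```

Ordering by review count matters here because a 5-star rating backed by many reviews is more convincing than one backed by only a few.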

Here are some business insights that I gathered from my analysis:

  • Do more with less: ASUS's best monitors were also its cheapest monitors. This means that a higher price does not always imply higher quality.
  • Give some incentive to customers for writing reviews: Review counts were low, but scores were high. Good reviews are "social proof" of the quality of your product. Bad reviews are still useful to the manufacturer because they let it know that it needs to go back to the drawing board.

Thank you for reading this article about my data analysis on Newegg computer monitors. I really enjoyed doing this research and I hope you enjoyed reading about it!

About Author

Mark Carthon

Data Scientist with 7+ years of work experience and a strong mathematical background. Passionate about applications of Machine Learning and Deep Learning in industry.