Data Analysis of Newegg Computer Monitors

Posted on Nov 9, 2020

I love computers. I have no problem sitting at my computer all day long working on various projects with R and Python. I love it. I like experimenting and learning new things. Since I use my computer a lot, I knew I would need a good one, so I bought a gaming laptop. But what is a gaming laptop without a great monitor? I picked the monitor I have because of various specifications such as refresh rate, price, and review rating. My interest in computers led me to do some data analysis on computer monitors on Newegg.

The goal of this project was to figure out which features of a computer monitor have the most positive effect on its price and review rating. In addition, I wanted to draw some business insights from my discoveries and give some recommendations to businesses based on those insights.

In order to achieve these goals, I needed data. I used a free and open-source web-crawling framework called Scrapy. I scraped dozens of pages of computer monitor listings with this tool, and it took approximately 70 minutes to collect 1052 observations. I gathered various features for each monitor. The features are:

  1. Product Name
  2. Price (US $)
  3. Review Count
  4. Review Rating
  5. Resolution
  6. Refresh Rate (Hz)
  7. Response Time (ms)
  8. Flicker-Free
  9. Screen Curvature
  10. Mount Compatibility
  11. Adjustable
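To make the shape of the dataset concrete, here is a minimal sketch of how one scraped observation could be represented in Python. The class name `MonitorRecord`, the field names and types, and the `parse_price` helper are all my own assumptions for illustration; they are not taken from the actual scraper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MonitorRecord:
    """One scraped monitor listing. Field names and types are illustrative."""
    product_name: str
    price_usd: Optional[float]
    review_count: int
    review_rating: Optional[float]
    resolution: Optional[str]
    refresh_rate_hz: Optional[float]
    response_time_ms: Optional[float]
    flicker_free: Optional[bool]
    screen_curvature: Optional[str]
    mount_compatibility: Optional[str]
    adjustable: Optional[bool]


def parse_price(raw: str) -> Optional[float]:
    """Turn a scraped price string such as '$1,299.99' into a float.

    Returns None when the text cannot be read as a number,
    for example when a listing shows no price.
    """
    cleaned = raw.replace("$", "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None
```

Most fields are optional because a listing will not necessarily report every specification, so missing values have to be handled before any analysis.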

Let's take a closer look at some of these features. First, let's discuss some summary statistics about the price.

 

As you can see, the range of monitor prices is pretty wide. Because a few monitors are very expensive, the price distribution is skewed to the right and the average price is pulled upward. Some of the most expensive monitors were 3K, 4K, or 5K monitors; some of them even had curved screens, which is a fairly new development. This is one of the reasons why the range of prices was so great.
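A quick way to see what a right skew does to the average is to compare the mean with the median. The sketch below uses a handful of made-up prices, not the scraped Newegg data, and the `summarize` helper is a name I chose for illustration.

```python
import statistics

# Made-up prices for illustration only; these are NOT the scraped Newegg data.
example_prices = [99.99, 129.99, 149.99, 179.99, 219.99, 1499.99]


def summarize(values):
    """Return the minimum, median, mean, and maximum of a list of numbers."""
    ordered = sorted(values)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "mean": statistics.mean(ordered),
        "max": ordered[-1],
    }


summary = summarize(example_prices)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this toy sample, the single expensive monitor drags the mean well above the median, which is the pattern a right-skewed price distribution produces.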

Now let's look at some summary statistics for Review Count.

 

Once again, the range of the data is so great because of the outliers on the right. There were a handful of monitors that had almost 1000 reviews, but, as you can see from the median, most monitors did not get very many reviews. This was a very interesting finding to me because it shows that people do not leave reviews very often for Newegg monitors. One would think that companies would want people to leave reviews for their products. If the review is good, it acts as a kind of "social proof" for the quality of the product. If the review is bad, it could still be helpful to the business because it would alert them that they need to improve their product. Therefore, my first business insight is that manufacturers need to give customers some incentive to leave a review on their products. Is a 1% discount a worthy incentive?

We have just seen that not many people leave reviews on the monitors they buy, but for those who do leave reviews, what ratings do they give?

 

Wow. This surprised me when I first saw it. Most people don't leave reviews on their monitors, but when they do, they usually leave fairly high ratings! At first glance, this appears to be true, but if you think about it for a bit, you might realize that this graph could be misleading. How can this graph be misleading? Well, imagine you had a technology company that built computer monitors and sold them on Newegg. If you had a monitor that was getting a large number of negative reviews and a small number of positive reviews, what would you do? Would you let the monitor stay on the site and make your company look bad? Or would you take the monitor off of Newegg and go back to the drawing board?

Most manufacturers would presumably take a badly performing monitor off of the site to protect their reputation, while the monitors that are performing well are left on the site. If so, the ratings we can scrape come mostly from the monitors that survived. This is the main reason why I said this graphic is potentially misleading.

We are trying to find out which of the aforementioned features has the greatest impact on the price of a monitor and its review rating. One of the basic ways of discovering how strongly features are related to each other is to examine a correlation matrix like the one below.

 

The numbers represent the strength and direction of the linear relationship between each pair of features. The most important features here are the price and review rating.

That might be a lot to take in, so let's zoom in on the price and review rating to see what we can discover.

 

There don't seem to be any strong correlations here.

Strong correlations are generally those below -0.5 or above 0.5. As you can see, there don't appear to be any strong correlations. Hmmm... let's take a look at price and review rating in a different way and see what we can find out...
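For readers who want to see what a single cell of a correlation matrix holds, here is a minimal sketch of the Pearson correlation coefficient in plain Python, together with the 0.5 rule of thumb mentioned above. The function names `pearson_r` and `is_strong` are my own, and I am assuming the matrix was built from Pearson coefficients, which is the usual default.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variance numerators).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def is_strong(r, threshold=0.5):
    """Rule of thumb: a correlation is 'strong' when |r| exceeds the threshold."""
    return abs(r) > threshold
```

A coefficient near zero, like the one between price and review rating here, only says that there is little linear relationship; it does not rule out other kinds of dependence.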

 

Is there a positive correlation here?

You might look at this boxplot and wonder if there is a positive correlation between price and review rating. According to the correlation coefficient, you would be right, but the coefficient is very close to zero, so the relationship is too weak to tell us much.

Remember when we looked at the distribution of review count and we saw that there were a few outliers with close to 1000 reviews? Let's take a look at them and see what kind of information we can glean.

 

ASUS has the top 5 locked down.

Here we have a chart that shows us the top 5 monitors in order of review count. Notice anything in common among these 5 monitors? They are all made by ASUS! This is quite an accomplishment because one would expect the ratings to move toward the middle of the spectrum as the review count increases (since there would presumably be both satisfied and dissatisfied customers, and one would expect the two to even out), but this is not the case with ASUS. They have 4/5 stars for each of their top 5 monitors by review count. Does ASUS have very many 5-star monitors?

 

ASUS has 2 of the top 4 in this chart!

The above chart gives the top four monitors with 5-star ratings and orders them by review count. As you can see, ASUS has two of the top four! And look at the prices of the ASUS monitors versus the others. ASUS charges far less than its competitors while earning the same rating. This is why I say that ASUS is doing more with less; they are charging lower prices for comparable quality. Their low prices and high quality may have something to do with why so many of their customers leave reviews. These reviews get read and encourage others to buy. Then these customers leave reviews, and the cycle keeps going.
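Both rankings discussed here (the monitors with the most reviews, and the 5-star monitors ordered by review count) come down to an optional filter followed by a sort. The sketch below shows one way to do that in plain Python. The rows are made-up placeholders rather than real listings, and the dictionary keys and the `top_by_review_count` helper are names I chose for illustration.

```python
# Made-up placeholder rows; these are NOT real Newegg listings.
example_monitors = [
    {"name": "Monitor A", "price": 139.99, "review_count": 950, "review_rating": 4},
    {"name": "Monitor B", "price": 899.99, "review_count": 40, "review_rating": 5},
    {"name": "Monitor C", "price": 159.99, "review_count": 300, "review_rating": 5},
    {"name": "Monitor D", "price": 249.99, "review_count": 12, "review_rating": 3},
    {"name": "Monitor E", "price": 199.99, "review_count": 700, "review_rating": 4},
]


def top_by_review_count(monitors, n, min_rating=None):
    """Return the n monitors with the most reviews.

    If min_rating is given, only monitors rated at least that high are kept.
    """
    candidates = [
        m for m in monitors
        if min_rating is None or m["review_rating"] >= min_rating
    ]
    return sorted(candidates, key=lambda m: m["review_count"], reverse=True)[:n]


most_reviewed = top_by_review_count(example_monitors, n=2)
five_star = top_by_review_count(example_monitors, n=4, min_rating=5)
```

Sorting by review count after filtering on rating keeps well-rated monitors with only one or two reviews from dominating the list.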

Here are some business insights that I gathered from my analysis:

  • Do more with less: ASUS's best monitors were also its cheapest monitors. This means that a higher price does not always imply higher quality.
  • Give some incentive to customers for writing reviews: Review counts were low, but scores were high. Good reviews are "social proof" of the quality of your product. Bad reviews are still useful to the manufacturer because they signal that it is time to go back to the drawing board.

Thank you for reading this article about my data analysis on Newegg computer monitors. I really enjoyed doing this research and I hope you enjoyed reading about it!


About Author

Mark Carthon

Data Scientist with 7+ years of work experience and a strong mathematical background. Passionate about applications of Machine Learning and Deep Learning in industry.