NYC Data Science Academy| Blog

Data Analysis on Apple's Customer Satisfaction on Products

Lukas Frei
Posted on Oct 23, 2018

Introduction

As a self-proclaimed tech-enthusiast, I've been following the tech review and data community, especially on YouTube, for quite a while. During that time, I recognized a certain pattern emerge after every new iPhone release: highly popular videos (as well as articles) would be released criticizing initial problems with the new iPhone.

However, Apple's sales numbers do not seem to be impacted by this negative atmosphere around its release date. This made me wonder who is most impacted by these videos and articles and whether or not they affect Apple's customer satisfaction. Should these reviews, in fact, impact Apple's customers, a more loosely defined release schedule, giving Apple more time to refine new features of the iPhone, might increase Apple's customer satisfaction.

Why amazon.co.uk?

I chose to scrape amazon.co.uk for several reasons. Firstly, Europe, besides China, is arguably the most important foreign market for the iPhone. Thus, customer reviews from the UK present relevant information for Apple. Secondly, Amazon enabled me to not only gather information regarding the reviews themselves (Rating, Title, Text, Helpful Votes) but also regarding the Amazon user that published the review. By clicking on the username, I was able to collect information about each user, such as the total number of helpful votes and reviews as well as all published reviews.

In conducting my research, I chose to exclusively focus on reviews for the iPhone X from November 2017 to September 2018 to only gather reviews for the newest iPhone at that point in time. Future research could apply the same concept to older iPhones.

Workflow of Our Data

I used Selenium to scrape amazon.co.uk, mainly because of its flexibility in navigating between different websites.

After scraping the data, I cleaned and prepared it, mostly using pandas, NumPy, and Python's re module. This step included identifying and appropriately handling missing values, adapting my code, and reformatting the gathered data to enable further processing and analysis.
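
As a rough illustration of this kind of cleanup, the sketch below parses three scraped fields with regular expressions. The raw string formats and the function names are assumptions made for this example, not the formats or code used in the original project, and the sketch uses only the standard library rather than pandas.

```python
import re
from datetime import datetime

# Assumed raw formats (illustrative, not taken from the original scrape):
#   rating  -> "4.0 out of 5 stars"
#   helpful -> "12 people found this helpful" / "One person found this helpful"
#   date    -> "on 23 November 2017"
RATING_RE = re.compile(r"(\d+(?:\.\d+)?) out of 5 stars")
HELPFUL_RE = re.compile(r"([\d,]+|One) (?:person|people) found this helpful")


def parse_rating(text):
    """Return the star rating as a float, or None if it cannot be parsed."""
    match = RATING_RE.search(text or "")
    return float(match.group(1)) if match else None


def parse_helpful_votes(text):
    """Return the helpful-vote count; a missing label is treated as 0 votes."""
    match = HELPFUL_RE.search(text or "")
    if match is None:
        return 0
    value = match.group(1)
    return 1 if value == "One" else int(value.replace(",", ""))


def parse_review_date(text):
    """Parse a date such as 'on 23 November 2017'; return None on failure."""
    cleaned = (text or "").strip()
    if cleaned.lower().startswith("on "):
        cleaned = cleaned[3:]
    try:
        return datetime.strptime(cleaned, "%d %B %Y").date()
    except ValueError:
        return None
```

Returning None (or 0 for a missing vote label) instead of raising keeps missing values explicit, so they can be counted and handled in a later step.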

Then, I manipulated and analyzed the data, breaking it down into subgroups and comparing their characteristics.

Finally, I visualized the results of my analysis with matplotlib, seaborn, and wordcloud.

Number of Reviews per Month

The first part of my analysis focused on the number of reviews per month, from the iPhone X's first reviews in November 2017 until September 2018. Contrary to my initial expectations, there were relatively few reviews around the time of the iPhone's release. Then, around December 2017, the number of reviews started to increase significantly, reaching its peak in March/April 2018.
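
A minimal sketch of the monthly count, assuming each review has already been reduced to a `datetime.date`. The sample dates are made up for illustration and are not data from the study.

```python
from collections import Counter
from datetime import date


def reviews_per_month(review_dates):
    """Count reviews per (year, month), returned in chronological order."""
    counts = Counter((d.year, d.month) for d in review_dates)
    return dict(sorted(counts.items()))


# Made-up dates for illustration only; not data from the study.
sample_dates = [
    date(2017, 11, 5),
    date(2017, 12, 26),
    date(2017, 12, 28),
    date(2018, 3, 2),
    date(2018, 3, 15),
    date(2018, 3, 30),
]
monthly_counts = reviews_per_month(sample_dates)
# The month with the most reviews in the sample.
peak_month = max(monthly_counts, key=monthly_counts.get)
```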

My suspicion regarding the relatively low number of reviews around the release is that, firstly, the iPhone is released later in Europe than it is in the US, possibly causing a delay in reviews. Furthermore, customers in the UK might have waited to purchase the iPhone until Christmas, which could explain the increase in December.

Verified versus Unverified Reviews

Next, I split the reviews into two categories: verified and unverified reviews. Overall, verified reviews clearly outnumbered unverified reviews. The only month in which unverified reviews exceeded verified reviews was November 2017, directly after the iPhone's release. This finding fueled my suspicion that the generally negative atmosphere on the Internet at the time of the iPhone's release is created not by Apple customers but by people who dislike Apple as a company.

Furthermore, this line plot shows that the previously identified increase in reviews is almost exclusively driven by verified reviews, while unverified reviews decrease after November 2017 and never really increase again.
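
The comparison between the two groups can be sketched as follows, assuming each review is a dict with a `month` string and a boolean `verified` flag. Both field names and the sample records are hypothetical.

```python
from collections import defaultdict


def counts_by_month_and_type(reviews):
    """Map each 'YYYY-MM' month to {'verified': n, 'unverified': m}."""
    counts = defaultdict(lambda: {"verified": 0, "unverified": 0})
    for review in reviews:
        key = "verified" if review["verified"] else "unverified"
        counts[review["month"]][key] += 1
    return dict(sorted(counts.items()))


def months_dominated_by_unverified(counts):
    """Return the months in which unverified reviews outnumber verified ones."""
    return [m for m, c in counts.items() if c["unverified"] > c["verified"]]


# Made-up records for illustration only; not data from the study.
sample_reviews = [
    {"month": "2017-11", "verified": False},
    {"month": "2017-11", "verified": False},
    {"month": "2017-11", "verified": True},
    {"month": "2017-12", "verified": True},
    {"month": "2017-12", "verified": True},
    {"month": "2017-12", "verified": False},
]
counts = counts_by_month_and_type(sample_reviews)
flagged = months_dominated_by_unverified(counts)
```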

Having established the substantial difference between the number of verified and unverified reviews, I decided to dig deeper and compare the average ratings of these two groups. The following box plot visualizes the result:

Averages

This box plot shows that the two types of reviews differ tremendously not only in the number of reviews but also in the average rating. While the average ratings for verified reviews are clustered between 4.50 and 4.75, with the 1st and 3rd quartiles extremely close to each other, the average ratings of unverified reviews exhibit substantial variability. In addition, the median of the unverified average ratings, at approximately 3.50, provides further evidence for my suspicion that actual Apple customers are generally very satisfied with their iPhones and not affected by negative reviews on the Internet.
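
The numbers behind such a box plot can be sketched as below. This assumes that each box is built from one average rating per month, which is a guess about how the plot was constructed, and the field names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, quantiles


def monthly_average_ratings(reviews):
    """Return {group: [average rating per month, in month order]}."""
    buckets = defaultdict(list)
    for review in reviews:
        group = "verified" if review["verified"] else "unverified"
        buckets[(group, review["month"])].append(review["rating"])
    averages = defaultdict(list)
    for (group, _month), ratings in sorted(buckets.items()):
        averages[group].append(mean(ratings))
    return dict(averages)


def box_plot_summary(values):
    """Return (Q1, median, Q3), the three values that define the box.

    Needs at least two values, as required by statistics.quantiles.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, median, q3
```

A narrow gap between Q1 and Q3 corresponds to the tight clustering described for verified reviews, while a wide gap corresponds to the variability seen for unverified reviews.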

To confirm these findings, I took a closer look at reviews published in November 2017 grouped by the type of review:

While the relatively low number of reviews does not allow for very reliable conclusions, there still is a visible difference between the verified and unverified ratings with the verified ratings generally being more favorable than the unverified ratings.

Analyzing the Review Text and Data

In my next step, I focused exclusively on the review texts of verified reviews and, after preparing the data, generated a word cloud that allows for a quick overview of the general sentiment among verified reviews:

Again, the majority of words in this word cloud are positive and express satisfaction with the iPhone X ("happy", "excellent", "best"). In addition, the delivery seems to have been very important for customers, which is not necessarily an insight for Apple, but definitely for Amazon. 

Finding negative words in this word cloud requires either very good eyes or a magnifying glass. If you possess either, you might be able to spot "crashed" or "smashed". However, the small size and rarity of negative words provide further evidence for the high satisfaction of actual Apple customers.
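
A word cloud is essentially a picture of word frequencies, so the underlying counts can be sketched with the standard library alone. The stop-word list and sample texts below are made up for illustration. The text preparation actually used for the word cloud above is not described, so this is only one plausible approach.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {
    "the", "a", "an", "and", "is", "it", "i", "was",
    "with", "my", "very", "this", "to", "of",
}


def word_frequencies(texts, stop_words=STOP_WORDS):
    """Count lower-cased words across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts


# Made-up review texts for illustration only.
sample_texts = [
    "Excellent phone, very happy with the delivery.",
    "Happy with my purchase. Best phone I have owned.",
    "Screen smashed after a week.",
]
frequencies = word_frequencies(sample_texts)
```

With the wordcloud package installed, a mapping like this can be rendered with `WordCloud().generate_from_frequencies(frequencies)`, where more frequent words are drawn larger.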

Fake Review Index

At this point, I was almost ready to wrap up my project and include a few more descriptive visualizations to prove my point.

However, after sampling some of the reviews, I started having doubts as to how many reviews were published by people who had actually purchased the product. Some reviews, even verified reviews, seemed very suspicious to me. In order to decrease the uncertainty introduced by not knowing which reviews are real, I came up with something that I named the Fake Review Index.

To incorporate the Fake Review Index into my research, I decided to go back to scraping and gathered not only reviews but also relevant information regarding each user. Then, I used this information to calculate the Fake Review Index based on the following 6 factors:

Calculation Steps

I calculated the Fake Review Index by assigning weights and categorizing several scenarios for each of these factors. For instance, the highest score for the Number of Helpful Votes/Number of Reviews factor is 10. One scenario for this factor is the following: if the ratio for a given user is less than or equal to 1 (that is, the user received at most one helpful vote per review on average), the user's Fake Review Index increases by 10. Therefore, the lower the Fake Review Index, the more likely it is that the review is genuine.
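
The additive structure of such an index can be sketched as follows. Only the helpful-votes rule (ratio of at most 1 adds 10 points) comes from the description above. The second factor, its threshold, and its points are hypothetical placeholders that show how further factors would plug in, and the function and field names are invented for this example.

```python
def helpful_votes_factor(total_helpful_votes, total_reviews):
    """Documented rule: add 10 points if the user averages at most one
    helpful vote per review. Other tiers of this factor are not described
    in the text, so they contribute 0 here."""
    if total_reviews == 0:
        return 10  # assumption: no review history counts as the worst case
    return 10 if total_helpful_votes / total_reviews <= 1 else 0


def five_star_share_factor(ratings, threshold=0.9, points=10):
    """HYPOTHETICAL factor: threshold and points are illustrative only."""
    if not ratings:
        return 0
    share = sum(1 for r in ratings if r == 5) / len(ratings)
    return points if share >= threshold else 0


def fake_review_index(user):
    """Sum the factor scores; a lower total suggests a more genuine reviewer."""
    return (
        helpful_votes_factor(user["helpful_votes"], user["review_count"])
        + five_star_share_factor(user["ratings"])
    )
```

Keeping each factor in its own function makes it straightforward to adjust a single weight or add a further factor without touching the rest of the calculation.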

While the Fake Review Index is not extremely sophisticated (yet), I am very confident it is able to at least identify the most obvious fake reviews and filter them out. 

To illustrate the Fake Review Index, I included the scores of two different users who have published a review for the iPhone X:

This user received a Fake Review Index of 74.5. As evident from his reviews, he barely receives helpful votes and his reviews do not appear to be genuine whatsoever. Furthermore, this user mainly uses 5-star ratings and publishes several reviews for different products on the same day. Thus, he receives a high Fake Review Index Score.

The second user received a Fake Review Index of 27, making his reviews likely to be genuine. Indeed, this profile seems to belong to an active member of the Amazon review community as his reviews actually detail his experience with the product, do not only consist of 5-star ratings, and receive more helpful votes:

Findings

Finally, after grouping the reviews by their Fake Review Index, I made an interesting observation: the more likely a review is to be genuine, the lower the average rating. While the decrease is not extremely large, it is still noticeable, as the average rating for reviews that are very likely genuine is approximately 3.8. This rating is still very high and indicates that Apple's customers are very satisfied. Nevertheless, it is not as astronomically high as the previously discussed 4.5 to 4.75 average for verified reviews.
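
The grouping step can be sketched as below. The band boundaries and labels are hypothetical, since the cut-offs actually used are not stated, and the field names are invented for this example.

```python
from collections import defaultdict
from statistics import mean

# HYPOTHETICAL band boundaries; the cut-offs actually used are not stated.
BANDS = [
    (0, 30, "very likely genuine"),
    (30, 60, "uncertain"),
    (60, float("inf"), "likely fake"),
]


def band_for(index):
    """Return the label of the half-open band [low, high) containing index."""
    for low, high, label in BANDS:
        if low <= index < high:
            return label
    raise ValueError(f"index out of range: {index}")


def average_rating_by_band(reviews):
    """Average the star rating of reviews within each Fake Review Index band."""
    grouped = defaultdict(list)
    for review in reviews:
        grouped[band_for(review["fake_review_index"])].append(review["rating"])
    return {label: mean(ratings) for label, ratings in grouped.items()}
```

Comparing the averages across bands is what reveals the pattern described above: the band of most plausibly genuine reviews has a lower mean rating than the rest.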

Wrap-Up

In summary, Apple's customer satisfaction is still very high. The negative atmosphere on the Internet at the time of the iPhone's release can mainly be attributed to non-customers. Apple's customer satisfaction, however, is not as high as one might suspect at first glance when considering how genuine the reviews are. To answer my initial research question, there does not seem to be an urgent need for Apple to implement a more loosely defined release schedule.

Future extensions of this project would include a more refined and sophisticated fake review index that could then be universally applied to different review websites to enable companies to filter out the most relevant reviews and trends among their customers.

About Author

Lukas Frei

Lukas Frei is an aspiring data scientist currently completing the 12-week bootcamp at the New York City Data Science Academy. Besides his passion for data, he is also very interested in business and finance. Lukas holds a BS...