Food delivery: a new revenue source but also more complexity to manage

Posted on Oct 29, 2017


Over the past decade, the number of consumer review websites has exploded. These websites allow consumers to share their experiences of service, product quality, restaurant environment and other aspects. It is now very easy to get information from countless other consumers about restaurants, hotels and products, and this has a significant impact on businesses.

Another significant change in recent years is the growing market for food delivery, driven by websites and apps that deliver meals from restaurants, some of which have not traditionally offered food to go. For restaurant owners, the extra business is often welcome, but introducing a third party can create many problems.

So, given that bad reviews can harm a business and that delivery service is a new factor to be reviewed, this study analyzes how reviews on Yelp compare with reviews on Seamless, a delivery service.

Data Collection

I collected the reviews with one scraper per website. For one website, I used Scrapy to scrape 393,314 reviews from 570 restaurants in New York City.

For the other, I used Selenium to scrape 335,169 reviews from 5,612 restaurants in New York City.

The charts on the left show the number of reviews per borough for each website, and the charts on the right show the number of restaurants by borough and price:








Analysis on Yelp database:

Before joining the databases from both websites for the final analysis, I performed a specific analysis of the Yelp data.

Comparing restaurants across boroughs, Manhattan, Queens and Staten Island have restaurants with higher ratings: 75% of their restaurants have an overall rating between 4 and 5.
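A claim like "75% of the restaurants are rated between 4 and 5" can be checked by grouping ratings per borough and counting the share that falls in that range. The following is only a minimal sketch: the rows and the function name `share_in_range` are hypothetical and are not taken from the study's data or code.

```python
from collections import defaultdict

# Hypothetical (borough, overall_rating) rows; NOT data from the study.
rows = [
    ("Manhattan", 4.5), ("Manhattan", 4.0), ("Manhattan", 3.5), ("Manhattan", 5.0),
    ("Bronx", 3.0), ("Bronx", 3.5), ("Bronx", 4.0), ("Bronx", 2.5),
]

def share_in_range(rows, low=4.0, high=5.0):
    """Return, per borough, the fraction of restaurants rated within [low, high]."""
    by_borough = defaultdict(list)
    for borough, rating in rows:
        by_borough[borough].append(rating)
    return {
        borough: sum(low <= r <= high for r in ratings) / len(ratings)
        for borough, ratings in by_borough.items()
    }

shares = share_in_range(rows)
```

With real data, `rows` would hold one entry per scraped restaurant, and the resulting shares could be compared across boroughs.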

Since the main purpose of this study is to analyze how delivery service affects review ratings, the following chart shows user ratings by whether or not the restaurant has delivery service, based on Yelp reviews. Among restaurants with delivery service, a larger share have lower ratings.

To check whether lower ratings for restaurants with delivery service are a general pattern in New York City or vary from one borough to another, I plotted the following chart. Manhattan and Brooklyn show the same pattern, but Queens shows the opposite, and delivery service seems to make no difference in the Bronx. This suggests that the infrastructure, traffic or service of the borough might affect reviews.
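One way to put a number on the borough-by-borough comparison is the gap between the median rating of restaurants with delivery and those without. This is a sketch under assumed inputs: the rows and the helper `median_gap_by_borough` are hypothetical, and the median is just one possible summary of what a box plot shows.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (borough, has_delivery, rating) rows; NOT data from the study.
rows = [
    ("Manhattan", True, 3.5), ("Manhattan", True, 4.0),
    ("Manhattan", False, 4.5), ("Manhattan", False, 4.0),
    ("Queens", True, 4.5), ("Queens", True, 4.0),
    ("Queens", False, 3.5), ("Queens", False, 4.0),
]

def median_gap_by_borough(rows):
    """Median rating with delivery minus median rating without, per borough."""
    groups = defaultdict(lambda: {True: [], False: []})
    for borough, has_delivery, rating in rows:
        groups[borough][has_delivery].append(rating)
    gaps = {}
    for borough, split in groups.items():
        # Skip boroughs that lack one of the two groups.
        if split[True] and split[False]:
            gaps[borough] = median(split[True]) - median(split[False])
    return gaps

gaps = median_gap_by_borough(rows)
```

A negative gap means restaurants with delivery are rated lower in that borough, and a positive gap means the opposite.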

Analysis on Seamless database:

As with Yelp, the same analysis was performed on the Seamless data. Seamless makes available information about what people say in reviews about the quality of the food, the quality of the delivery and the accuracy of the order made on the website. I plotted a box plot to check whether these variables could be related to the overall rating of the restaurants.

There appear to be no significant "bad" reviews related to order accuracy (whether the food delivered matches the order made on the website/app).

Analysis on Seamless and Yelp combined database:

To perform the complete analysis, comparing restaurants that are on both websites, I joined the databases, ending up with a total of 135 restaurants. The total number of reviews from these restaurants on both websites is shown in the chart below.
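The post does not say which key was used to join the two databases. As an illustration only, the sketch below assumes an inner join on a normalized restaurant name; the field names, the sample records and the helpers `normalize` and `inner_join` are all hypothetical.

```python
def normalize(name):
    """Lowercase and keep only letters and digits, so small formatting
    differences between the two sites do not block a match."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

def inner_join(yelp, seamless):
    """Keep only restaurants present in both lists (assumed key: normalized name)."""
    seamless_by_key = {normalize(r["name"]): r for r in seamless}
    joined = []
    for y in yelp:
        s = seamless_by_key.get(normalize(y["name"]))
        if s is not None:
            joined.append({
                "name": y["name"],
                "yelp_rating": y["rating"],
                "seamless_rating": s["rating"],
            })
    return joined

# Hypothetical records; NOT data from the study.
yelp = [{"name": "Example Pizza Co.", "rating": 4.0},
        {"name": "Sample Taco Spot", "rating": 3.5}]
seamless = [{"name": "example pizza co", "rating": 4.5},
            {"name": "Demo Noodle Bar", "rating": 4.0}]

matched = inner_join(yelp, seamless)
```

Matching on name alone can merge different branches of the same chain, so a real join would likely need an extra field such as address or borough.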

Initially, I plotted a box plot comparing the ratings of these restaurants per borough, and there are differences in some boroughs, as follows:

Looking at the overall ratings of the restaurants together, they do not seem very different, but to confirm that, I plotted the results for restaurants that had more than 1,000 reviews on Yelp. The chart shows that some restaurants do not match, but they usually have similar results.

Regardless of what both charts showed, I tested the correlation between the two sets of ratings, and the result (-0.0215) indicates a very low correlation. I also ran a two-sample t-test; the p-value was 1.9931e-223, which means that the samples are unlikely to have the same mean. So the overall rating from people using Yelp (in most cases dining in at the restaurant) is different from the rating given by people ordering food online through Seamless.
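The two statistics mentioned above can be sketched with the Python standard library alone. This is not the code used in the study: the ratings are hypothetical, the helper names `pearson` and `welch_t` are made up for the example, the t-test shown is Welch's unequal-variance variant, and the p-value uses a normal approximation that is only reasonable for large samples.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def welch_t(a, b):
    """Welch's two-sample t statistic with an approximate two-sided p-value."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(a) - mean(b)) / se
    # Normal approximation to the t distribution (assumes large samples).
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p

# Hypothetical ratings for the same restaurants on each site; NOT study data.
yelp_ratings = [4.0, 3.5, 4.5, 4.0, 3.0, 4.5]
seamless_ratings = [4.5, 4.0, 4.0, 5.0, 4.5, 4.0]

r = pearson(yelp_ratings, seamless_ratings)
t, p = welch_t(yelp_ratings, seamless_ratings)
```

Because the same restaurants appear in both samples, a paired test on the per-restaurant differences would be another reasonable choice; the post reports a two-sample test.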


This study showed that Yelp and Seamless have different overall ratings for the same restaurant. So, before starting a food delivery service, a restaurant needs to be aware of the new factors it may bring to restaurant management.

Some concerns when starting a delivery service are the quality and temperature of the food, more orders at peak dinner times, and prices on online services that may be higher than on the restaurant menu, all of which may lead to a bad delivery experience.

A deeper study could apply sentiment analysis to the reviews to gather more evidence for these findings.

If you want to see more information about this study, you can check my GitHub.

About Author

Neuton Fonseca

MBA in Business Analytics and Big Data (ongoing) and recent certification as a Data Scientist, with an engineering background along with 7 years of corporate business experience. A problem solver with passion to gather and analyze data to drive...