Using Data to Find the Best Gluten Free Restaurants

Posted on Jan 21, 2021

All Python Scrapy code & Jupyter notebook for EDA / Visualizations:

Why I Scraped This Website

The website I scraped is a go-to resource in the gluten free / celiac community. The platform works as a user-based, Yelp-esque search engine for restaurants with gluten free offerings. This focus makes it much easier to find restaurants that are safe (and delicious!) for users who are on a gluten free / celiac diet. By scraping this website, one can determine which metropolitan areas have the best gluten free scenes – which is useful when considering which city to accept a new job in, thinking of where to vacation, or discovering a region’s most popular gluten free foods. In this post, we will use data to find the best gluten free restaurants.

Since this website has a strong emphasis on celiac food safety standards of restaurants, I also look at which cities and restaurants types are generally rated more celiac-friendly, as well as the size and strength of the celiac community on the platform. As someone with several celiac family members, this Python web scraping project gives me a chance to generate insights that I can use to help protect my family.

Determining Scraping Scope

I used Scrapy to gather 22 features for a total of 4074 restaurants across 18 U.S. metro areas. I did not specifically choose these 18 metro areas, but instead let my scraper pull the metro areas with at least 50 showcased restaurants. This selection criterion led to some interesting results: smaller cities such as Nashville and Austin made the cut, while populous metros such as Atlanta and Miami did not.
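The 50-restaurant cutoff described above can be sketched in a few lines of plain Python. This is only an illustration, not the scraper itself: the function name, the input shape (one metro name per restaurant), and the toy counts are all assumptions made for the example.

```python
from collections import Counter

MIN_RESTAURANTS = 50  # threshold stated in the text


def metros_meeting_threshold(restaurant_metros, minimum=MIN_RESTAURANTS):
    """Return {metro: count} for metros with at least `minimum` restaurants."""
    counts = Counter(restaurant_metros)
    return {metro: n for metro, n in counts.items() if n >= minimum}


# Hypothetical toy input: one metro label per scraped restaurant.
sample = ["Metro A"] * 60 + ["Metro B"] * 30 + ["Metro C"] * 50
kept = metros_meeting_threshold(sample)
print(kept)
```

Metros that sit exactly on the threshold are kept here, matching the "at least 50" wording.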

This dataset shows that smaller cities can sometimes have a more robust gluten free scene than large metropolitan areas. This point is further shown on the bubble plot below, where one can see how population size does not directly correlate to a larger number of gluten free restaurants on the website.

What does the Gluten Free Dining Experience Look Like in Different Metro Areas?


This Bubble Plot shows three important aspects of the gluten free dining experience: how many restaurants one can choose from, what percentage of restaurants offer gluten free menus, and the average expected cost of a meal. By these metrics, San Diego and Denver both have the best combination of these three features, with nearly 400 gluten free restaurants to choose from, 55% offering gluten free menus, and a very reasonable ~$15 average expected cost.

Portland has the most affordable gluten free options (under $12), while Las Vegas has the least affordable (over $30): an average meal there costs over 2.5 times as much as one in Portland. In addition, a city’s cost of living does not necessarily correlate with expected dining cost, with affordable cities such as New Orleans and San Antonio commanding almost $8 more per meal than expensive cities such as San Francisco or Chicago.

When looking at gluten free menu availability by metro area, there is a surprisingly wide range, with one being over 50% more likely to find a gluten free menu in a Minneapolis restaurant than in a Seattle restaurant. Despite this advantage, Minneapolis restaurants have a far lower average rating than Seattle restaurants. This suggests that while gluten free menus generally improve the dining experience and indicate care on the restaurant’s end towards those with a gluten free / celiac diet, having a gluten free menu does not ultimately guarantee that a restaurant has delicious food or great service.

To calculate the average expected cost of a meal by metro area, I found that the four "$" options given for restaurants match the ranges $0-$10, $10-$30, $30-$60, and above $60. Assuming that the average cost of a meal in a price range is the midpoint of that range, I found that the function 5x² (where x is the number of "$" symbols) maps the first three levels to their midpoints of $5, $20, and $45 exactly, giving me a method to convert non-integer average cost ratings into dollar values. With this function, Austin’s average cost rating of 1.67 corresponds to an average expected cost of $13.94 per meal, as shown above.
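As a quick check of the arithmetic, here is a minimal sketch of the cost mapping. The function name `expected_cost` is made up for this example; the formula and the Austin figure come from the paragraph above.

```python
def expected_cost(price_level: float) -> float:
    """Map a (possibly fractional) "$" level to dollars using 5 * x**2."""
    return 5 * price_level ** 2


# The three bounded price ranges map to their midpoints.
for level in (1, 2, 3):
    print(level, expected_cost(level))

# Austin's average cost rating from the text.
print(round(expected_cost(1.67), 2))  # 13.94
```

For the open-ended fourth level (above $60), the same function gives $80. That value is an extrapolation of the curve, since the range has no upper bound and therefore no midpoint to compare against.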

What Types of Gluten Free Food are Most Popular in each Metro Area?

The above table showcases the 10 most popular restaurant types in each metro as listed by the restaurants on their pages. There is no limit for how many of the 122 available restaurant type tags (listed on the website under “Categories”) a restaurant can have, and restaurants showcase 4.75 of these tags on average. Immediately, one sees how food sensitivities are front and center in the gluten free restaurant scene, with “Dairy-Free Friendly”, “Vegetarian Friendly”, and “Vegan Friendly” in the top 10 of most metro areas.

Other trends make sense geographically – “Seafood” and “Pescatarian Friendly” both appear in the top 10 of the coastal cities of Boston, New Orleans, and Seattle, while “Mexican” and “Steakhouse” make their only top-10 appearances in San Antonio, Texas. One also sees that “Takeout” is ranked as the 4th or 5th most popular restaurant offering in every metro except for New Orleans and San Antonio, where it is not even ranked in the top 10.

If one has children with gluten sensitivities, it is worth noting that only 5 of these metro areas have “Kid-Friendly” in the top 10. Oddly, Portland is the only metro with more “Lunch” places than “Restaurants”, but this may be explained by a larger-than-usual food truck scene.
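A top-10 table like the one discussed in this section can be built by counting tags per metro. The sketch below uses only the standard library; the input shape (a metro name paired with a list of tags) and the toy rows are assumptions for illustration, not the scraped data.

```python
from collections import Counter, defaultdict


def top_tags_by_metro(restaurants, n=10):
    """restaurants: iterable of (metro, [tags]).

    Returns {metro: [(tag, count), ...]} with the n most common tags.
    """
    counters = defaultdict(Counter)
    for metro, tags in restaurants:
        # Count each tag at most once per restaurant.
        counters[metro].update(set(tags))
    return {metro: c.most_common(n) for metro, c in counters.items()}


# Hypothetical toy rows.
toy = [
    ("Metro A", ["Takeout", "Vegan Friendly"]),
    ("Metro A", ["Takeout"]),
    ("Metro B", ["Seafood"]),
]
top = top_tags_by_metro(toy, n=1)
print(top)
```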

Which Metro Areas have the Best-Reviewed Gluten Free Restaurants?

This stacked bar chart shows the percentage of each star rating across all restaurants in a metro area, sorted by Average Review Rating. Ratings left without reviews are not displayed individually on the website, and thus were excluded from the data used to generate rating statistics. However, as there is only a 2.5% average discrepancy between the average of all ratings vs. the average of all review ratings, using review rating data does not distort the overall data. One can immediately see that this website’s reviewers tend to be positive in nature, with 5-star reviews being the most popular rating in every metro area.

There is also a tendency to give nearly as many 1-star reviews as 2-star and 3-star reviews combined, suggesting a more critical eye for restaurants with issues. Finally, some metros appear to be more polarizing in nature, with Houston and Denver both strongly trending towards 1-star or 5-star reviews, with 83.4% and 75.4% of all reviews being one of these extremes, respectively.
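The "share of extreme reviews" figure quoted above is a simple proportion. A minimal sketch follows, with a hypothetical function name and toy ratings:

```python
def extreme_share(star_ratings):
    """Fraction of ratings that are either 1 star or 5 stars."""
    ratings = list(star_ratings)
    if not ratings:
        raise ValueError("no ratings supplied")
    extremes = sum(1 for r in ratings if r in (1, 5))
    return extremes / len(ratings)


# Hypothetical toy ratings for one metro.
toy_ratings = [5, 5, 1, 3, 4, 5, 2, 1]
share = extreme_share(toy_ratings)
print(f"{share:.1%}")
```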

The boxplot above shows how the average review ratings of all restaurants are distributed within each metro area. Immediately, one is struck by the right-leaning nature of the boxes, with at least 25% of the restaurants in 10 metro areas having a perfect 5.0 average review rating. Furthermore, for all but two metro areas, 75% of all restaurant average review ratings are above 3.5. Together, these observations confirm our earlier assertion that the community is largely favorable in its reviews of gluten free establishments.

One possible explanation is the excitement of finding a good gluten free restaurant, as many foods are difficult to find – or make – in a gluten free form. One can also see that for every metro except for Washington D.C., low-rating outliers pull the mean review rating below the median review rating, with Portland and Boston having some of the biggest differences between these two metrics – nearly a quarter point.

Finally, the city with the narrowest interquartile range is Las Vegas, suggesting that Las Vegas has the most consistent restaurant experience across these 18 metro areas – albeit at only 62 total restaurants.
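The boxplot statistics discussed in this section (mean, median, and interquartile range) can be computed with Python's `statistics` module. This is a sketch under stated assumptions: the function name and toy ratings are invented, and `statistics.quantiles` with its default method may place quartiles slightly differently than a plotting library does.

```python
import statistics


def rating_summary(avg_ratings):
    """Mean, median, and interquartile range of per-restaurant average ratings."""
    q1, _, q3 = statistics.quantiles(avg_ratings, n=4)
    return {
        "mean": statistics.mean(avg_ratings),
        "median": statistics.median(avg_ratings),
        "iqr": q3 - q1,
    }


# Hypothetical toy metro: one low outlier among mostly high ratings.
toy_avg_ratings = [2.0, 4.0, 4.5, 5.0, 5.0]
summary = rating_summary(toy_avg_ratings)
print(summary)
```

In this toy example the single low rating pulls the mean below the median, the same pattern the text describes for most metros.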

Which Metros Have the Most Celiac-Friendly Restaurants?

In this stacked bar chart, one can see how celiac friendly and celiac unfriendly each metro area is. For 15 of the 18 metro areas, most users left a celiac friendliness rating in their review, but not by much – no more than 59% of reviewers left a celiac friendliness rating in any metro area. This is not necessarily a lack of care by these reviewers, but due to the binary nature of this system – to declare a restaurant completely celiac unfriendly or completely celiac friendly is a bold statement, and one which may take more than one dining experience to judge.

As reviewers with celiac are more likely to take a critical eye to a restaurant’s celiac safety standards, I also included the percentage of celiac users in a metro area on the second y axis, with roughly 2 of 3 users (±5%) identifying as celiac across all metro areas. While there are no metros with over 70% celiac users on the bottom half of this bar chart, the proportion of celiac users is not significantly correlated to the celiac friendliness ratings, which are ultimately more dependent on restaurant quality than users.

One also sees again that users tend to be positive in nature, with fewer than 1 in 6 users leaving a Celiac Unfriendly rating in every metro. However, this is not so surprising, as establishments on this website are inherently more celiac-friendly than the far larger number of restaurants that do not have any gluten free options.

At Which Types of Restaurants Should One Expect to Find Celiac-Friendly Food?

The blue horizontal bar chart above looks at the 46 restaurant type tags that appear in at least 50 restaurants and sorts them by celiac-friendliness. The most celiac friendly restaurant type is “Fine Dining”, with over 9 out of 10 reviewers finding these restaurants celiac friendly. This makes sense, as upscale establishments tend to be more ingredient-conscious. However, three of the next four top tags are cuisines that are not usually expensive (“Latin”, “Thai”, and “Juice Bar”) and require a different explanation.

While not expensive or fancy, all three of these cuisines are naturally gluten free friendly, with corn, rice, and fresh produce being the central carbohydrates of each, respectively. On the other end of this bar chart, one sees that “Fast Food” and “Deli” are two of the worst celiac bets, with roughly 7 out of 10 reviewers finding these restaurants celiac friendly. Given that these establishments are built around sandwiches and/or breaded fried foods, their menus do not lend themselves easily to gluten free diets.

Furthermore, these types of restaurants often have a lot of staff turnover and utilize shared prep surfaces, knives, and cooking surfaces, increasing the chance that celiac safety protocols, such as preventing cross contamination, are overlooked. Finally, in contrast to “Fine Dining”, “Fast Food” and “Deli” patrons are often not as discerning about the ingredients in their meals.

Other insights include that bakeries with gluten free options tend to be very celiac-friendly, while “Kid-Friendly” restaurants are decidedly less celiac friendly – a concerning statistic, given how many children develop celiac disease. While “Paleo Friendly” and “Keto Friendly” restaurants are very celiac friendly, “Vegetarian Friendly” restaurants are nearly 14% less celiac friendly – near the bottom of the list.

This makes sense, as both the Paleo and Keto diets avoid refined grains, while many Vegetarian menus include Seitan as a vegan protein substitute, which is literally made from gluten. However, this hypothesis does not fully explain this discrepancy, as “Vegan Friendly” restaurants are 8.4% more likely to be celiac friendly than “Vegetarian Friendly” restaurants.
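The tag ranking discussed in this section can be sketched as follows. Two things here are assumptions: the share is computed as friendly votes divided by all celiac votes (the post does not spell out the exact denominator), and the input shape and toy rows are invented for the example. The 50-restaurant minimum is the one stated in the text.

```python
from collections import defaultdict


def celiac_share_by_tag(restaurants, min_restaurants=50):
    """restaurants: iterable of (tags, friendly_votes, unfriendly_votes).

    Returns [(tag, friendly_share), ...] sorted from most to least friendly,
    keeping only tags that appear on at least `min_restaurants` restaurants.
    """
    stats = defaultdict(lambda: [0, 0, 0])  # restaurants, friendly, unfriendly
    for tags, friendly, unfriendly in restaurants:
        for tag in set(tags):
            entry = stats[tag]
            entry[0] += 1
            entry[1] += friendly
            entry[2] += unfriendly
    shares = {
        tag: f / (f + u)
        for tag, (count, f, u) in stats.items()
        if count >= min_restaurants and (f + u) > 0
    }
    return sorted(shares.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy rows; the threshold is lowered so the example is small.
toy = [
    (["Tag A"], 9, 1),
    (["Tag A", "Tag B"], 9, 1),
    (["Tag B"], 5, 5),
    (["Tag C"], 1, 0),
]
ranked = celiac_share_by_tag(toy, min_restaurants=2)
print(ranked)
```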

Which Restaurant Type Tags are the Most Popular?

Knowing the celiac percentage is not enough – if a restaurant is rated 100% celiac-friendly, but only has 2 reviews, 100% feels a lot less representative as the sample size of reviews is so minimal. As such, this final bar chart shows how many reviews are left on average for each restaurant type. While “Fine Dining” had the highest percentage of celiac friendly ratings on our previous bar chart, here it has one of the lowest average sample sizes of celiac friendly votes on the website, with fewer than 3 votes per restaurant on average.

As such, it is hard to trust a high celiac rating on an individual restaurant when it comes from such a small sample size. Conversely, “Kid Friendly” restaurants may have low celiac friendly scores, but they have the 5th highest average number of celiac friendliness reviews. As such, the most popular types of restaurants are not necessarily the highest rated, and vice versa. Interestingly, “Grocery Store” is easily the most popular establishment type on the website, with over 15 reviews written on average per store.

This review volume is unsurprising, as a grocery store sees an immense amount of foot traffic relative to a restaurant, leading to a larger pool of people who can write a review. Conversely, “Fine Dining” restaurants near the bottom of this list have minimal foot traffic, as many are intimate restaurants with few tables, multiple courses, and relaxed service – all of which can somewhat explain its small number of average reviews.
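The per-tag sample size used in this section is an average of review counts across restaurants that carry the tag. A minimal sketch, with a hypothetical function name, input shape, and toy rows:

```python
from collections import defaultdict


def average_votes_by_tag(restaurants):
    """restaurants: iterable of (tags, n_votes).

    Returns {tag: mean number of votes per restaurant carrying that tag}.
    """
    totals = defaultdict(lambda: [0, 0])  # [restaurant count, vote count]
    for tags, n_votes in restaurants:
        for tag in set(tags):
            totals[tag][0] += 1
            totals[tag][1] += n_votes
    return {tag: votes / count for tag, (count, votes) in totals.items()}


# Hypothetical toy rows.
toy_votes = [(["Tag A"], 2), (["Tag A"], 4), (["Tag B"], 15)]
avg_votes = average_votes_by_tag(toy_votes)
print(avg_votes)
```

Reading this average alongside the celiac-friendly percentage helps flag tags whose high scores rest on only a handful of votes.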

What Foods are Popular at Gluten Free Restaurants?

For my final visualization, I made a word/phrase map using all of the 86 unique food type tags (called ‘GF Features’) available on the website. These tags are less popular than restaurant tags, with only 2.56 average food tags per restaurant. Sizing the food tags by frequency of usage across all 4074 scraped restaurants helps one visualize which foods are the most popular at gluten free restaurants. The large volume of gluten free bakeries is shown with the large “Bread/Buns” and “Dessert” tags front and center, followed by three of the USA’s favorite foods: “Burgers”, “Pizza”, and “Pasta”.

Foods that are very difficult to make gluten free are appropriately rare, with “Cinnamon Rolls”, “Ravioli”, and “Croissants” all shown with tiny fonts. On the beverage side, "Cider" is almost always gluten free and was expected to be popular, but it was unexpected to see "Beer" so prominently displayed. Other food types are more surprising – how are there more gluten free places that advertise “Deep Dish Pizza” than “Mac & Cheese”?
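Sizing tags by frequency, as in the word map above, comes down to counting tags and scaling the counts to font sizes. The sketch below uses linear scaling, which is an assumption (word-cloud tools often use other scalings), and the function name, size limits, and toy tag lists are hypothetical.

```python
from collections import Counter


def tag_font_sizes(tag_lists, min_size=10, max_size=60):
    """Linearly scale tag frequencies to font sizes in [min_size, max_size]."""
    counts = Counter(tag for tags in tag_lists for tag in set(tags))
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts are equal
    return {
        tag: min_size + (max_size - min_size) * (n - lo) / span
        for tag, n in counts.items()
    }


# Hypothetical toy tag lists, one list per restaurant.
toy_tags = [["Pizza", "Pasta"], ["Pizza"], ["Pizza", "Ravioli"], ["Pasta"]]
sizes = tag_font_sizes(toy_tags)
print(sizes)
```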

Final Thoughts on This Project

This project allowed me to explore the gluten free / celiac scene in many different metro areas, see which types of restaurants are the most celiac friendly, and even learn which gluten free foods are common and rare to find in restaurants with gluten free offerings. I plan on sharing the project with my celiac and gluten sensitive family and friends so that they can get a bird’s eye view of the gluten free scene they live in – or the one they want to visit someday.

The skills I demoed here can be learned through taking Data Science with Machine Learning bootcamp with NYC Data Science Academy.


About Author

David Gottlieb

Data Scientist / Solutions Architect with a passion for problem solving and applied machine learning insights. Fluent in Python, R, and SQL.