Restaurant Inspections

Posted on May 18, 2014


Contributed by John Downs, Dean and Python instructor.




Recently, my first group of students for Data Analysis with Python presented their projects. I agreed to present along with them, with a project of my own. In a sense, we are all students, and presenting together is a great excuse to try new tools and technologies.

I chose something relatively simple. I was interested in restaurant health inspection data, and specifically in the ability to look up the grade and violations of a place I was considering for my next meal. I have seen several applications that analyze the data, but never one that answers the question "Should I eat here?".

While this project isn't very heavy on data analysis, it did rely on some tools we used in class. It provides an excellent example of how data can be presented in a useful way once the analysis is complete.

Python 2 is officially a legacy product, so I chose Python 3.4. I am more familiar with Python 2, so I expected some challenges, but I found the upgrade remarkably simple. I deployed the application to Heroku, which was also a seamless process. I selected MongoDB as my data store, hosting the database on MongoLab.

Using MongoDB was motivated by a desire to experiment with the technology. In retrospect, I found it awkward to construct queries. A relational database such as Postgres would have been a better choice.
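For comparison, here is a minimal sketch of how a "most recent inspection" lookup might read in SQL. It uses Python's built-in sqlite3 module as a stand-in for Postgres, and the table name, column names, and sample rows are all hypothetical, not the application's actual schema or data.

```python
import sqlite3

# Hypothetical schema: one row per inspection. Table and column names are
# assumptions for illustration, not the application's actual data model.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inspections ("
    " phone TEXT, name TEXT, inspection_date TEXT, grade TEXT)"
)
conn.executemany(
    "INSERT INTO inspections VALUES (?, ?, ?, ?)",
    [
        ("7185550101", "Example Diner", "2013-06-01", "B"),
        ("7185550101", "Example Diner", "2014-02-15", "A"),
        ("7185550199", "Sample Cafe", "2013-11-20", "A"),
    ],
)


def latest_inspection(conn, phone):
    """Return (name, inspection_date, grade) for the newest inspection, or None."""
    # ISO-formatted dates (YYYY-MM-DD) sort correctly as plain text.
    return conn.execute(
        "SELECT name, inspection_date, grade FROM inspections"
        " WHERE phone = ? ORDER BY inspection_date DESC LIMIT 1",
        (phone,),
    ).fetchone()


print(latest_inspection(conn, "7185550101"))
```

The ordering and the limit are both stated in one declarative query, which is the kind of thing I would expect to be easier to write and read in a relational database.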

The application itself uses a simple Model/View/Controller pattern with Flask as the web framework. There are three features I implemented.

The first is a phone number lookup that shows the results of the most recent inspection for a particular restaurant. I found that the phone number is a unique key that also corresponds with Yelp's API for business lookup.
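Using a phone number as a key only works if every source writes it the same way. The sketch below shows one way the input might be normalized before a lookup. It assumes 10-digit US numbers, and normalize_phone is a hypothetical helper, not a function from the application.

```python
import re


def normalize_phone(raw):
    """Reduce a US phone number to its 10 digits, or return None if malformed."""
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    return digits if len(digits) == 10 else None


print(normalize_phone("(718) 555-0101"))
print(normalize_phone("555-0101"))
```

Returning None for anything that is not 10 digits lets the caller show a "not found" page instead of running a query that cannot match.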

The second feature is a ZIP code lookup, where the ZIP code serves as a proxy for neighborhood. It provides the inspection date and most recent grade for every establishment in the area.
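The core of this lookup is keeping only the newest inspection per establishment. Here is a minimal sketch in plain Python, assuming each inspection is a dictionary with hypothetical field names and ISO-formatted dates. The sample rows are made up.

```python
def latest_by_zipcode(inspections, zipcode):
    """Return sorted (name, date, grade) tuples, one per establishment in the ZIP code."""
    latest = {}
    for row in inspections:
        if row["zipcode"] != zipcode:
            continue
        key = row["phone"]  # phone number identifies the establishment
        # ISO dates (YYYY-MM-DD) compare correctly as strings.
        if key not in latest or row["date"] > latest[key]["date"]:
            latest[key] = row
    return sorted((r["name"], r["date"], r["grade"]) for r in latest.values())


# Made-up sample data for illustration only.
rows = [
    {"phone": "7185550101", "name": "Example Diner", "zipcode": "11377",
     "date": "2013-06-01", "grade": "B"},
    {"phone": "7185550101", "name": "Example Diner", "zipcode": "11377",
     "date": "2014-02-15", "grade": "A"},
    {"phone": "7185550199", "name": "Sample Cafe", "zipcode": "11377",
     "date": "2013-11-20", "grade": "A"},
    {"phone": "2125550123", "name": "Other Place", "zipcode": "10001",
     "date": "2014-01-05", "grade": "C"},
]

for name, date, grade in latest_by_zipcode(rows, "11377"):
    print(name, date, grade)
```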

The third feature is a summary of cuisine types in a borough. It lets you view the count of each grade and the aggregate count of each violation during the most recent round of inspections.
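Both counts can be produced in a single pass over the most recent inspections. The sketch below uses collections.Counter. The field names, the violation codes, and the sample rows are hypothetical and only illustrate the shape of the aggregation.

```python
from collections import Counter


def summarize(latest_inspections, borough, cuisine):
    """Count grades and violations for one cuisine type in one borough."""
    grades = Counter()
    violations = Counter()
    for row in latest_inspections:
        if row["borough"] == borough and row["cuisine"] == cuisine:
            grades[row["grade"]] += 1
            violations.update(row["violations"])  # one count per listed code
    return grades, violations


# Made-up sample data; "V1" and "V2" are placeholder violation codes.
sample = [
    {"borough": "Queens", "cuisine": "Thai", "grade": "A", "violations": ["V1"]},
    {"borough": "Queens", "cuisine": "Thai", "grade": "A", "violations": ["V1", "V2"]},
    {"borough": "Queens", "cuisine": "Thai", "grade": "B", "violations": ["V2"]},
    {"borough": "Queens", "cuisine": "Pizza", "grade": "C", "violations": ["V1"]},
]

grade_counts, violation_counts = summarize(sample, "Queens", "Thai")
print(dict(grade_counts))
print(dict(violation_counts))
```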


There are many features that could be added to make this more useful. Allowing lookup by name and integrating with review sources are two straightforward additions that would really improve the usability of the application.

The application is available at:

The source can be found on github:

The data is hosted at:

For example, when I entered the ZIP code 11377 in my application, I got the following result back:
[Screenshot: lookup results for ZIP code 11377]


