Best Bars in New York City

Posted on Oct 22, 2018

| Motivation |

Many people go to bars to connect, relax, have fun, and meet new people. Others go to break up a monotonous routine, stay in touch with friends, be seen, be heard, listen to music, watch games, and so on. Whatever the reason for going, bars provide a social lubricant that helps people relax. My sole intention for this project was to answer my friend’s question, “Which is the best bar in New York City?”, which I was unable to answer quantitatively when he asked me before his visit here. Prior to this project, I had no quantitative information about bars beyond what Yelp search results or similar applications show. With this project, I intended to update my understanding of bars around New York City with my own quantitative measurements.

| Questions expected to be answered |

Which neighborhood in New York City has the most active nightlife? Which are the best bars in New York City? Which days of the week are the best and worst for going to bars? What percentage of bars are wheelchair accessible? What percentage of bars have happy hours, a bar TV, their own parking, etc.?

| Methods and tools |

To collect data about bars in New York City, I scraped Yelp using Scrapy, a web-scraping framework written in Python. All data cleaning, analysis, and visualization were performed with Pandas and NumPy. All of my code, including the data, can be found on GitHub: link

| Neighborhoods with the most active nightlife |

Before diving into the best New York City bars, I wanted to find out which neighborhood had the most active nightlife. To do this, I created a bar plot of the number of bars in each neighborhood, using the bar count as a proxy for nightlife activity. By this measure, the five neighborhoods with the most active nightlife were Midtown West, Midtown East, East Village, Upper East Side, and West Village.
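The counting step behind that bar plot can be sketched in a few lines. This is only an illustration in plain Python, not the code from the project (which used Pandas); the sample records and the `neighborhood` field name are assumptions made for the example.

```python
from collections import Counter

# Hypothetical sample records; the scraped Yelp data is not reproduced here.
sample_bars = [
    {"name": "Bar A", "neighborhood": "Midtown West"},
    {"name": "Bar B", "neighborhood": "Midtown West"},
    {"name": "Bar C", "neighborhood": "Midtown West"},
    {"name": "Bar D", "neighborhood": "East Village"},
    {"name": "Bar E", "neighborhood": "East Village"},
    {"name": "Bar F", "neighborhood": "West Village"},
]


def top_neighborhoods(records, n=5):
    """Return the n neighborhoods with the most bars as (name, count) pairs."""
    counts = Counter(record["neighborhood"] for record in records)
    return counts.most_common(n)


print(top_neighborhoods(sample_bars, n=2))
```

The resulting (neighborhood, count) pairs are what a bar plot of bars per neighborhood would display.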


| Best Bars in New York City |

To find the best bars in New York City, I created a "popularity index", defined as the product of a bar's number of reviews and its rating as listed on Yelp. The five best bars in New York City by popularity index are shown below. The best bars also tended to fall in the less expensive price range.
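As a rough sketch of how the popularity index ranks bars, the example below multiplies review count by rating and sorts the result. The sample bars and the `review_count` and `rating` field names are hypothetical, and the project itself used Pandas rather than plain Python.

```python
# Hypothetical sample records; field names are assumptions for illustration.
sample_bars = [
    {"name": "Bar A", "review_count": 1200, "rating": 4.0},
    {"name": "Bar B", "review_count": 300, "rating": 4.5},
    {"name": "Bar C", "review_count": 2000, "rating": 3.5},
]


def popularity_index(review_count, rating):
    """Popularity index as defined in the text: number of reviews times rating."""
    return review_count * rating


def rank_bars(records, n=5):
    """Return the top n bars by popularity index, highest first."""
    scored = [
        dict(record, popularity=popularity_index(record["review_count"], record["rating"]))
        for record in records
    ]
    return sorted(scored, key=lambda record: record["popularity"], reverse=True)[:n]


for bar in rank_bars(sample_bars):
    print(bar["name"], bar["popularity"])
```

Note that the product rewards volume heavily: in this made-up sample, a 3.5-star bar with 2,000 reviews outranks a 4.5-star bar with 300 reviews.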

| Best and worst nights for going to bars |

The best and worst nights of the week to go to bars were calculated from both the popularity index and the best nights listed on each bar's page. For example, if a bar listed Friday as its best night, it was given a value of 1 for Friday and 0 for every other day of the week. The value for each day was then multiplied by the bar's popularity index, and the results were summed over all bars. Finally, the day with the highest total was designated the best night to go to bars, and the day with the lowest total the worst. The histogram of popularity index by night is shown below:
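The weighting described above can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions, not the project's Pandas code: the sample bars and the `best_nights` field name are hypothetical, and a bar is allowed to list more than one best night.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical sample records; field names are assumptions for illustration.
sample_bars = [
    {"name": "Bar A", "review_count": 1000, "rating": 4.0, "best_nights": ["Friday", "Saturday"]},
    {"name": "Bar B", "review_count": 200, "rating": 4.5, "best_nights": ["Thursday"]},
    {"name": "Bar C", "review_count": 500, "rating": 3.0, "best_nights": ["Friday"]},
]


def night_scores(records):
    """Sum each bar's popularity index over the nights it lists as best.

    Each listed night acts as an indicator of 1 and every other night as 0,
    so only listed nights receive the bar's popularity index.
    """
    scores = {day: 0.0 for day in DAYS}
    for record in records:
        popularity = record["review_count"] * record["rating"]
        for day in record["best_nights"]:
            scores[day] += popularity
    return scores


def best_and_worst(scores):
    """Return (best, worst) nights; ties resolve to the earliest day in DAYS."""
    return max(scores, key=scores.get), min(scores, key=scores.get)


scores = night_scores(sample_bars)
print(scores)
print(best_and_worst(scores))
```

With real data, several nights may share the lowest total, so the "worst" night can depend on how ties are broken.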

| Other useful information |

From the data I collected from Yelp, I calculated the percentage of bars offering various facilities. In New York City, 30 percent of bars are wheelchair accessible. Only 8.15 percent offer dancing, and 9.44 percent have their own parking garage. The percentages of bars that take reservations, offer happy hours, and have a bar TV are 51.23, 53.04, and 53.56, respectively.
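A percentage like the ones above is simply the share of bars for which a facility flag is set. The sketch below shows that calculation in plain Python; the sample bars and the facility field names are hypothetical and do not reproduce the scraped data or the figures reported here.

```python
# Hypothetical sample records; field names are assumptions for illustration.
sample_bars = [
    {"name": "Bar A", "wheelchair_accessible": True, "happy_hour": True},
    {"name": "Bar B", "wheelchair_accessible": False, "happy_hour": True},
    {"name": "Bar C", "wheelchair_accessible": False, "happy_hour": False},
    {"name": "Bar D", "wheelchair_accessible": False, "happy_hour": False},
]


def facility_percentages(records, facilities):
    """Return the percentage of bars offering each facility, rounded to 2 decimals.

    A facility that is missing from a record is treated as not offered.
    """
    total = len(records)
    if total == 0:
        return {facility: 0.0 for facility in facilities}
    return {
        facility: round(100 * sum(1 for record in records if record.get(facility)) / total, 2)
        for facility in facilities
    }


print(facility_percentages(sample_bars, ["wheelchair_accessible", "happy_hour", "parking"]))
```

Treating a missing attribute as "not offered" is a design choice; if Yelp simply omits an attribute for many bars, the true percentages could be higher than the computed ones.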

| Conclusions |

I hope this information about bars will help you choose the best bar for you in New York City. From a business point of view, this project points to areas for improvement, such as parking and dancing facilities, that could help bars succeed in New York City.

| Future directions |

It would be nice to collect additional information about the male-to-female ratio at each bar by scraping the individual reviewers of each bar. The male-to-female ratio might help people choose the right bar according to their interests. Identifying zones of popular drinking sites (perhaps using a heat map) might give the driving industry new areas to focus on as it expands its business in the future.

About Author

Basant Dhital

Basant Dhital is a Physics Ph.D. with an excellent background in Mathematics and Statistics and demonstrated programming skills. During his Ph.D. research, he developed several algorithms to process and analyze NMR and other spectroscopic data. He developed a...

Leave a Comment

Basant Dhital January 16, 2020
I, later on, edited on writing to address some statistical flaws but on the presentation slide maybe I forgot to do it. I don't remember about the six-star review. Where did you find six-star review? I don't remember.
Lexi De Veaux January 15, 2020
This is awesome, Basant ! In this writeup you define "popularity index" as the product of the number of reviews and the bar ratings listed in yelp. However, in your presentation from the Github link I noticed you defined the "popularity index" as number of reviews/6-star reviews. Could you expand a little on what exactly is a 6-star review? On yelp, reviews only go up to 5-stars. Thanks !
