NYC Data Science Academy| Blog

Becoming a Successful Airbnb Host in NYC

Emily In
Posted on Aug 24, 2023
1. Introduction

In 2022, New York City ranked among the world’s ‘most powerful’ tourism cities, according to the World Travel & Tourism Council (WTTC). In that year alone, NYC attracted around 56 million tourists, a figure that was expected to rise to 61 million in 2023. Since the Airbnb platform launched in 2008 and rebranded in 2014, it has become a household name for vacation and short-term rentals. There are now more Airbnb listings in NYC than there are rental listings, so it is not surprising that people are considering investing in Airbnb properties.

2. Research Questions

The purpose of this project is to identify what indicators Airbnb property investors should consider to maximize potential profits.

The exploratory data analysis seeks to answer the following questions:

  • How does someone become a successful Airbnb host in NYC?
  • What makes a listing popular?
  • For someone aspiring to be an Airbnb host, which areas in NYC would be the most profitable? 
  • What are the trends saying about the behavior of NYC visitors?
  • How can data be leveraged to improve the online booking experience?

3. The Data

The dataset I chose for this analysis was the Airbnb Open Data NYC Dataset found on Kaggle. As this was an already cleaned version, only minor additional cleaning was necessary: rounding numeric values down to two decimal places, correcting misspelled column names, and converting certain column values to boolean-style values such as “Yes” or “No.” The dataset covers Airbnb listing activity in NYC for properties with construction years ranging from 2003 to 2022.
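As a rough illustration of the cleaning steps described above, here is a minimal pandas sketch. The sample rows, the column names (including the misspelled one), and the `clean_listings` helper are all assumptions made for the example, not the actual dataset schema or the code used in this project.

```python
import numpy as np
import pandas as pd

# Hypothetical raw rows; column names and values are illustrative only.
raw = pd.DataFrame({
    "neighbourhood group": ["Brooklyn", "Manhattan", "Queens"],
    "availibility 365": [365, 120, 30],
    "reviews per month": [0.2149, 1.3781, 2.0],
    "instant_bookable": [True, False, True],
})

def clean_listings(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the three light cleaning steps to a copy of the frame."""
    out = df.copy()
    # Correct a misspelled column name, then normalize names to snake_case.
    out = out.rename(columns={"availibility 365": "availability 365"})
    out.columns = [c.strip().lower().replace(" ", "_") for c in out.columns]
    # Round fractional values down to two decimal places.
    out["reviews_per_month"] = np.floor(out["reviews_per_month"] * 100) / 100
    # Express the boolean flag as "Yes"/"No" labels.
    out["instant_bookable"] = out["instant_bookable"].map({True: "Yes", False: "No"})
    return out

cleaned = clean_listings(raw)
print(cleaned)
```

Working on a copy keeps the raw frame available for comparison if a cleaning step needs to be revisited.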

4. Exploratory Data Analysis

To start, a heat map was plotted to show which room types were most prominent in each borough. Brooklyn and Manhattan accounted for the majority of both overall Airbnb listings and entire home/apartment listings.

This is most likely due to higher demand: Manhattan is the center of NYC tourism, with Brooklyn coming in second. Identifying where each room type is most concentrated can help future hosts determine where there are opportunities for growth and which areas may be too saturated to compete in.
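The counts behind a borough-by-room-type heat map can be built with a cross-tabulation. The sketch below uses a handful of made-up rows and assumed column names (`borough`, `room_type`); it shows the shape of the table, not the real counts.

```python
import pandas as pd

# Hypothetical sample of listings; values are illustrative only.
listings = pd.DataFrame({
    "borough": ["Brooklyn", "Brooklyn", "Manhattan", "Manhattan", "Manhattan", "Queens"],
    "room_type": ["Entire home/apt", "Private room", "Entire home/apt",
                  "Entire home/apt", "Private room", "Private room"],
})

# Rows are boroughs, columns are room types, cells are listing counts.
counts = pd.crosstab(listings["borough"], listings["room_type"])
print(counts)
```

A table in this shape can be handed directly to a heat map plotting function, with darker cells marking the borough and room type combinations that are most crowded.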

Most in Demand Room Types Per Borough

Using the number of reviews as the metric, we determined which rooms were most popular. On the Airbnb platform, only travelers who have completed a stay can leave a review. Because each review on Airbnb corresponds to a completed booking, the review count can serve as a measure of popularity.

To see where current demand was, the data was filtered on the ‘last_reviewed’ column to keep only the listings last reviewed in 2022. This separate data frame was then used to create a bar chart showing which borough and room type combinations accumulated the highest number of reviews. The results showed that entire homes/apartments in Brooklyn received significantly more reviews in 2022 than other room/borough combinations. Results may be slightly skewed toward Brooklyn and Manhattan entire home/apartment listings because they represent the majority of overall Airbnb listings in New York City. Brooklyn has more open space and a less hectic environment than the busy streets of Manhattan, so it’s possible that travelers prefer to stay in more private and less busy areas.
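The filter-then-aggregate step described above might look like the following sketch. Only the ‘last_reviewed’ column name comes from the text; the other column names and all of the sample values are assumptions.

```python
import pandas as pd

# Hypothetical sample; only the 'last_reviewed' name is taken from the post.
listings = pd.DataFrame({
    "borough": ["Brooklyn", "Brooklyn", "Manhattan", "Queens"],
    "room_type": ["Entire home/apt", "Entire home/apt", "Private room", "Private room"],
    "number_of_reviews": [120, 80, 95, 40],
    "last_reviewed": ["2022-01-15", "2022-03-02", "2019-06-30", "2022-02-11"],
})

# Parse the dates so the year can be extracted reliably.
listings["last_reviewed"] = pd.to_datetime(listings["last_reviewed"])

# Keep only listings whose most recent review was in 2022.
recent = listings[listings["last_reviewed"].dt.year == 2022]

# Total reviews per borough and room type, largest first.
reviews_2022 = (
    recent.groupby(["borough", "room_type"])["number_of_reviews"]
    .sum()
    .sort_values(ascending=False)
)
print(reviews_2022)
```

The sorted series can be plotted as a bar chart as is. Note that the review count is cumulative per listing, so this measures the total reviews of listings still active in 2022 rather than reviews written in 2022.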

Highly Saturated Neighborhoods in NYC

The data reveals the top 10 neighborhoods with the largest number of listings. This bar graph shows potential hosts which areas they might want to avoid investing in. The more saturated a neighborhood already is with listings, the harder it will be to remain competitive and earn a strong ROI on the property.
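Ranking neighborhoods by listing count is a one-line frequency count in pandas. The sketch below uses a few made-up rows and takes the top 2 instead of the top 10 to keep the sample small; the neighborhood values are placeholders, not results.

```python
import pandas as pd

# Hypothetical neighborhood column; one row per listing.
neighborhoods = pd.Series(
    ["Bedford-Stuyvesant", "Williamsburg", "Bedford-Stuyvesant", "Harlem",
     "Williamsburg", "Bedford-Stuyvesant", "Bushwick"],
    name="neighborhood",
)

# value_counts sorts by frequency, so head(n) gives the n most saturated areas.
top = neighborhoods.value_counts().head(2)
print(top)
```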

Analyzing the Distribution of High-Rated Rooms

The data was then filtered to only the listings rated 4.0 or 5.0, labeled good and great respectively. The distribution of high ratings per room type also depended on the total number of listings for each room type: the more listings a room type has, the more high-rated listings it is likely to contribute. Of a total of 69,305 listings, there were 37,212 entire home/apartment listings and 30,508 private rooms. Together, they made up nearly 98% of all listings, which explains why these two room types account for the majority of the high-rated listings.
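The combined share of the two dominant room types follows directly from the counts quoted above. The variable names are my own; the numbers are the ones stated in the text.

```python
# Listing counts as quoted in the text.
total_listings = 69_305
entire_home = 37_212
private_room = 30_508

# Share of all listings that are entire homes/apartments or private rooms.
combined_share = (entire_home + private_room) / total_listings
print(f"{combined_share:.1%}")  # prints 97.7%
```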

Almost the same distribution patterns were seen when highly rated rooms were grouped by borough and room type. The data was consistent with where and which types of listings were most saturated: Manhattan and Brooklyn, and entire homes/apartments and private rooms.

Year-Round Available Listings (365 days)

After filtering the data to high-rated listings that were available 365 days a year, a pie chart was used to visualize the top 10 results. This was almost 100% consistent with the top 10 neighborhoods with the largest number of listings.

Identifying Common Description Words & Phrases in the Top 100 Most Reviewed (Review Count)

To determine whether specific keywords and phrases were commonly associated with the most booked listings, a word count was run on the descriptions of the top 100 most reviewed listings. Any word that appeared more than two times among those listings was pulled and organized into a word cloud. Filler words and symbols, such as &, of, -, +, !, and the, were excluded from the word count.
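A word count of this kind can be sketched with the standard library alone. The sample descriptions, the stop-word list, and the `frequent_words` helper below are assumptions for illustration; the post does not say which tools or filler-word list were actually used.

```python
import re
from collections import Counter

# Hypothetical filler words to exclude; the real list is not given in the post.
STOPWORDS = {"the", "of", "in", "a", "to", "and", "with"}

# Hypothetical listing descriptions standing in for the top 100 most reviewed.
descriptions = [
    "Private & quiet room in Brooklyn!",
    "Quiet studio near the subway - Brooklyn",
    "Private room, quiet block, close to subway",
    "Sunny loft in Manhattan + private terrace",
]

def frequent_words(texts, min_count=3):
    """Return words appearing at least min_count times (more than two by default)."""
    counts = Counter()
    for text in texts:
        # Keep alphabetic tokens only, which drops symbols such as &, -, + and !.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return {word: n for word, n in counts.items() if n >= min_count}

frequent = frequent_words(descriptions)
print(frequent)
```

The resulting word-to-frequency mapping is the usual input for a word cloud tool, where a higher count produces a larger word.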

We found that location-specific keywords were significant. Words about proximity to nearby airports and the subway were common, and the words “Private” and “Quiet” suggested that travelers prefer to stay in quieter areas with more privacy. “No cleaning fee” also appeared to be important, suggesting that cleaning fees are not preferred by travelers. Consistent with the demand, Brooklyn and Manhattan were also among the most frequently appearing words in the top 100 most popular listings.

This analysis can help Airbnb owners pick the right words and phrases when describing their listings to improve their visibility in search results. Greater visibility should increase the number of bookings for their properties.

A Look at Seasonality Over Time

To measure seasonality accurately, this dataset would have needed the actual booking dates. To approximate seasonal demand over time as closely as possible, a scatterplot was set up to visualize the number of reviews (y-axis) against the date the listing was last reviewed (x-axis). The last-reviewed date values were reformatted into month and year only for ease of visualization.
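One way to reduce the last-reviewed dates to month and year, and then total the reviews per month, is sketched below. The sample rows and the `number_of_reviews` and `review_month` column names are assumptions; summing per month is my own simplification of the scatterplot described above.

```python
import pandas as pd

# Hypothetical sample; values are illustrative only.
listings = pd.DataFrame({
    "last_reviewed": ["2022-01-15", "2022-01-28", "2022-03-02", "2019-06-30"],
    "number_of_reviews": [120, 30, 80, 95],
})

# Reduce each date to a "YYYY-MM" label; these strings sort chronologically.
listings["last_reviewed"] = pd.to_datetime(listings["last_reviewed"])
listings["review_month"] = listings["last_reviewed"].dt.strftime("%Y-%m")

# Total review count per month, in time order.
monthly = listings.groupby("review_month")["number_of_reviews"].sum().sort_index()
print(monthly)
```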

January, February and March 2022 saw the highest spikes in the number of reviews, with June and July 2019 following closely behind. January to March is usually considered off-season for travel, so these months tend to be the cheapest time to travel. In addition, travel rebounded in 2022: people were most likely more comfortable traveling then than in 2020 and 2021, the peak of the pandemic.

Airbnb Stay Attributes 

It is also important to know whether any attributes of a listing affected its booking rate, price, or ratings. Listings with a one-night minimum stay saw the most bookings and reviews. Box plots were used to see whether the cancellation policy (flexible, moderate, or strict) and instant bookability had an effect on price, and it was found that they did not.
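The numbers a box plot draws (quartiles and median per group) can also be inspected as a table. In this sketch the column names and prices are made up, and the sample is deliberately constructed so that the groups look alike; it is not the project's data or result.

```python
import pandas as pd

# Hypothetical sample built so every policy has a similar price spread.
listings = pd.DataFrame({
    "cancellation_policy": ["flexible", "flexible", "moderate",
                            "moderate", "strict", "strict"],
    "price": [600, 650, 610, 640, 605, 645],
})

# Quartiles per policy: the same statistics a box plot displays.
summary = (
    listings.groupby("cancellation_policy")["price"]
    .describe()[["25%", "50%", "75%"]]
)
print(summary)
```

When the medians and quartile ranges overlap this closely across groups, the attribute is unlikely to explain differences in price.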

House rules also had no effect on a listing’s rating. 

Determining the Most Optimal Price for Each Area and Room Type

For a potential investor in a property to list on Airbnb, it’s important to understand the price distribution to get a sense of how a listing should be priced. The first step in determining the optimal price for a listing was to find the median price for each borough and room type. Compared to the mean, the median is a better measure of central tendency for skewed distributions because it is much more robust to outliers.
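A small example shows why the median is the safer summary for skewed prices. The prices below are invented, with one deliberately extreme Brooklyn listing, and the column names are assumptions.

```python
import pandas as pd

# Hypothetical prices; the 1100 value is an intentional outlier.
listings = pd.DataFrame({
    "borough": ["Brooklyn", "Brooklyn", "Brooklyn",
                "Manhattan", "Manhattan", "Manhattan"],
    "room_type": ["Entire home/apt"] * 3 + ["Private room"] * 3,
    "price": [100, 120, 1100, 90, 110, 130],
})

# Mean and median price for each borough and room type combination.
stats = listings.groupby(["borough", "room_type"])["price"].agg(["mean", "median"])
print(stats)
```

In this sample the single outlier pulls the Brooklyn mean up to 440 while the median stays at 120, much closer to what a typical listing charges. The Manhattan group has no outlier, so its mean and median agree at 110.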

Based on the results, someone looking to invest in a property to list on Airbnb should consider a price between $621 and $650 for entire apartments and private rooms if they want to be competitive in the market.

Current Demand of Still Active Listings

Lastly, it was important to see where the most recent demand lay, so the focus was put on specific NYC neighborhoods by filtering for the top 10 areas with the most bookings/reviews in 2022 alone. This was also used as a metric to see which listings were still active. Of the listings that are still active, the Bedford-Stuyvesant neighborhood in Brooklyn had substantially more bookings in 2022 than other neighborhoods. This aligns with the earlier finding that Brooklyn entire apartments are the most in demand.

5. Conclusion

Based on this exploratory analysis of the given dataset, location is a clear factor that travelers take into consideration when booking a room on Airbnb.

Proximity to specific places, such as subways, airports, and popular areas like Manhattan and Brooklyn, was also an important decision factor. The most common keywords associated with popular listings are meaningful data that could inform an algorithm within Airbnb that puts more weight on certain keywords based on city and neighborhood. When deciding how to describe a listing to reach the target traveler faster, these factors and keywords should be taken into account.

It is clear that Brooklyn and Manhattan are the most popular boroughs, and that entire apartments and private rooms are the most popular room types. However, Brooklyn and Manhattan are most likely already saturated with listings, making it that much more difficult to stay competitive. That means an investor should carefully consider which part of these popular boroughs to invest in.

6. Future Works

Quite a few areas require further exploration in order to obtain more accurate business insights. Data on which hosts were Superhosts and which were regular hosts would have been very helpful in determining the impact of host status on booking rate. In addition, data on exact booking dates, lengths of stay, and the days of the week on which bookings occurred could give us insight into which specific days and periods of the year see the most travel activity.

Finally, the text of each listing’s reviews could be used to perform a common word/phrase count, as we did with the listing descriptions. This would help identify whether a listing’s reviews are mostly positive or negative, so that we can distinguish between the number of reviews and the quality of the listing. A review on Airbnb may mean a booking, but it does not necessarily mean a positive review.

About Author

Emily In

After earning my Bachelor's degree in Sociology and working as an SEO Analyst, I discovered my interest in researching and deriving business insights and stories from data. A family matter required me to leave my job and work...