Analysis Of NYC Airbnb Listings
Research Questions for NYC Listings
In this Airbnb analysis we will try to answer these questions:
- Where is the best area for Airbnb listings in NYC?
- Where are the opportunities in the market that would allow for a new lister on Airbnb to optimize their revenue?
- Which listing features most influence annual revenue?
Analysis Background
Recently, Airbnb’s revenue from listings nearly doubled globally, from $3.88 billion in 2020 to $6.85 billion in 2021, with North America growing the most among regions. With difficult economic times for many families, listing a property on Airbnb may be an effective way to earn alternative revenue.
According to Mashvisor, as of March 2023 there are over 6.6 million active listings on the platform, managed by over 4.4 million hosts. Over 150 million people also use Airbnb as their primary means of finding rental units. Because of the influx of new listings, would-be listers often find it difficult to get started on Airbnb: they see a saturated market and are deterred from listing their property at all. This analysis aims to identify potential gaps in the NYC listings market to help listers optimize their revenue.
Airbnb Data
The data for this analysis was gathered from Inside Airbnb, specifically in New York City from the month of December. Although the dataset may not incorporate seasonality for the rest of the year, the dataset has columns with predicted availability over the next 365 days for each listing. To get a better understanding of the columns in this dataset, the data dictionary can be found at this link.
Initial Findings
For an initial look at how the features in the dataset relate to one another, we created a Pearson correlation heat map. This is seen in this figure.
From this we see that price has a weak positive correlation with the number of listings a host has: as a host's number of listings increases, the price of each listing also tends to increase. This may be because many of these listings are in areas where prices are naturally higher, such as Manhattan. There is also a weak positive correlation between the availability of a listing and its price. This is interesting because the availability_365 column is defined as the number of days a listing is available in the next 365 days, so as the number of available days increases, the price also increases. Although this seems counterintuitive, the data dictionary states that a listing may be unavailable either because the owner has blocked it from being booked or because it has already been booked. This suggests either that many of the listings in this dataset are blocked by the lister, or that there truly is a negative relationship between price and the number of nights booked. To explore further, we would need another dataset covering all of the listers in NYC and a deeper look at the types of listings there are.
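The correlation matrix behind a heat map like the one discussed here could be computed roughly as follows. This is only a sketch: the column names `price` and `calculated_host_listings_count` are assumed to follow the Inside Airbnb listings file, and the four rows are invented for illustration, not taken from the dataset.

```python
import pandas as pd


def pearson_matrix(listings: pd.DataFrame) -> pd.DataFrame:
    """Return pairwise Pearson correlations for the numeric columns only."""
    numeric = listings.select_dtypes(include="number")
    return numeric.corr(method="pearson")


# Toy rows, invented purely for illustration.
toy = pd.DataFrame({
    "price": [100.0, 150.0, 200.0, 250.0],
    "availability_365": [30, 60, 90, 120],
    "calculated_host_listings_count": [4, 3, 2, 1],
    "room_type": ["Private room", "Entire home/apt",
                  "Private room", "Entire home/apt"],
})

corr = pearson_matrix(toy)
print(corr.round(2))
```

Passing `corr` to a plotting call such as seaborn's `heatmap` would then draw the figure; that step is left out here to keep the sketch light on dependencies.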

The initial heat map of listing locations shows where listings in NYC are most heavily concentrated: there is a large concentration of listings in Manhattan and Brooklyn. For a new lister, this makes sense, because those areas have more tourist attractions, such as Central Park and the Empire State Building.
To compare the listings and how successful they were, we created a column in the data frame named "annual_revenue" which was calculated from the availability_365 column and the price column. This was then added to each listing to show their estimated annual revenue. The neighborhoods with the highest average annual revenue are as follows.
- Chelsea - $119443.27
- Brooklyn Heights - $105301.77
- Theater District - $102203.87
- SoHo - $96167.34
- Tribeca - $89216.63
- Financial District - $82792.28
- West Village - $79576.62
- Greenwich Village - $77590.37
- Midtown - $77515.15
- Boerum Hill - $77406.62
- NoHo - $72173.75
- Downtown Brooklyn - $69474.00
- Battery Park City - $67842.68
- Hell's Kitchen - $65484.57
- Kips Bay - $63182.99
- Columbia St - $62889.90
- Vinegar Hill - $62258.82
- East Village - $61701.07
- Nolita - $59398.92
- Cobble Hill - $57816.72
From this list, we can see that the average annual revenue for listings in those neighborhoods ranges from roughly $58,000 to $119,000 a year. To get a better understanding, we would need to look into the categorical descriptors.
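The "annual_revenue" column and the neighborhood ranking described above might be built along these lines. The exact formula is an assumption: this sketch treats every night that is not available as booked, so revenue is price times (365 minus availability_365). The `neighbourhood` column name is assumed from the Inside Airbnb file, and the neighborhood labels "A" and "B" are hypothetical.

```python
import pandas as pd


def add_annual_revenue(listings: pd.DataFrame) -> pd.DataFrame:
    """Add an estimated annual_revenue column.

    Assumption: every night a listing is not available counts as booked,
    so revenue = price * (365 - availability_365).
    """
    out = listings.copy()
    booked_nights = 365 - out["availability_365"]
    out["annual_revenue"] = out["price"] * booked_nights
    return out


def top_neighbourhoods(listings: pd.DataFrame, n: int = 20) -> pd.Series:
    """Mean estimated annual revenue per neighbourhood, highest first."""
    return (
        listings.groupby("neighbourhood")["annual_revenue"]
        .mean()
        .sort_values(ascending=False)
        .head(n)
    )


# Toy rows with hypothetical neighbourhood labels.
toy = pd.DataFrame({
    "neighbourhood": ["A", "A", "B"],
    "price": [100.0, 200.0, 50.0],
    "availability_365": [365, 165, 265],
})

ranked = top_neighbourhoods(add_annual_revenue(toy), n=2)
print(ranked)
```

Because blocked nights are counted as booked under this assumption, the estimate is an upper bound rather than a measured revenue.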
From the right figure above, it seemed as if the borough had a correlation with the annual revenue. Further exploration shows that Manhattan has the highest median price, while the Bronx has the lowest.
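A borough-level comparison of median prices could be sketched as below. The `neighbourhood_group` column name is assumed from the Inside Airbnb file, and the prices are invented; they are chosen only to show that the median is not dragged upward by a single very expensive listing.

```python
import pandas as pd


def median_price_by_borough(listings: pd.DataFrame) -> pd.Series:
    """Median nightly price per borough, highest first."""
    return (
        listings.groupby("neighbourhood_group")["price"]
        .median()
        .sort_values(ascending=False)
    )


# Toy rows, invented for illustration; note the single $1000 outlier.
toy = pd.DataFrame({
    "neighbourhood_group": ["Manhattan", "Manhattan", "Manhattan",
                            "Bronx", "Bronx"],
    "price": [300.0, 200.0, 1000.0, 80.0, 60.0],
})

medians = median_price_by_borough(toy)
print(medians)
```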
The next step in the analysis is to break the listings up into price bins, which allows a closer analysis of each borough: we can now look for the price range in which a lister should optimally list their property. Once this is done for all of the boroughs, some gaps in the market may appear.
Below are line charts for each borough. For each price bin they show the number of listings in the borough, the average number of days a listing is not booked, and the estimated annual revenue, calculated as the average number of nights booked multiplied by the nightly rate. A spike in a chart may show where a price bin is popular in the borough and may reveal an optimal price range for a listing there.
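The three per-bin quantities plotted in these charts could be computed with a grouping step like the one below. This is a sketch under assumptions: a fixed bin width of $10 (suggested by bins such as $980-$990), the `neighbourhood_group` column name from Inside Airbnb, and the same revenue estimate of price times nights not available. The toy rows are invented.

```python
import pandas as pd


def summarize_price_bins(listings: pd.DataFrame, borough: str,
                         bin_width: int = 10) -> pd.DataFrame:
    """Per price bin: listing count, mean availability, mean est. revenue."""
    subset = listings[listings["neighbourhood_group"] == borough].copy()
    # Lower edge of the bin, e.g. a $987 listing falls in the 980 bin.
    subset["price_bin"] = (subset["price"] // bin_width) * bin_width
    # Assumption: nights that are not available are treated as booked.
    subset["annual_revenue"] = subset["price"] * (365 - subset["availability_365"])
    return subset.groupby("price_bin").agg(
        listings=("price", "count"),
        mean_availability=("availability_365", "mean"),
        mean_annual_revenue=("annual_revenue", "mean"),
    )


# Toy rows, invented for illustration.
toy = pd.DataFrame({
    "neighbourhood_group": ["Queens", "Queens", "Queens", "Brooklyn"],
    "price": [95.0, 99.0, 120.0, 95.0],
    "availability_365": [165, 265, 365, 0],
})

bins = summarize_price_bins(toy, "Queens")
print(bins)
```

Each column of the result corresponds to one of the three line charts, with the price bin on the horizontal axis.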
Brooklyn Analysis



From these charts we can see that the majority of the listings from this dataset are from Brooklyn in the lower price bins. However, we can conclude from the graph on the right that even though the availability is higher, the price . When looking at the availability of that price bin in the second chart, it seems that
Manhattan Analysis



In this analysis it can be seen that the $980-$990 price bin again generates more annual revenue than its neighboring price bins. Manhattan also has more listings around that price range, meaning that there is already a market there. This may be because people searching for listings have a budget of around $1000 a night and are specifically looking to get the most within that price range. It also suggests that a lister could lower or raise their price into that specific price bin to earn higher annual revenue.
Queens Analysis



From this, it can be seen that there is a large increase in estimated annual revenue at around the $820-$830 price bin. Further exploration shows that there is also more demand for listings in this price bin: the middle figure shows a steep drop in the number of days a listing is available in that price range. The neighboring price bins have a very similar number of listings to the $820-$830 price bin, but far less estimated annual revenue. This shows that there may be a market for listings in that price range. The same pattern can be seen in the $980-$990 price bin.
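Spotting a bin that out-earns its neighbors, as done by eye in the charts, could also be automated. The sketch below flags price bins whose mean estimated revenue is higher than both adjacent bins. It is a simple illustration rather than the method used in this analysis, and the revenue values are invented.

```python
import pandas as pd


def local_peaks(revenue_by_bin: pd.Series) -> pd.Series:
    """Return the bins whose value exceeds both neighboring bins.

    "Neighboring" means the previous and next bin present in the index,
    so empty price bins are skipped over. The first and last bins are
    never flagged because they have only one neighbor.
    """
    ordered = revenue_by_bin.sort_index()
    higher_than_prev = ordered > ordered.shift(1)
    higher_than_next = ordered > ordered.shift(-1)
    return ordered[higher_than_prev & higher_than_next]


# Toy mean revenue per price bin, invented for illustration.
toy_revenue = pd.Series(
    [10000, 11000, 30000, 12000, 13000],
    index=pd.Index([800, 810, 820, 830, 840], name="price_bin"),
)

peaks = local_peaks(toy_revenue)
print(peaks)
```

A flagged bin is only a candidate: as in the text, it is worth checking its availability and listing count before reading it as real demand.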
The Bronx Analysis



There are only about 250-300 listings in the Bronx. The third chart shows an increase in annual revenue at around the $250 price bin. When explored further
Staten Island Analysis



From these charts we can see that there are not many listings in Staten Island in our dataset. This means either that there truly are not many listings in Staten Island or that the data frame we used does not include all of them. The estimated annual revenue spike near the right end of the graph can be attributed to the fact that there are fewer than 10 listings in that price range in Staten Island; it is considered noise caused by a low sample size. To get a more in-depth look, we would need a Staten Island-specific dataset, which may provide the insights this dataset is missing.
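One way to keep low-sample noise like this out of the charts is to drop sparse bins before plotting. The sketch below assumes a per-bin summary table with a `listings` count column (a hypothetical layout) and uses a cutoff of 10 listings, borrowed from the observation above; the right threshold is a judgment call. The values are invented.

```python
import pandas as pd


def drop_sparse_bins(bin_summary: pd.DataFrame,
                     min_listings: int = 10) -> pd.DataFrame:
    """Keep only price bins that contain at least min_listings listings."""
    return bin_summary[bin_summary["listings"] >= min_listings]


# Toy per-bin summary, invented for illustration.
toy_summary = pd.DataFrame(
    {
        "listings": [42, 15, 3],
        "mean_annual_revenue": [20000.0, 25000.0, 90000.0],
    },
    index=pd.Index([100, 200, 900], name="price_bin"),
)

reliable = drop_sparse_bins(toy_summary)
print(reliable)
```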
Analyzing the Type of Room
From the earlier analysis we can see that a majority of the listings are in Manhattan and Brooklyn. When looking closer into which types of room generate the most revenue, we find that each borough has a different optimal room type. Below are some analyses which find potential markets for new listers with a specific listing type.
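The room type with the highest average estimated revenue in each borough could be found with a sketch like this one. The column names `neighbourhood_group` and `room_type` and the room type labels are assumed from the Inside Airbnb file, the revenue estimate again treats unavailable nights as booked, and the rows are invented.

```python
import pandas as pd


def best_room_type_by_borough(listings: pd.DataFrame) -> pd.DataFrame:
    """For each borough, the room type with the highest mean est. revenue."""
    # Assumption: nights that are not available are treated as booked.
    revenue = listings["price"] * (365 - listings["availability_365"])
    mean_rev = (
        listings.assign(annual_revenue=revenue)
        .groupby(["neighbourhood_group", "room_type"])["annual_revenue"]
        .mean()
        .reset_index()
    )
    # Sort so the top room type comes first, then keep one row per borough.
    return (
        mean_rev.sort_values("annual_revenue", ascending=False)
        .drop_duplicates("neighbourhood_group")
        .set_index("neighbourhood_group")
    )


# Toy rows, invented for illustration.
toy = pd.DataFrame({
    "neighbourhood_group": ["Brooklyn", "Brooklyn", "Manhattan", "Manhattan"],
    "room_type": ["Entire home/apt", "Private room",
                  "Private room", "Entire home/apt"],
    "price": [200.0, 80.0, 150.0, 300.0],
    "availability_365": [165, 265, 65, 315],
})

best = best_room_type_by_borough(toy)
print(best)
```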
Brooklyn Home



From these graphs we can find potential markets as well as parts of the market which listers should avoid. The price bin at $990-$1000 generates more revenue than its neighboring price bins while also having lower availability, and there is an increase in listings at that price point. The price bin at $860-$870, by contrast, should be avoided: the graphs show less estimated revenue and increased availability with roughly the same number of listings, which means that listings at that price point are not as popular as those in other price bins.
Manhattan Private



When looking at private room listings in Manhattan, it can be seen again that the $990-$1000 price point is very popular. Again, there is a drastic increase in estimated annual revenue, a decrease in availability, and an increase in the number of listings. This means that the price bin is very popular among people trying to find a listing. A lister who owns a private room in Manhattan should consider pricing it at around $990-$1000. Listers should avoid the $400-$420 price range: although it has more listings, its estimated annual revenue is lower and its availability is higher than those of its neighboring price bins.
Limitations to the Analysis
Even though the analysis shows the price points at which listings are popular, it does not explain why they are popular. To get a better picture, more details are needed, such as the amenities provided, the neighboring businesses, and the cost of living in the area. For example, it is much more expensive to live in certain parts of Tribeca than it is to live in certain parts of Chinatown.