NYC Real Estate Market Data Analysis and Visualization
The skills the author demonstrated here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
As a former real estate consultant, I've found that existing real estate data providers often don't break down property classes by neighborhood in New York City. Doing so requires a large amount of data and a tool that can handle it efficiently; Excel often falls short as the data grows. This project takes on that task by visualizing and charting sale transactions in New York City, categorized by property class and neighborhood, using R and Shiny.
Data
I used the NYC Rolling Sales data for the analysis. I chose this dataset for the following reasons:
- 50,000+ rows in each file, which is hard to navigate in Excel
- Updated every month, with records going back to 2003
- Rich and factual information but with messy format and noise
- Provides information that industry reports don't provide
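The project itself was built with R and Shiny; purely as an illustration of the cleaning and grouping step described above, here is a minimal Python sketch using only the standard library. The column names (`NEIGHBORHOOD`, `BUILDING CLASS CATEGORY`, `SALE PRICE`, `SALE DATE`) and the sample rows are assumptions for the example and should be checked against the real file header. The helper names `parse_price` and `summarize_sales` are hypothetical, and dropping zero-dollar records is an assumed cleaning rule, not necessarily the one the author used.

```python
import csv
import io
from collections import defaultdict
from statistics import median

# Hypothetical sample rows; column names are assumed, not taken from the post.
SAMPLE_CSV = """NEIGHBORHOOD,BUILDING CLASS CATEGORY,SALE PRICE,SALE DATE
ASTORIA ,01 ONE FAMILY DWELLINGS,"$850,000",2016-03-14
ASTORIA,01 ONE FAMILY DWELLINGS,"$910,000",2016-07-02
ASTORIA,07 RENTALS - WALKUP APARTMENTS,"$2,400,000",2016-05-20
RIVERDALE,01 ONE FAMILY DWELLINGS,$0,2016-01-11
RIVERDALE,01 ONE FAMILY DWELLINGS," - ",2016-02-01
RIVERDALE,01 ONE FAMILY DWELLINGS,"$1,200,000",2016-09-30
"""


def parse_price(raw):
    """Return the sale price as an int, or None if it is not a plain number."""
    digits = raw.replace("$", "").replace(",", "").strip()
    return int(digits) if digits.isdigit() else None


def summarize_sales(csv_text, min_price=1):
    """Group usable sales by (neighborhood, property class)."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        price = parse_price(row["SALE PRICE"])
        # Assumed rule: skip unparseable prices and sales below min_price.
        if price is None or price < min_price:
            continue
        key = (row["NEIGHBORHOOD"].strip(),
               row["BUILDING CLASS CATEGORY"].strip())
        groups[key].append(price)
    return {
        key: {"sales": len(prices), "median_price": median(prices)}
        for key, prices in groups.items()
    }


summary = summarize_sales(SAMPLE_CSV)
for (neighborhood, category), stats in sorted(summary.items()):
    print(neighborhood, "|", category, "|",
          stats["sales"], "|", stats["median_price"])
```

The median is used here rather than the mean because a handful of very large transactions can otherwise dominate a neighborhood's figure; that choice is a suggestion, not something the original analysis specifies.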
The more important question is how to create business value from the dataset, and how this project differs from the tools that other data websites provide.
The following is a simple comparison between several widely used data providers in the industry and this project, to show where the business value lies.
- Only focus on residential
- No past transactions on a property level
- No trend graphs
This project
- Focuses on every major property class
- With past transactions on the property level
- With trend analysis
Real Capital Data Analytics
- Records go back only a limited number of years, and there is no trend analysis
- No unit-level data
This project
- Can be extended several years back
- Provides unit-level data
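To make the "trend analysis" item in the comparison concrete, the sketch below computes a yearly median sale price and its percentage change from the previous year that has data. It is written in Python for illustration only (the project used R and Shiny). The `SALES` records and the function name `yearly_median_trend` are hypothetical, and the input is assumed to be already cleaned.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical cleaned records for one neighborhood and property class:
# (sale date, sale price in dollars).
SALES = [
    (date(2014, 3, 1), 500_000),
    (date(2014, 9, 15), 540_000),
    (date(2015, 4, 2), 560_000),
    (date(2015, 11, 20), 600_000),
    (date(2016, 6, 8), 650_000),
]


def yearly_median_trend(sales):
    """Return [(year, median price, % change vs. the previous year with data)]."""
    by_year = defaultdict(list)
    for sale_date, price in sales:
        by_year[sale_date.year].append(price)

    trend = []
    previous = None
    for year in sorted(by_year):
        current = median(by_year[year])
        # The first year has no earlier value to compare against.
        change = None if previous is None else round(
            (current - previous) / previous * 100, 1)
        trend.append((year, current, change))
        previous = current
    return trend


trend = yearly_median_trend(SALES)
for year, price, change in trend:
    print(year, price, change)
```

A table like this, one per neighborhood and property class, is the kind of series a trend chart could be drawn from.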
When people think of New York, they think of Manhattan, Queens and Brooklyn, and those are the boroughs that brokerage firms mainly cover in their industry research reports. The Bronx and Staten Island real estate markets, however, are not covered, even though both boroughs' markets have performed really well in the past year.
Future Work
- Combine more historical data dating back to 2003
- More cleaning and more real estate metrics
- Predict neighborhoods that have investment potential in the future