Do Demographic Patterns Influence Which Neighborhoods Are Gentrified?

Posted on Aug 8, 2016


Back in the 1960s, Williamsburg, Brooklyn was a manufacturing hub that lacked glamour but promised work to thousands of lower-income residents. The neighborhood attracted large numbers of Latin American immigrants and the development of public housing projects. However, with the decline of manufacturing in the 1990s, many residents were left unemployed, adding to the social ills of the time: poverty, racism, poor health care, and inadequate education. With its walls and roads in decay, it was a far cry from today’s bustling hipster and artist haven.

However, an unfortunate side effect of the massive influx of educated millennials is soaring rent. Growing demand has started to displace poorer, underprivileged residents of the community, forcing many to move out to the suburbs, even farther from the city where they earn the wages that carry them from paycheck to paycheck. While this is the two-sided coin of urban development, we must ask: why did Williamsburg, with Myrtle Avenue nicknamed “Murder Avenue”, suddenly boom, as opposed to Flatbush, Canarsie, or Bedford-Stuyvesant? What made real estate developers start to see the area as prime for gentrification this past decade? I sought to answer these questions by looking at demographic data for New York communities from 2005 to 2014.


Source Data

The American Community Survey (ACS) provides 1-year, 3-year, and 5-year estimates between the decennial censuses. The level of geography I used is the PUMA (Public Use Microdata Area). PUMAs are identified by 5-digit codes and describe population zones of at least 100,000 people. They were first developed for use in the Census and the ACS, and a cooperative program between the Census Bureau and the states allows local input on their boundaries.


The source file contains the individual survey responses to be summarized:


I summarized them by counting the categorical values and taking the median of the numeric values for each PUMA to create the working set of data:

Data Formatting Code
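The original formatting code is not reproduced in this text. As a rough illustration of the summarization step described above (counting categorical values and taking a median per PUMA), here is a minimal sketch in plain Python. The field names (`puma`, `education`, `household_income`) and all of the values are hypothetical and are not taken from the ACS files or from the application's actual code.

```python
from collections import Counter, defaultdict
from statistics import median

# Hypothetical individual responses: field names and values are made up
# for illustration and do not come from the ACS source file.
responses = [
    {"puma": "04008", "education": "bachelors", "household_income": 52000},
    {"puma": "04008", "education": "high_school", "household_income": 31000},
    {"puma": "04008", "education": "bachelors", "household_income": 78000},
    {"puma": "03805", "education": "high_school", "household_income": 45000},
    {"puma": "03805", "education": "bachelors", "household_income": 61000},
]


def summarize_by_puma(rows, categorical_field, numeric_field):
    """Count each category and take the median of a numeric field per PUMA."""
    counts = defaultdict(Counter)
    values = defaultdict(list)
    for row in rows:
        counts[row["puma"]][row[categorical_field]] += 1
        values[row["puma"]].append(row[numeric_field])
    return {
        puma: {
            "counts": dict(counts[puma]),
            "median": median(values[puma]),
        }
        for puma in counts
    }


summary = summarize_by_puma(responses, "education", "household_income")
```

Each entry of `summary` holds the category counts and the median of the numeric field for one PUMA, which is the shape of a one-row-per-PUMA working set.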

Percentages for Mapping
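The percentage code is likewise not reproduced here. One plausible reading is that the category counts for each PUMA are converted to shares of that PUMA's total so that areas of different size can be compared on a map. The sketch below assumes that reading; the function name, category names, and counts are hypothetical.

```python
def to_percentages(counts):
    """Convert a mapping of category -> count into category -> percent of total."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero for a PUMA with no responses.
        return {category: 0.0 for category in counts}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical counts for one PUMA in one year (made-up numbers).
education_counts = {
    "bachelors_or_higher": 30,
    "some_college": 20,
    "high_school_or_less": 50,
}
education_pct = to_percentages(education_counts)
```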


Page 1: PUMA Dictionary

The first part of my Shiny application is a table that shows which neighborhoods each PUMA represents. If you’re interested in a certain neighborhood, you can search for it, or find it manually by sorting and paging through the table.



Page 2: Comparison Chart

The second part of my application is a dynamic line chart that lets you select the PUMAs of interest and the variables you would like to compare. For example, the chart below shows the change in household income for PUMAs 04008, 03805, and 03802:



Page 3: Variable Map

Sometimes it is of interest to see the changes geographically – perhaps the demographics of nearby PUMAs or neighborhoods have a domino effect that starts gentrification. The third part of my application lets you visualize these changes with a map overlay, a year slider, and a variable selector.




The following code was used to create the Shiny application:




These two scripts use a JavaScript library (NVD3):





If we compare neighborhoods that were gentrified with those that were not, we may be able to identify potential factors. I first sorted the PUMA listing by the difference in housing growth between 2005 and 2014; this lets us see which neighborhoods developed the most and which lagged behind.
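The ranking step can be sketched as follows. This is only an illustration of sorting areas by the change in one variable between two years. The PUMA labels and values are made up, and the exact housing measure used in the analysis is not specified in the text.

```python
# Hypothetical values of one housing variable per PUMA and year (made up).
housing_value = {
    "A": {2005: 100.0, 2014: 180.0},
    "B": {2005: 120.0, 2014: 130.0},
    "C": {2005: 90.0, 2014: 150.0},
    "D": {2005: 110.0, 2014: 105.0},
}


def rank_by_change(series, start_year, end_year):
    """Return (puma, change) pairs sorted from largest to smallest change."""
    changes = {
        puma: years[end_year] - years[start_year]
        for puma, years in series.items()
    }
    return sorted(changes.items(), key=lambda item: item[1], reverse=True)


ranked = rank_by_change(housing_value, 2005, 2014)
top_group = ranked[:2]      # areas that grew the most
bottom_group = ranked[-2:]  # areas that lagged behind
```

Taking slices from both ends of the sorted list gives the two comparison groups. The analysis compares a top ten against a bottom tier, while the slice size of two here is only for the toy data.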



Some of the top ten are what we expected: Williamsburg, Cobble Hill, Long Island City, and other neighborhoods with very noticeable revitalization. In the bottom tier, we mostly have poorer neighborhoods in the Bronx and Brooklyn, such as Bronxdale and East New York. I’ve colored the two groups, gentrified and unable to gentrify, blue and red respectively.


Race and Household Type


By plotting time series, we may be able to see whether there is a tipping point at which a neighborhood becomes gentrified. Gentrification is traditionally defined by the influx of educated white millennials (ages from the early 20s to the mid 30s). I plotted the growth of white populations and married households within the two groups (gentrified in green, unable to gentrify in purple). All of the gentrified neighborhoods have consistently had a white population of at least 40%. However, there are also poorer neighborhoods with a large white population (see Brighton Beach). Likewise, nothing notable can be seen in the married household demographic. Both results are inconclusive.




The above graph plots the share of the population with a bachelor’s degree or higher. Education showed the clearest distinction: in all of the gentrified neighborhoods (blue), at least 30% of residents have a college degree. Williamsburg (the lowest blue line) was at the bottom of the group in early 2008, but started increasing dramatically in 2010. This suggests that education may be a factor in neighborhood development.

If we do find that education is the main primer that attracts developers, then funding for neighborhood education is of utmost importance. To preserve the local community, the city may also have to find ways to improve local education and housing strategies that do not displace longtime residents. Determining whether this is the case will require further sociological and urban development studies. This is an issue we have yet to solve today: the reverse urban sprawl of the educated and wealthy moving into cities and the underprivileged moving out to the suburbs. By moving locals farther from their workplaces and community programs, gentrification ultimately raises the barriers to equality. The government's current strategies are rent-controlled housing and, in the future, subsidized technical education, both of which sound excellent for protecting the poor. New York City is and has always been a melting pot; I would love to see it develop symbiotically, with locals and newcomers hand in hand.

About Author

Charles Leung

During his past three years in the manufacturing industry, Charles has discovered and developed his passion for big data – not only to solve quality and production issues but also to create tools that automated and optimized steelmaking...