NYC Data Science Academy| Blog

Determinants of asthma throughout NYC

Zach Stone
Posted on Dec 25, 2022

Asthma in NYC

A community health profile identified asthma as one of the leading causes of avoidable hospitalizations in the Mott Haven neighborhood in the Bronx. The neighborhood has the highest child asthma hospitalization rate and the third-highest rate of avoidable adult asthma hospitalizations in the city. The profile suggested a number of possible preventable causes and triggers, including:

  • Air quality (specifically, fine particulate matter of a certain size, denoted PM2.5)
  • Housing-quality related exposure to triggers, like cockroaches, mice, and secondhand smoke

These determinants were supported by some descriptive statistics about particulate matter and housing quality. The goals of this research were to see if these determinants generalized across the city, and to verify or exclude determinants using statistical tests. We found:

  • Asthma rates vary significantly by location across the city, and they are elevated near Mott Haven.
  • Moreover, high asthma rates are localized to a few "hotspots", namely: South Bronx/Harlem, Lower Manhattan, North Brooklyn, North Staten Island, and the Rockaways.
  • Determinants are not uniform across hotspots.
  • Even in areas in the middle 50% of O3 (ozone) density, O3-related asthma hospitalizations are highly elevated in areas where the asthma rate is high.
  • The South Bronx/Harlem and North Brooklyn hotspots fit this pattern. The Rockaways are an outlier (high O3 and asthma rates without elevated O3-related asthma hospitalizations), and O3 does not account for the elevated hospitalizations and diagnoses in Lower Manhattan.
  • On the other hand, the Lower Manhattan hotspot has significantly elevated PM2.5 levels.
  • Locations with more tobacco availability per capita do have slightly elevated hospitalization rates among beneficiaries diagnosed with asthma. Across all of NYC, regularly smelling secondhand smoke within the home does correspond with a significant increase in asthma diagnoses.
  • Rat sightings, a proxy for housing quality, are not independent of asthma diagnoses across the city; increased sightings are associated with increased diagnoses.
  • Asthma diagnosis rates are significantly increased in high-poverty neighborhoods across the city.

While some of these results corroborate Mott Haven's community health profile, they show that the determinants are regionalized within the city. Hence, public health initiatives targeting high asthma rates require a regionalized approach, as determinants which may be nearly irrelevant in one area may be highly relevant in another.

Asthma rates vary significantly by location

To determine the distribution of asthma diagnoses throughout the city, we looked at publicly available data on Medicaid beneficiaries. We computed the percentage of beneficiaries within each zip code who have received an asthma diagnosis. First, a χ² test confirmed that asthma rates were not independent of zip code (p < 2.2 × 10⁻¹⁶).
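As a rough illustration of the independence test described above, the sketch below computes a Pearson chi-square statistic for a zip code by diagnosis contingency table. The counts and the helper name `chi_square_statistic` are hypothetical and are not taken from the Medicaid data; the p-value would then come from the chi-square distribution with the returned degrees of freedom.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if diagnosis were independent of zip code
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts: one row per zip code, columns are (diagnosed, not diagnosed)
counts = [
    [300, 700],
    [150, 850],
    [100, 900],
]
stat, dof = chi_square_statistic(counts)
print(f"chi-square = {stat:.1f} on {dof} degrees of freedom")
```

A large statistic relative to its degrees of freedom indicates that the diagnosis rate differs across zip codes by more than chance alone would explain.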

The distribution of asthma diagnosis rates across the city is not independent of zip code, with some areas significantly higher or lower than others. The highest rate is in the Mott Haven area, corroborating the results of the community health profile.

The diagnosis rates corroborate the inferences drawn from hospitalizations in the community health profile: the highest rates are concentrated in the Mott Haven area. Moreover, many of the areas with elevated rates are contiguous. To identify these "hotspots", each rate was compared to the citywide average with a binomial test, and those locations where the percentage was significantly elevated (p<0.05) were identified. The zip codes with significantly elevated diagnosis rates are shown below.
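The hotspot screen described above can be sketched as follows. This is a minimal example under stated assumptions: the zip code labels and counts are hypothetical, the helper `binom_upper_tail` is an illustrative name, and the test is taken to be one-sided (elevated rates only), which the original analysis does not specify.

```python
import math

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log(1 - p)
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)

# Hypothetical counts per zip code: (beneficiaries with a diagnosis, all beneficiaries)
zip_counts = {"zip_a": (60, 400), "zip_b": (30, 400), "zip_c": (42, 400)}

total_diagnosed = sum(d for d, _ in zip_counts.values())
total_beneficiaries = sum(n for _, n in zip_counts.values())
citywide_rate = total_diagnosed / total_beneficiaries

# Flag zip codes whose rate is significantly above the citywide rate (p < 0.05)
hotspots = [
    z for z, (diagnosed, n) in zip_counts.items()
    if binom_upper_tail(diagnosed, n, citywide_rate) < 0.05
]
print(hotspots)
```

Because one test is run per zip code, a multiple-comparison correction may be worth considering when many zip codes are screened this way.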

Many of the zip codes with significantly elevated asthma diagnosis rates are contiguous. The contiguous areas with elevated asthma diagnosis rates cluster into a few major "hotspots". We will loosely refer to these hotspots as Bronx/Harlem, North Brooklyn, Lower Manhattan, the Rockaways, and (North) Staten Island. We can also compare these to the hospitalization rates. For each zip code, we look at the proportion of beneficiaries with asthma who have at least one asthma-related ER visit in a year, averaged by year.
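The per-zip ER visit rate described above could be computed along these lines. The record layout, years, and counts are hypothetical; the sketch only shows the "proportion per year, then average across years" step.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (zip code, year, beneficiaries with asthma, of those with >= 1 ER visit)
records = [
    ("zip_a", 2012, 200, 50),
    ("zip_a", 2013, 250, 50),
    ("zip_b", 2012, 100, 10),
    ("zip_b", 2013, 100, 20),
]

def mean_yearly_er_rate(rows):
    """Average, over years, of the yearly ER visit proportion for each zip code."""
    yearly = defaultdict(list)
    for zip_code, _year, with_asthma, with_er in rows:
        if with_asthma > 0:  # skip years with no diagnosed beneficiaries
            yearly[zip_code].append(with_er / with_asthma)
    return {zip_code: mean(rates) for zip_code, rates in yearly.items()}

er_rates = mean_yearly_er_rate(records)
print(er_rates)
```

Averaging yearly proportions weights each year equally regardless of how many beneficiaries were observed; pooling counts across years would be an alternative design choice.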

We see that in many of the same areas where the proportion of beneficiaries with an asthma diagnosis is elevated, so is the rate of ER visits among those diagnosed. That is, not only are diagnoses elevated in those areas, but the rate of hospitalization is higher among those with a diagnosis. Numerically, this appears as a moderate correlation (r = 0.59) between the two rates.

Both statistics are possibly biased by the data source: publicly available Medicaid data from hospitals. While it is possible that the availability of health services from these locations biases the diagnosis and hospitalization rates, it does not appear to fully explain the variance in these statistics across the city.

Asthma and air quality

One determinant of asthma hospitalizations suggested by the community health profile was air quality, specifically fine particulate matter (PM2.5). We looked at PM2.5 density in a dataset from a similar time period, with measurements taken in winter. Winter is when PM2.5 concentration is worst, so these measurements give the worst-case scenario for asthma triggers. The distribution of PM2.5 density across geographic regions is bimodal, with significantly elevated levels near the Lower Manhattan hotspot.

As the asthma ER visit rate among beneficiaries with asthma is elevated in this area, we cannot rule out the possibility that PM2.5 is a determinant there. However, while PM2.5 density is elevated in Mott Haven, as noted by the community health profile, we see that many areas with comparable air quality are not associated with elevated diagnosis or hospitalization rates. Moreover, many of the hotspots - namely, North Brooklyn, the Rockaways, and Staten Island - do not have elevated PM2.5 density. Overall, there is a low correlation between PM2.5 density and asthma-related ER visit rates among beneficiaries with asthma (r = 0.33). Together, these observations suggest that only extremely elevated PM2.5 density may be associated with increased risk. Measures of other non-O3 pollutants, like benzene and emissions due to boilers, pattern similarly.

Increased PM2.5 density is not universally associated with higher ER visit rates among beneficiaries with asthma.

O3 density is also uneven across the city. As there is data available directly on O3-attributable asthma, there is evidence that even moderately elevated O3 levels lead to a detectable increase in associated asthma hospitalizations.

The O3-attributable asthma-related hospitalization rates out of the whole population, for both children and adults, are shown above. For both age groups, the rates are clearly elevated for both the South Bronx/Harlem and North Brooklyn hotspots. O3 itself is approximately normally distributed throughout the city, with the geographic distribution also provided below.

Notably, the Lower Manhattan hotspot has relatively low O3 density levels, and hence the O3-attributable asthma hospitalizations are low (despite overall having a higher asthma hospitalization rate). While it is a logical necessity that an area with high O3-attributable asthma hospitalizations must have both (a) a larger population with an asthma diagnosis, and (b) elevated O3 density, it is somewhat surprising that O3 levels do not need to be significantly elevated to detect the effects.

Areas with a low O3 density or low diagnosis rates have low O3-attributable asthma hospitalization rates, as is logically expected. However, we can see that O3 density does not have to be significantly elevated for it to begin elevating O3-attributable hospitalizations in areas with elevated asthma diagnosis rates. Neighborhoods between the red dashed lines are within the middle 50% of O3 density levels, yet even those with moderately elevated diagnosis rates have significantly higher O3-attributable asthma-related hospitalizations, with little distinction between neighborhoods with similar asthma rates on either side of this middle 50%.
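To make the "middle 50%" comparison concrete, here is a small sketch that keeps only neighborhoods between the first and third quartiles of O3 density and then compares mean O3-attributable hospitalization rates for higher and lower diagnosis rates. All values, the 0.10 diagnosis threshold, and the helper names are hypothetical.

```python
from statistics import mean, quantiles

# Hypothetical rows:
# (neighborhood, O3 density, asthma diagnosis rate, O3-attributable hospitalization rate)
neighborhoods = [
    ("n1", 20, 0.05, 1.0),
    ("n2", 22, 0.12, 2.0),
    ("n3", 24, 0.06, 1.5),
    ("n4", 26, 0.13, 6.0),
    ("n5", 28, 0.05, 1.8),
    ("n6", 30, 0.14, 7.0),
    ("n7", 32, 0.06, 2.2),
    ("n8", 34, 0.15, 7.5),
]

def middle_half(rows):
    """Rows whose O3 density lies between the first and third quartiles."""
    q1, _median, q3 = quantiles([row[1] for row in rows], n=4)
    return [row for row in rows if q1 <= row[1] <= q3]

def mean_hosp_by_diagnosis(rows, threshold):
    """Mean hospitalization rate at or above, and below, a diagnosis-rate threshold."""
    high = [row[3] for row in rows if row[2] >= threshold]
    low = [row[3] for row in rows if row[2] < threshold]
    return mean(high), mean(low)

inside = middle_half(neighborhoods)
high_mean, low_mean = mean_hosp_by_diagnosis(inside, threshold=0.10)
print([row[0] for row in inside], high_mean, low_mean)
```

Restricting to the interquartile range first means any remaining gap between the two groups cannot be explained by unusually high O3 alone.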

Notably, neighborhoods with comparable asthma diagnosis rates, but which are below the middle 50% of the O3 distribution, have nearly 1/3 of the O3-attributable hospitalizations. Also notable is one outlying neighborhood, with highly elevated O3, an elevated diagnosis rate, but low O3-attributable asthma hospitalization rate. This outlier corresponds to the Rockaways hotspot. It is the only hotspot with notably elevated O3 levels which does not experience elevated hospitalizations attributable to it. Assuming that the attributions are correct, this indicates that O3 may be a significant contributing factor in two out of the three hotspots where it is relevant, with the third deserving further research.

A final, less direct, comparison was made based on zoning. Other research into elevated asthma diagnosis and hospitalization rates in the Bronx has identified industrial zones as possible determinants. While some hotspots, including the one containing Mott Haven, are near industrial zones, a more robust analysis based on GIS and geographic inference is required.

Summarizing air quality

While PM2.5 density was suggested as a possible determinant, it overall has a low correlation with elevated hospitalization rates among beneficiaries with an asthma diagnosis. Many areas with elevated PM2.5 density do not exhibit increased hospitalization rates, and some areas with elevated hospitalization rates do not exhibit increased PM2.5 density. Extremely high PM2.5 density and asthma-related hospitalizations do overlap in the Lower Manhattan hotspot, however. On the other hand, even areas within the middle 50% of the citywide O3 density distribution have significantly elevated O3-attributable asthma hospitalizations where asthma diagnosis rates are elevated, showing that the effect of O3 is significant even with only moderately elevated O3 levels. This pattern holds for two out of the three hotspots where O3 is prevalent - South Bronx/Harlem and North Brooklyn - with the Rockaways being an exception.

Tobacco use and asthma rates

Another determinant of asthma diagnosis and hospitalization rates suggested by the community health profile was secondhand smoke. While location data from the Community Health Survey (CHS) on secondhand smoke was not available to the public, we can get a proxy for regional smoking habits by looking at the distribution of tobacco retailers. The distribution of tobacco retailers per capita is roughly log-normal. Because this proxy is a very noisy signal, the correlation between log(retailers/capita) and hospitalization rates is modest (r = 0.30).
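The correlation on the log scale can be sketched as below. The retailer counts, populations, and hospitalization rates are hypothetical, and `pearson_r` is an illustrative helper rather than the code used in the analysis.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-area values
retailers = [12, 40, 7, 95, 30]
population = [30000, 52000, 41000, 61000, 45000]
er_rate = [0.12, 0.18, 0.10, 0.16, 0.15]

# Log-transform retailers per capita, since the raw ratio is roughly log-normal
log_density = [math.log(r / p) for r, p in zip(retailers, population)]
r_value = pearson_r(log_density, er_rate)
print(f"r = {r_value:.2f}")
```

Taking the logarithm first keeps a handful of areas with very high retailer density from dominating the correlation.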

While location data from the CHS was not publicly available, and hence could not be compared with the location-based hospital data, the survey results internally corroborate indoor secondhand smoke as a determinant. Specifically, self-reported asthma diagnoses and smelling secondhand smoke indoors were not independent under an age-balanced χ² test (p < 0.0111). Those who responded "Daily", "Weekly", or "Monthly" to the question "How often do you smell cigarettes in your apartment from outside?" were significantly more likely to have been diagnosed with asthma.

While both of these are proxies for the essential statistics, they do support the claims of the community health profile. As exposure to triggers may be more likely to cause symptoms, it may increase the number of people who seek diagnoses, and hence be reflected in the diagnosis rates.

Housing quality and poverty levels

Beyond secondhand smoke, other indicators of housing quality such as rat sightings were also flagged as possible determinants by the community health profile. Again, location data from the CHS was not publicly available, but rat sightings were also not independent of self-reported asthma rates, as determined by an age-balanced χ² test (p < 0.0002). Respondents who had seen mice in the area around their building were more likely to have been diagnosed with asthma.

To get a sense of the geographic distribution of housing quality, we looked at the proportion of rat inspections which ended in a result associated with active rat activity. While the available data was sparse, with uneven measurements across geographic areas and time, a map showing the results of such inspections from a similar time period indicates that both the South Bronx/Harlem and North Brooklyn hotspots are areas with housing quality issues.

As many areas from the relevant time interval did not have rodent inspection data, reliable statistical tests could not be performed. Additionally, the sampling process is biased, as areas where an inspection has been requested are more likely to have rodent activity. As a result, areas where data is available at all are likely to have highly inflated rates. However, the pattern of available data and rates both suggest that housing quality may be an issue in these two hotspots.

Finally, socioeconomic status is also a potential determinant for many health issues, as it is related to housing quality and other triggers. The probability that a respondent in the CHS reported an asthma diagnosis was significantly dependent on the estimated poverty level of the respondent's neighborhood. Respondents who were estimated to be in "high" or "very high" poverty group areas were significantly more likely to report an asthma diagnosis across the city.

However, it is likely that this determinant is only predictive when taken in context with other factors. While some of the lowest-income areas are within hotspots - specifically, the two hotspots where housing quality may be a major determinant (South Bronx/Harlem and North Brooklyn) - average household income is overall not a strong predictor of asthma diagnoses. There is a negligible negative correlation between income and asthma diagnosis rate (r = -0.09), but so many areas are in the lower part of the citywide income distribution that it is not a strong predictor. The income distribution throughout the city is given below, with the caveat that the available data was from a different time period from the asthma rate data (using data generated from the 2020 Census).

Summary

This research sought to verify, generalize, and rule out certain determinants of asthma diagnosis and hospitalization rates across NYC, building on observations in the Mott Haven community health profile. We showed that these rates are in fact dependent on location, with rates varying significantly between zip codes. Moreover, areas with elevated rates tend to cluster into one of a few hotspots.

However, the determinants associated with these rates are not uniform. While fine particulate matter density has been suggested as a determinant, only extremely high density seems to have an effect on hospitalizations, as in the Lower Manhattan hotspot. Elsewhere, areas with comparable particulate matter density have high variance in their hospitalization rate.

On the other hand, even mildly elevated O3 levels, while not an issue for the Lower Manhattan hotspot, do affect the South Bronx/Harlem and North Brooklyn hotspots, with the Rockaways being the only exception to the pattern. These two hotspots also have housing quality as a likely determinant, possibly related to the income levels in those areas. While asthma rates are likely not independent from the prevalence of secondhand smoke, it is not apparent that tobacco availability is a strong determinant, and public health initiatives should likely focus on other issues, like air and housing quality, instead.

Because this was a public health research project using publicly available data, there were many restrictions on the inferences we could draw definitively. For example, we frequently had to compare data measured over different time periods using different geographic boundaries, as a wide range of sources were used (Medicaid, UHF, air quality, rodent inspection, census, and retail data, to name a few). Time windows were aligned as much as possible, and summary statistics were recomputed to be comparable across various geographic boundaries, using the best information available to make inferences about the geographic region each datapoint belonged to. However, some misalignment was unavoidable.
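One simple way to recompute a rate on a coarser geographic boundary is a count-weighted average over a crosswalk, sketched below. This is only an illustration of the general idea: the crosswalk, the zip code labels, the counts, and the weighting scheme are all assumptions, not the procedure actually used for this analysis.

```python
from collections import defaultdict

# Hypothetical crosswalk from zip code to a larger neighborhood unit
zip_to_neighborhood = {"zip_a": "north", "zip_b": "north", "zip_c": "south"}

# Hypothetical per-zip inputs: (number of beneficiaries, asthma diagnosis rate)
zip_stats = {"zip_a": (1000, 0.10), "zip_b": (3000, 0.20), "zip_c": (2000, 0.05)}

def aggregate_rates(stats, crosswalk):
    """Beneficiary-weighted average rate for each larger geographic unit."""
    weighted = defaultdict(float)
    totals = defaultdict(int)
    for zip_code, (count, rate) in stats.items():
        area = crosswalk[zip_code]
        weighted[area] += count * rate  # estimated number of diagnosed beneficiaries
        totals[area] += count
    return {area: weighted[area] / totals[area] for area in totals}

neighborhood_rates = aggregate_rates(zip_stats, zip_to_neighborhood)
print(neighborhood_rates)
```

Weighting by the underlying count keeps a small zip code from pulling the neighborhood rate as strongly as a large one; it does not address zip codes that straddle two neighborhoods.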

These misalignments, combined with the fact that the medical data sources were mostly drawn only from Medicaid beneficiaries, could have introduced bias into the analysis. In order to gain a more robust understanding of these determinants, more sophisticated geographic analysis methods could be used, taking into account the (estimated) distances between the geographic origin of various data, or by interpolating data to the requisite timeframes. However, the available data frequently corroborated the existence of certain asthma hotspots in NYC and the relevance of certain determinants, which we believe will help limit the scope of future research and outreach in this area.

About Author

Zach Stone

I am a data scientist with a background in linguistics research and math. I love to make it easier to analyze and draw insights from complex patterns using a combination of research, code, and modeling.