Data Study on NYC Fire Incident and False Alarm

Posted on Feb 4, 2019


If you live in New York City, as I do, you have likely noticed that the sound of sirens cuts through the background noise of the city around the clock. These sirens made me wonder: how many are responses to real fire incidents, and how many are false alarms? To find out, I built an app in R Shiny that breaks down the data so that people can easily read and understand what accounts for so many sirens in the city.

You can see my project and code links below:


Two of the datasets I used came from NYC OpenData. The fire response dataset records every incident that NYC firehouses responded to from 2013 to 2017. Each record carries an incident code that identifies the type of incident. I used this code to group the records into three categories: fire incidents, false alarms, and other incidents.
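The grouping step can be sketched roughly as follows. This is a minimal Python illustration, not the code behind the app (which was built with R Shiny). It assumes each record has a description that starts with a numeric code in the NFIRS style, where the 100 series denotes fires and the 700 series denotes false alarms. The function name and the sample rows are hypothetical.

```python
from collections import Counter


def classify_incident(incident_type_desc: str) -> str:
    """Map a description such as '111 - Building fire' to one of three groups.

    Assumption: the text starts with an NFIRS-style numeric code, where
    1xx means a fire and 7xx means a false alarm. Anything else is 'other'.
    """
    code = incident_type_desc.strip().split(" ", 1)[0]
    if code.startswith("1"):
        return "fire incident"
    if code.startswith("7"):
        return "false alarm"
    return "other incident"


# Made-up sample rows for illustration only (not taken from the dataset).
sample_descriptions = [
    "111 - Building fire",
    "735 - Alarm system sounded due to malfunction",
    "300 - Rescue, EMS incident, other",
    "710 - Malicious, mischievous false call, other",
    "651 - Smoke scare, odor of smoke",
]

category_counts = Counter(classify_incident(d) for d in sample_descriptions)
print(category_counts)
```

Keeping the mapping in one small function makes it easy to adjust if the real codes turn out to be grouped differently.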

Data Analysis

In this part of the project, I counted the false alarms, fire incidents, and other incidents, and identified which New York City borough had the most of each.

5 Boroughs

The column graph above shows that, of the five boroughs, Brooklyn had the highest number of fire incidents between 2013 and 2017, with 55,672. One likely reason is that Brooklyn has the largest population and the most housing units of any borough in New York City.

This graph shows that Manhattan had the highest number of false alarms: 115,842 incidents between 2013 and 2017. Manhattan has the highest population density in New York City, at 72,033 people per square mile. False alarms can result from either human error or a fire system failure. Human error covers cases such as an eyewitness who mistakes something for a fire, or someone who activates a fire alarm by accident.
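The borough comparison in this section amounts to counting incidents per borough within each category and picking the largest count. A minimal Python sketch of that idea follows. The record layout (a borough name paired with a category label) and the sample rows are assumptions made for illustration, not the dataset's actual schema or values.

```python
from collections import Counter, defaultdict


def top_borough_by_category(records):
    """Return {category: (borough, count)} for the borough with the most incidents.

    `records` is an iterable of (borough, category) pairs.
    """
    counts = defaultdict(Counter)
    for borough, category in records:
        counts[category][borough] += 1
    return {
        category: per_borough.most_common(1)[0]
        for category, per_borough in counts.items()
    }


# Made-up sample rows for illustration only.
sample_records = [
    ("Brooklyn", "fire incident"),
    ("Brooklyn", "fire incident"),
    ("Manhattan", "fire incident"),
    ("Manhattan", "false alarm"),
    ("Manhattan", "false alarm"),
    ("Queens", "false alarm"),
]

top = top_borough_by_category(sample_records)
print(top)
```

Raw counts favor the most populous boroughs, so dividing by population would be a natural extension if per-capita figures were needed.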

Frequency in a Day

This graph shows how often false alarms and fire incidents occur over a 24-hour period. False alarms peak between 1 pm and 5 pm and are lowest from 1 am to 6 am, likely because people are less active at night and more likely to be sleeping than cooking or actively using electricity. Fire incidents, on the other hand, increase from lunchtime to dinnertime. This period has the highest number of fire incidents, likely because of the fire or heat needed for cooking. Still, most events turn out to be false alarms, as the graph below shows.
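An hourly breakdown like this one can be produced by extracting the hour from each incident timestamp and counting per category. The Python sketch below shows one way to do that. The timestamp format is an assumption about how the dates are stored, and the sample rows are made up.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp layout, e.g. "01/15/2016 02:30:00 PM".
TIMESTAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def hourly_counts(records):
    """Return {category: Counter({hour: count})} with hours from 0 to 23.

    `records` is an iterable of (timestamp_string, category) pairs.
    """
    counts = {}
    for timestamp, category in records:
        hour = datetime.strptime(timestamp, TIMESTAMP_FORMAT).hour
        counts.setdefault(category, Counter())[hour] += 1
    return counts


# Made-up sample rows for illustration only.
sample_records = [
    ("01/15/2016 02:30:00 PM", "false alarm"),
    ("03/02/2016 02:45:00 PM", "false alarm"),
    ("07/04/2016 03:10:00 AM", "false alarm"),
    ("01/15/2016 06:05:10 PM", "fire incident"),
]

hourly = hourly_counts(sample_records)
print(hourly["false alarm"])
```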

Frequency in a Month

We’ve established that time of day plays a role in fire alarm incidents. But what about time of year? This monthly frequency graph shows that false alarms were most frequent from June to October, the warmest part of the year. Actual fire incidents, on the other hand, were most frequent from October to April, the coldest part of the year. In cold weather, people rely heavily on heaters to raise the indoor temperature, so old buildings with old wiring might contribute to fires caused by overloaded circuits or shorts in electric heaters.
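The monthly view follows the same pattern, grouped by calendar month instead of hour. The Python sketch below counts incidents per month for each category and reports the busiest month. It assumes the timestamps have already been parsed into `datetime` objects, and the sample rows are made up rather than drawn from the dataset.

```python
import calendar
from collections import Counter
from datetime import datetime


def monthly_counts(records):
    """Return {category: Counter({month_number: count})}.

    `records` is an iterable of (datetime, category) pairs.
    """
    counts = {}
    for when, category in records:
        counts.setdefault(category, Counter())[when.month] += 1
    return counts


def peak_month(month_counter):
    """Return the month number (1 to 12) with the most incidents."""
    month, _count = month_counter.most_common(1)[0]
    return month


# Made-up sample rows for illustration only.
sample_records = [
    (datetime(2015, 1, 5, 19, 0), "fire incident"),
    (datetime(2016, 1, 20, 8, 30), "fire incident"),
    (datetime(2016, 12, 24, 18, 15), "fire incident"),
    (datetime(2015, 7, 4, 14, 0), "false alarm"),
    (datetime(2016, 7, 19, 15, 45), "false alarm"),
    (datetime(2016, 8, 2, 13, 5), "false alarm"),
]

monthly = monthly_counts(sample_records)
for category, per_month in monthly.items():
    print(category, calendar.month_name[peak_month(per_month)])
```

Because the data spans several years, the counts for a given month are pooled across years. Grouping by year and month instead would show whether the seasonal pattern holds every year.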


As the graphs above show, the differences between boroughs in fire incidents and false alarms line up with population and population density. We can also see the role that time plays, in terms of both season and time of day.

Future Work

The analysis would be more thorough if it also offered a geographic visualization. Segmenting fire incidents and false alarms on a map by zip code or borough, with a different color for each, would give the project more information about how incidents vary by location. I would use other programming languages, such as Java or Python, to create these maps and add them to the project.

About Author

Hong Yang (Jason) Wang

Certified Data Scientist skilled in cleaning, visualizing and interpreting data with machine learning and statistical analysis in R and Python. Team player who performs well under pressure, receptive to feedback, pays meticulous attention to detail and always keeps...