7 Sins in NYC

Chuan Sun
Posted on Jul 21, 2016

When Vito Corleone, the head of the Corleone crime family in the movie “The Godfather”, was shot by hitmen on a New York street, I was shocked.

I was shocked not just because I was so immersed in the movie, but also due to one sentence echoing in my mind: “no one is an island”.

Uncertainty is everywhere, even for the mafia boss, not to mention millions of ordinary New Yorkers.

Why should we care?

Safety is one of the most fundamental human needs. As one of the most populous urban agglomerations in the world, New York City is heaven for many, but perhaps hell for a few, especially those unfortunate enough to be affected by the seven “sins”:

7 sins

Each week, the NYPD publishes City Wide Crime Statistics, containing weekly counts of crime complaints for 7 felonies. For example, the report for 7/4 to 7/10 of 2016 lists 1888 total crime complaints in NYC: 6 murders, 35 rapes, 304 robberies, 444 felony assaults, 202 burglaries, 765 grand larcenies, and 132 grand larcenies of vehicles.

1888 is not a small number, although total complaints were down 5.51% compared with 2015. By simple math, that is on average 11.24 felony incidents per hour (1888 complaints over 168 hours), or roughly one felony incident every 5.3 minutes in NYC.
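The arithmetic behind these rates can be checked in a few lines. This is a minimal sketch in Python (the post's own graphs were produced in R); the dictionary simply restates the weekly counts quoted above, and the variable names are illustrative.

```python
# Weekly complaint counts quoted above (NYPD report for 7/4 to 7/10, 2016).
weekly_complaints = {
    "murder": 6,
    "rape": 35,
    "robbery": 304,
    "felony assault": 444,
    "burglary": 202,
    "grand larceny": 765,
    "grand larceny of vehicles": 132,
}

total = sum(weekly_complaints.values())
hours_in_week = 7 * 24                 # 168 hours
per_hour = total / hours_in_week       # average incidents per hour
minutes_between = 60 / per_hour        # average gap between incidents

print(f"total complaints: {total}")
print(f"incidents per hour: {per_hour:.2f}")
print(f"minutes between incidents: {minutes_between:.2f}")
```

The seven counts sum to 1888, which matches the total reported for that week.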

Questions

This project investigates the 7 sins, a.k.a. felonies, that occurred in NYC over the past 10 years (2006-2015). It focuses on answering the following simple yet important questions:

  • (1) Has NYC become safer over the last 10 years?
  • (2) Which months in one year can be considered as unsafe?
  • (3) Which days in a week can be considered as unsafe?
  • (4) Which hours in a day can be considered as unsafe?
  • (5) Which boroughs are more unsafe than others?

Source Code

See here for R source code to generate the graphs in this post.

Dataset

The NYPD 7 Major Felony Incidents dataset:

  • Contains the Seven Major Felonies at the incident level and is updated quarterly.
  • Was made public on Dec 29, 2015, and is available here.
  • Contains around 1.1 million incidents and 22 variables, and is 194MB in size.
  • Contains the approximate longitude and latitude of each incident across the 5 boroughs.
  • Contains timestamps of offense incidents (year, month, hour) spanning from 1919 to 2015.

According to the NYPD Incident Level Data Footnotes:

  • Crime complaints which involve multiple offenses are classified according to the most serious offense.
  • For privacy reasons, incidents have been moved to the midpoint of the street segment on which they occur.
  • Attempted crimes are recorded as if the crime actually occurred.
  • Data presented here is based on the year the incident was reported, not necessarily when it occurred.

The first point indicates that the number of actual offenses is larger than the number of incidents in the dataset. Since we know nothing about which types of offenses typically occur together in incidents with multiple offenses, we can make no assumptions about the gap. The second point affects the accuracy of incident locations. Nevertheless, at the borough or city level, the inaccuracy in longitude and latitude will not have a major impact on the overall distribution of incidents.

Preprocess

A quick exploration in R revealed that, although the years in the dataset span from 1919 to 2015, over 95% of all incidents occurred after 2005. I thus focus mainly on the years from 2006 to 2015. This 10-year period covers about 1.1 million incidents.
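The filtering step described here can be sketched as follows. This is an illustration in Python rather than the R code used for the post; the field name `year`, the helper names, and the tiny sample records are all hypothetical and are not taken from the NYPD file.

```python
def share_in_range(years, first=2006, last=2015):
    """Return the fraction of years that fall in [first, last]."""
    years = list(years)
    if not years:
        return 0.0
    in_range = sum(1 for y in years if first <= y <= last)
    return in_range / len(years)


def keep_range(records, first=2006, last=2015, year_key="year"):
    """Keep only records whose year lies in [first, last]."""
    return [r for r in records if first <= r[year_key] <= last]


# Made-up sample rows, only to show the calls; NOT real NYPD data.
sample = [
    {"year": 1919, "offense": "BURGLARY"},
    {"year": 2005, "offense": "ROBBERY"},
    {"year": 2006, "offense": "GRAND LARCENY"},
    {"year": 2010, "offense": "FELONY ASSAULT"},
    {"year": 2015, "offense": "ROBBERY"},
]

print(share_in_range(r["year"] for r in sample))
print(len(keep_range(sample)))
```

Running the same share computation on the full dataset is what would support a statement such as "over 95% of incidents occurred after 2005".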

Visualization and analysis

Trend in the last 10 years (2006 - 2015)

First let us take a look at the overall trend of 7 felonies in NYC in the last 10 years.

trend_in_10_years

Grand larceny is the most frequent offense of all 7 felonies.  The number of incidents is almost twice that of the second most frequent one.

Three felonies are declining: robbery, burglary, and auto theft. I cannot help but link this to the widespread use of camera surveillance, although the data here cannot confirm the cause. Wrongdoers may know that their faces can show up on NYPD screens once they take the risk.

Murder and rape have stayed at the same level across 10 years.

The number of felony assaults is increasing slightly.

To sum up, with three felonies declining and only felony assault edging up, it seems fair to conclude that NYC is getting safer overall.
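One simple way to put a number on "declining" or "increasing" is the percent change between the first and last year for each felony. The sketch below is in Python and is only an illustration; the function name and the example counts are placeholders, not figures from the dataset.

```python
def percent_change(counts_by_year):
    """Percent change from the earliest to the latest year in the mapping."""
    if len(counts_by_year) < 2:
        raise ValueError("need at least two years")
    first_year = min(counts_by_year)
    last_year = max(counts_by_year)
    baseline = counts_by_year[first_year]
    if baseline == 0:
        raise ValueError("cannot compute change from a zero baseline")
    return 100.0 * (counts_by_year[last_year] - baseline) / baseline


# Placeholder counts, NOT taken from the NYPD dataset.
example = {2006: 200, 2010: 180, 2015: 150}
print(f"{percent_change(example):+.1f}%")
```

A negative value means fewer incidents in the last year than in the first; comparing only the endpoints ignores fluctuations in between, so it complements the trend graph rather than replacing it.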

Incidents by month

NYC’s seasons are defined as follows:

  • Spring season: March, April, May
  • Summer season: June, July, August
  • Fall season: September, October, November
  • Winter season: December, January, February
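The season definition above translates directly into a lookup from month number to season. This is a minimal Python sketch of that mapping; the names `SEASONS`, `season_of`, and `count_by_season` are illustrative.

```python
from collections import Counter

# Season definitions exactly as listed above (months as numbers 1-12).
SEASONS = {
    "Spring": (3, 4, 5),
    "Summer": (6, 7, 8),
    "Fall": (9, 10, 11),
    "Winter": (12, 1, 2),
}

# Invert the table: month number -> season name.
MONTH_TO_SEASON = {
    month: season for season, months in SEASONS.items() for month in months
}


def season_of(month):
    """Return the season name for a month number between 1 and 12."""
    if month not in MONTH_TO_SEASON:
        raise ValueError(f"invalid month: {month!r}")
    return MONTH_TO_SEASON[month]


def count_by_season(months):
    """Count incidents per season given an iterable of month numbers."""
    return Counter(season_of(m) for m in months)


print(season_of(2))
print(count_by_season([1, 2, 7, 7, 12]))
```

Note that winter wraps around the calendar year, so December is grouped with the following January and February only by season name, not by year.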

incidents_by_month

Late winter and early spring tend to have the smallest number of incidents for almost all 7 felonies, with February having a particularly low felony incidence.  These can be considered as the safest seasons. This is understandable. During those months it can become very chilly, windy, and snowy. Who would want to go out in such weather?

Summer and early fall tend to have the largest number of incidents for almost all 7 felonies. Summer months in NYC are usually hot and humid, and temperatures may remain high at night.  This can make certain people ornery.

Incidents by day of week

incidents_by_day_of_week

Friday is the least safe day of the week. This is easy to see in the histograms. On Friday, burglary, grand larceny, larceny of motor vehicle, and robbery occur more frequently than on other days. Maybe people feel relaxed on Friday after a week of work and are therefore not as vigilant as they otherwise might be. This could give wrongdoers more opportunities to break into houses, steal property such as cars, or commit robberies on the streets.

As for the weekend, the number of incidents for burglary, grand larceny, auto theft, and robbery declines. If you think that people are at home playing with their kids, enjoying family time, watching favorite TV shows, or preparing for their next week’s work, then maybe there is less of an opportunity for wrongdoers to sneak into their homes.  

On the other hand, the weekend is less safe in terms of felony assault, rape, and murder. Domestic violence, bad family relationships, and unkind words may all be related to an unhappy or disastrous weekend. So maybe family time is not equally great for everyone!
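Counting incidents by day of the week only requires the calendar date of each incident. The sketch below uses Python's standard `datetime.date.weekday()`; the helper name and the sample dates are illustrative and are not rows from the dataset.

```python
from collections import Counter
from datetime import date

DAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def count_by_weekday(dates):
    """Count incidents per day of week; date.weekday() returns 0 for Monday."""
    return Counter(DAY_NAMES[d.weekday()] for d in dates)


# Illustrative dates only, NOT incident records.
sample_dates = [date(2016, 7, 4), date(2016, 7, 8), date(2016, 7, 8)]
print(count_by_weekday(sample_dates))
```

Raw counts per weekday are what the histograms show; whether the differences between days are statistically significant would need a separate test.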

Incidents by hour

Knowing which hours are safe or unsafe for certain offenses is vital for New Yorkers, since hour is a “tangible” and controllable unit. One can choose to be at one place at a certain hour, or not.

incidents_by_hour

It strikes me that, even with just simple density and histogram graphs, without any complex machine learning models, we can still distill many insights from history.

  • Burglary happens most often during the morning and late afternoon. This corresponds roughly to one hour after New Yorkers leave for work and one hour before they return home. It makes sense that burglaries happen when people are not at home.
  • On the other hand, felony assaults occur most often during the evening and midnight hours. Does the nighttime bring out the worst in us?
  • Then again, grand larceny occurs most often from noon through the afternoon.
  • Larceny of motor vehicle occurs most often during the midnight hours. The dark night is a silent but perfect conspirator.
  • Rape also occurs most often during the midnight hours and least often in the morning. People are vulnerable at night, especially when asleep.
  • Robbery occurs most often in the afternoon, spikes at 3pm, and occurs least often in the early morning.
  • Murder occurs most often during the midnight hours and least often at 8am. Again, wrongdoers take advantage of victims’ lack of vigilance at night.
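Peak and quietest hours like the ones listed above can be read off an hourly count per offense. Here is a minimal Python sketch under stated assumptions: records are hypothetical `(offense, hour)` pairs with hours from 0 to 23, and the helper names are illustrative.

```python
from collections import Counter, defaultdict


def hourly_counts(records):
    """Build offense -> Counter(hour -> count) from (offense, hour) pairs."""
    counts = defaultdict(Counter)
    for offense, hour in records:
        if not 0 <= hour <= 23:
            raise ValueError(f"hour out of range: {hour!r}")
        counts[offense][hour] += 1
    return counts


def peak_hour(counter):
    """Hour with the most incidents; ties resolve to the earliest hour."""
    return min(counter, key=lambda h: (-counter[h], h))


def quietest_hour(counter):
    """Hour of the day (0-23) with the fewest incidents; ties go to the earliest."""
    return min(range(24), key=lambda h: (counter[h], h))


# Made-up records, only to show the calls; NOT real NYPD data.
sample = [("ROBBERY", 15), ("ROBBERY", 15), ("ROBBERY", 4), ("MURDER", 0)]
counts = hourly_counts(sample)
print(peak_hour(counts["ROBBERY"]))
```

`quietest_hour` scans all 24 hours so that an hour with zero recorded incidents still counts as the quietest, which a plain minimum over the observed hours would miss.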

Clock view

It is easy to see on a clock when each of the deadly sins peaks in frequency. You can almost map the life of a felon, and only a few hours in a day are really safe, e.g., 5am is a safe time to be alive.

offenses_in_one_clock

Incidents by borough in 2015

We should also keep an eye on where felonies occur.  

incidents_by_borough

From the histogram above, it can be seen that Manhattan has the most grand larcenies. This is not surprising. Perhaps it is because Wall Street and most financial companies are located there, giving wrongdoers easy opportunities. Brooklyn is the runner-up, and Staten Island has the fewest. Despite Manhattan leading in grand larceny, Brooklyn in fact appears to be the most dangerous borough. It ranks first in number of incidents for 6 out of 7 felonies. In contrast, Staten Island ranks last.
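A ranking such as "first for 6 out of 7 felonies" comes from counting incidents per borough within each offense and taking the leader. The Python sketch below illustrates the idea; the `(borough, offense)` record shape, the helper name, and the sample rows are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def top_borough_per_offense(records):
    """Return offense -> borough with the most incidents for that offense."""
    counts = defaultdict(Counter)
    for borough, offense in records:
        counts[offense][borough] += 1
    # most_common(1) returns a list with the single highest-count entry.
    return {
        offense: per_borough.most_common(1)[0][0]
        for offense, per_borough in counts.items()
    }


# Made-up rows, only to show the call; NOT real NYPD data.
sample = [
    ("MANHATTAN", "GRAND LARCENY"),
    ("MANHATTAN", "GRAND LARCENY"),
    ("BROOKLYN", "GRAND LARCENY"),
    ("BROOKLYN", "ROBBERY"),
    ("BROOKLYN", "ROBBERY"),
    ("QUEENS", "ROBBERY"),
]
leaders = top_borough_per_offense(sample)
print(leaders)
```

Counting how often each borough appears among the values of `leaders` then gives the "first in N out of 7 felonies" figure. These are raw counts, so they do not adjust for differences in borough population.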

How are the 7 sins distributed across the 5 boroughs in 2015?

The density maps below visualize crime in each of the 5 boroughs. It turns out that each borough has its own distinct pattern of hot spots.

Offenses in Manhattan in 2015

offenses_map_in_manhattan_2015

Offenses in Queens in 2015

offenses_map_in_queens_2015

Offenses in Bronx in 2015

offenses_map_in_bronx_2015

Offenses in Brooklyn in 2015

offenses_map_in_brooklyn_2015

Offenses in Staten Island in 2015

offenses_map_in_staten_island_2015

Conclusion

New Yorkers may be tempted to rely solely on the NYPD to solve these problems. But if each New Yorker is aware of the time and space patterns identified in this report, he or she can take proper precautions, and things may be different.

Future work

NYC is getting safer and safer. But we should not be satisfied with this. Eradicating felonies is a long-term mission.  I believe more work can be done, including but not limited to:

  • Investigate how the density maps have evolved over the past 10 years. Hotspots might dilute, merge, shrink, or inflate. If such patterns can be extracted, more valuable insights might emerge.
  • Investigate incidents at a finer-grained level, such as block or street level, and model how other factors such as the economy, average income, and employment rate affect the felonies.

About Author

Chuan Sun

Chuan is interested in uncovering the relationships between things. He likes to seek order from chaos. Previously, he worked on an unannounced project at Amazon in Seattle as a software engineer. The project is related to machine learning and...

Leave a Comment

Kleyn October 11, 2016
Shouldn't the difference between days' crime rates be validated with a t-test in order to check statistical significance?
