Data Visualization on Crimes in New York City
The skills demonstrated here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
Background & Purpose
Just 20 years ago, the data shows, the streets of New York were racked with all kinds of crime, from murders and drug deals to grand larceny and petty theft. Since the late 90s, the city has seen an encouraging trend of steadily declining crime rates. Even so, public safety remains one of the top concerns of the city's more than eight million residents, as well as of the hundreds of thousands of newcomers and visitors each year who do not know the city's neighborhoods as well as the locals do.
So I set out to provide an intuitive, easy-to-use crime data visualization tool built from the extensive police report database available on the NYC Open Data website.
Currently, the dataset contains close to six million crime records collected over a period of more than 10 years, and new data is added annually. For illustration purposes and practical reasons, I decided to take a random subset of 100,000 observations from the dataset, which is more in line with the scope of this project.
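As a rough illustration of how a fixed-size random subset can be drawn from a file too large to load comfortably, here is a minimal sketch using reservoir sampling. This is not necessarily how the subset for this project was produced; the function name, the `record_id` field, and the toy record generator are all assumptions made for the example.

```python
import random

def reservoir_sample(rows, k, seed=0):
    """Return k rows chosen uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    reservoir = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Keep the new row with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = row
    return reservoir

# Hypothetical stand-in for streaming the full crime file row by row.
all_records = ({"record_id": i} for i in range(10_000))
subset = reservoir_sample(all_records, k=100)
print(len(subset))
```

Because the rows are consumed one at a time, the same function would work on a CSV reader over the full dataset with `k=100_000`, without holding all six million records in memory.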
Without further ado, let me introduce the actual functionality of my app. First, we have the overview tab, which contains five sub-tabs: Crime by Type, Crime by Month, Crime by Hour, Crime by Borough and Crime by Premises. Each of these tabs is structured in a very similar fashion: users can browse yearly data using the filter at the top, and the bar chart below displays the total crime count for each group or category.
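The aggregation behind each of these bar charts can be sketched in a few lines: filter the records to the selected year, then count them by the chosen grouping field. The field names (`year`, `month`, `borough`) and the sample records below are hypothetical and only serve to show the idea.

```python
from collections import Counter

# Hypothetical records; the real dataset uses its own column names.
records = [
    {"year": 2015, "month": 1, "borough": "BROOKLYN"},
    {"year": 2015, "month": 7, "borough": "BROOKLYN"},
    {"year": 2015, "month": 7, "borough": "QUEENS"},
    {"year": 2016, "month": 7, "borough": "BRONX"},
]

def counts_by(records, field, year=None):
    """Count records per value of `field`, optionally restricted to one year."""
    selected = (r for r in records if year is None or r["year"] == year)
    return Counter(r[field] for r in selected)

# Data for a "Crime by Month" chart with the year filter set to 2015.
by_month_2015 = counts_by(records, "month", year=2015)
# Data for a "Crime by Borough" chart across all years.
by_borough = counts_by(records, "borough")
```

Each resulting counter maps a category to a bar height, so the same helper could feed all five sub-tabs by changing only the `field` argument.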
These bar charts are designed to give users a basic yet solid understanding of how crime is distributed in NYC. Sometimes trends can be spotted more easily on charts as simple as these! For example, if we take a look at Crime by Month:
It turns out that when the weather is cold during the winter and early spring months, criminals take a break too. Summer months usually see higher crime counts.
Surprise, surprise: residential crimes make up about 40% of all crimes in the city. Your home may not be as safe as you thought it was!
The interactive map is a great feature for users who are after every bit of detail on crimes that happened in the city in the last ten years. They can filter by crime type, premises and time period to focus on the subset of the data they are interested in. Users can also use the "Cluster" option to group nearby crimes and keep the map from getting too crowded. After zooming in, users can click on each individual crime spot to get more detailed information on the crime. The optional "Point" layer may help users select individual crime records more easily.
As with the main interactive map, users of the heatmap can use the filters on the panel to focus on the subset of the data they are most interested in. But unlike the interactive map, the heatmap does not plot every instance of crime from the database; instead, it shows the density of crime occurrences across different neighborhoods in NYC. This function gives users greater flexibility and a clearer view of the big picture.
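To make the idea of density concrete, here is a minimal sketch that bins crime locations into a regular latitude/longitude grid and counts the points in each cell. This simple binning is a stand-in, not the rendering method used by the app's heatmap layer; the function name, the 0.01 degree cell size, and the sample coordinates are assumptions for illustration.

```python
import math
from collections import Counter

def density_grid(points, cell_deg=0.01):
    """Count (lat, lon) points per grid cell of side `cell_deg` degrees."""
    grid = Counter()
    for lat, lon in points:
        # Integer cell index along each axis; floor handles negative longitudes.
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        grid[cell] += 1
    return grid

# Hypothetical coordinates: two points close together, one farther away.
points = [
    (40.7128, -74.0060),
    (40.7131, -74.0052),
    (40.7580, -73.9855),
]
grid = density_grid(points)
densest_cell, densest_count = grid.most_common(1)[0]
```

Cells with higher counts would be drawn in warmer colors. A real heatmap layer typically also smooths the counts so that neighboring cells blend into each other.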
Crime Rates for the Five Boroughs
Next, I decided to visualize crime rates over the past 10 years across the five boroughs. Users can select which boroughs to plot through the filters at the top. Once the data is fed into the plotly chart, users can temporarily show or hide a borough by clicking on its legend entry, which makes comparisons easier. Looking at the crime rate data, some may be surprised that Manhattan's crime rate is comparable to that of the Bronx, and that both are substantially higher than those of Queens and Brooklyn. This is why data visualization is important: it may correct some of your long-held misconceptions.
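A crime rate differs from a raw crime count because it is normalized by population. As a minimal sketch, assuming the rate is expressed as crimes per 100,000 residents (the write-up does not state the exact definition), the calculation looks like this. The borough names and all numbers below are made-up placeholders, not real NYC figures.

```python
def crime_rate(crime_count, population, per=100_000):
    """Crimes per `per` residents; the per-100,000 convention is an assumption."""
    if population <= 0:
        raise ValueError("population must be positive")
    return crime_count * per / population

# Placeholder values chosen only to illustrate the arithmetic.
counts = {"Borough A": 1_200, "Borough B": 1_500}
populations = {"Borough A": 400_000, "Borough B": 1_000_000}

rates = {name: crime_rate(counts[name], populations[name]) for name in counts}
highest = max(rates, key=rates.get)
```

In this toy example, Borough A has fewer crimes than Borough B but a higher rate, because its population is much smaller. The same effect can make a borough's rate look very different from what its raw counts suggest.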