Data Visualization on NYC Motor Vehicle Collisions
The skills the author demonstrated here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
As a New Yorker, I walk to school and work, so I have often wondered about the safety of the streets I pass through. We hear about drivers plowing into pedestrians. In May 2017 this happened at Times Square, and it resulted in the death of a young girl. Since then I have had a constant fear in my mind: what if someone rams a car into the walkway? With this fear in mind, I decided to visualize the NYC Motor Vehicle Collision dataset to look for significant insights.
In New York, approximately 4,000 New Yorkers are seriously injured and more than 250 are killed each year in traffic crashes. Being struck by a vehicle is the leading cause of injury-related death for children under 14, and the second leading cause for seniors. On average, vehicles seriously injure or kill a New Yorker every two hours.
This status quo is unacceptable. The City of New York must no longer regard traffic crashes as mere "accidents," but rather as preventable incidents that can be systematically addressed. No level of fatality on city streets is inevitable or acceptable. This Vision Zero Action Plan is the City's foundation for ending traffic deaths and injuries on our streets.
Vision Zero is a program created by New York City Mayor Bill de Blasio in 2014. Its purpose is to cut the number of traffic fatalities in half by 2025. On January 15, 2014, Mayor de Blasio announced the launch of Vision Zero in New York City, based on a program of the same name that was implemented in Sweden. The original Swedish theory holds that pedestrian deaths are not so much "accidents" as a failure of street design.
It’s been five years since New York City signed the strongest open data law in the country and then launched an open data website called NYC Open Data. The site includes data for NYC Motor Vehicle Collision from 2012 to 2017 that is updated every month. Currently, the dataset has 1.08 million rows and 29 columns. More information about the dataset can be found here.
The dataset provided is clean, but it has missing values, so we have to remove the NAs before producing most of the visualizations. Each observation (row) in the dataset represents one accident. The date column has no missing values. Because the date is stored as a character string, we have to convert it to a date type in order to extract the day, month, and year. Before removing NAs, we can visualize how many collisions occur in each year.
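The date conversion and per-year count described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the charts: the column name `DATE`, the `MM/DD/YYYY` format, and the sample rows are assumptions made for the example.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Tiny inline sample standing in for the real CSV export (values are made up;
# the column names and the MM/DD/YYYY date format are assumptions).
SAMPLE_CSV = """DATE,TIME,BOROUGH
07/01/2012,8:15,BROOKLYN
03/14/2016,16:40,MANHATTAN
03/18/2016,17:05,
01/06/2017,8:00,QUEENS
"""

def parse_date(text):
    """Convert an MM/DD/YYYY character string into a datetime.date."""
    return datetime.strptime(text, "%m/%d/%Y").date()

def collisions_per_year(rows):
    """Count rows per calendar year; each row represents one collision."""
    return Counter(parse_date(row["DATE"]).year for row in rows)

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
per_year = collisions_per_year(rows)
for year in sorted(per_year):
    print(year, per_year[year])
```

The row with an empty borough is still counted here, which mirrors counting collisions per year before any rows with missing values are dropped.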
The first visual, a line chart of the number of accidents versus year, shows the change in accidents from 2012 to 2017. We can clearly see that after the Vision Zero initiative began in 2014, the number of accidents kept increasing and reached its peak in 2016. There is a huge drop in the number of accidents from the beginning of 2017 until the present.
The next chart shows the number of collisions in each year by borough. It reveals that Brooklyn has the highest number of collisions in each year, and Manhattan ranks second, closely followed by Queens.
Which day of the week has the highest number of collisions?
Friday! In every year, Friday has the highest number of collisions. One possible explanation is that people are keen to get home after finally finishing the work week.
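A day-of-week breakdown like this one can be derived from the parsed dates. The sketch below uses a handful of made-up dates in the assumed `MM/DD/YYYY` format; the function name `busiest_weekday_by_year` is hypothetical.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Made-up collision dates (assumed MM/DD/YYYY format), not real data.
sample_dates = ["03/18/2016", "03/25/2016", "03/21/2016",
                "01/06/2017", "01/09/2017", "01/13/2017"]

def busiest_weekday_by_year(date_strings):
    """Return {year: weekday name with the most collisions in that year}."""
    counts = {}  # year -> Counter of weekday names
    for text in date_strings:
        d = datetime.strptime(text, "%m/%d/%Y")
        # weekday() returns 0 for Monday through 6 for Sunday.
        counts.setdefault(d.year, Counter())[WEEKDAYS[d.weekday()]] += 1
    return {year: c.most_common(1)[0][0] for year, c in counts.items()}

busiest = busiest_weekday_by_year(sample_dates)
print(busiest)
```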
According to the heat map above of hour of day by borough, Brooklyn has its maximum number of collisions from around 4 pm to 5 pm. Manhattan’s highest number of collisions occurs from 2 pm to 4 pm. Queens has the most collisions at 8 am, as well as from 4 pm to 5 pm.
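The table behind an hour-by-borough heat map is just a count of collisions for each (borough, hour) pair. Here is a minimal sketch, assuming the time is stored as an `H:MM` string in 24-hour form and that rows with a missing borough are dropped; the records and function names are invented for illustration.

```python
from collections import Counter

# Made-up (borough, time) records; the "H:MM" 24-hour format is an assumption.
records = [
    ("BROOKLYN", "16:30"),
    ("BROOKLYN", "16:55"),
    ("BROOKLYN", "8:10"),
    ("MANHATTAN", "14:05"),
    ("QUEENS", "8:45"),
    ("", "9:00"),  # missing borough, skipped below
]

def hour_by_borough(records):
    """Count collisions for every (borough, hour) cell of the heat map."""
    table = Counter()
    for borough, time_text in records:
        if not borough:
            continue  # drop rows with a missing borough
        hour = int(time_text.split(":")[0])
        table[(borough, hour)] += 1
    return table

def peak_hour(table, borough):
    """Return the hour with the most collisions for one borough."""
    hours = {h: n for (b, h), n in table.items() if b == borough}
    return max(hours, key=hours.get)

table = hour_by_borough(records)
print(peak_hour(table, "BROOKLYN"))
```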
In the maps above, the heat map (left) shows a high number of collisions along almost all the avenues. It also shows the maximum number around Chinatown and the approach to the Williamsburg Bridge. The cluster map (right) shows that the Lower East Side has the maximum number of collisions (1,225), followed by Midtown (1,060) and then Chelsea with 971 collisions.
Let's visualize how many pedestrians were injured.
In the bar plot above (left) of pedestrians injured per year, Brooklyn ranks the highest, followed by Manhattan. For Manhattan, the number of pedestrians injured gradually decreases over the years, but in Brooklyn we see about the same number of injured pedestrians in both 2015 and 2016. In the heat map (right), 42nd Street and 8th Avenue shows the maximum number of pedestrians injured. In second place is 14th Street, which is a major cross street. Canal Street also shows more injured pedestrians than other places.
The number of injured pedestrians and the number of collisions in Manhattan are both gradually decreasing overall, but the next line chart is quite shocking!
The line plot above shows the ratio of injured pedestrians to the total number of accidents per year. Surprisingly, Manhattan has a huge spike from 2016 to 2017. Even though the number of accidents is decreasing in 2017, the number of injured pedestrians is not. In 2016 Manhattan had 227,763 collisions and 2,085 injured pedestrians. In 2017 Manhattan had 115,499 collisions, which is about half the 2016 figure, and 1,046 injured pedestrians. That's shocking!
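The ratio itself is simple arithmetic: injured pedestrians divided by collisions, scaled here to a rate per 1,000 collisions for readability. The sketch below plugs in only the four figures quoted in the paragraph above; the per-1,000 scaling and the function name are choices made for this example.

```python
# (collisions, injured pedestrians) as quoted in the text for each year.
figures = {
    2016: (227_763, 2_085),
    2017: (115_499, 1_046),
}

def injured_per_1000(collisions, injured):
    """Injured pedestrians per 1,000 collisions."""
    return 1000 * injured / collisions

rates = {year: injured_per_1000(*vals) for year, vals in figures.items()}
for year, rate in sorted(rates.items()):
    print(year, round(rate, 2))
```

With these quoted totals, both years come out at roughly 9 injured pedestrians per 1,000 collisions, so the counts feeding the chart are worth checking against the figures in the text.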
What are the contributing factors to pedestrian injury?
Some major contributors to injury are identified in the heat map above. The major factors are driver inattention and failure to yield right-of-way. Other major factors, indicated by a yellow hue, include backing unsafely and even pedestrian/bicyclist/other pedestrian error or confusion.
The number of collisions is decreasing over the years, but the number of pedestrians getting injured is not. Millions of tourists visit New York City every year, and every day millions of New Yorkers use a walkway as their daily route to home or work. The government has to take extra steps for the safety of pedestrians. It should reduce the speed limit, introduce slow zones, and increase enforcement. To address pedestrian confusion, it should provide clear signs, more stop lights for pedestrians, and longer pedestrian crossing times. It should also protect walkways with metal safety poles, because nowadays anybody could ram a car into a walkway or plow into pedestrians.