Data Visualizing Traffic Violations in Montgomery County

Posted on Aug 8, 2016

Introduction

If you could comprehensively view all traffic violations in your home county, what would you want to know?  Consider the experience of the following hypothetical Marylander:

Data/ Scenario

Mr. Smith lives in Clarksburg, Maryland, and works at the Lakeforest Mall--about 15 minutes down the Washington National Pike.  He commutes to work every day during rush hour.  Mr. Smith is generally a very patient individual, but recently he's been sleeping poorly, waking up late for work, and having to rush to make it on time.  He has picked up several speeding tickets in the past month, and he's racking up points on his driver's license.  This cannot continue.

Bored and wide awake at 2 am, he stumbles upon an app that uses Data.gov's publicly available dataset on traffic violations in Montgomery County, Maryland, to plot all electronic violations in the county for June and July 2016.

The app features an interactive map zoomed in on Montgomery County, MD, with all traffic violations plotted as blue dots. The sidebar offers filtering criteria that let the viewer sort and visualize the data according to their own interests.  Above the map, a scrollbar adds red coloring to certain types of violations.  Lastly, the total number of violations and the number of red violations are displayed above the map as well (the red violations count defaults to 1, as a 0-length dataset produces an error).
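The filter-then-color logic described above can be sketched in a few lines. This is only an illustration, not the app's actual code: the `Violation` fields, the `filter_violations` and `count_red` helpers, and the sample rows are all hypothetical. It also shows one way a red count could be computed so that an empty selection yields 0 rather than an error.

```python
from dataclasses import dataclass


@dataclass
class Violation:
    # Hypothetical fields; the real dataset's column names may differ.
    hour: int        # hour of day, 0-23
    weekday: str     # e.g. "Mon"
    alcohol: bool    # True if the violation involved alcohol
    speeding: bool   # True if the violation was for speeding


def filter_violations(rows, hours=(0, 23), weekdays=None, speeding_only=False):
    """Return the rows that pass the sidebar-style filters."""
    lo, hi = hours
    return [
        r for r in rows
        if lo <= r.hour <= hi
        and (weekdays is None or r.weekday in weekdays)
        and (not speeding_only or r.speeding)
    ]


def count_red(rows, is_red):
    """Count rows that would be colored red; an empty selection gives 0."""
    return sum(1 for r in rows if is_red(r))


# Made-up sample rows, for illustration only (not from the real dataset).
sample = [
    Violation(hour=8, weekday="Mon", alcohol=False, speeding=True),
    Violation(hour=2, weekday="Sat", alcohol=True, speeding=False),
    Violation(hour=9, weekday="Tue", alcohol=False, speeding=False),
    Violation(hour=23, weekday="Fri", alcohol=True, speeding=True),
]

# Morning rush hour, speeding tickets only, with alcohol violations in red.
morning = filter_violations(sample, hours=(6, 9), speeding_only=True)
shown = len(morning)
red = count_red(morning, lambda r: r.alcohol)
print(shown, red)
```

Keeping the filter and the coloring rule as separate steps mirrors the layout described above: the sidebar narrows what is plotted, and the control above the map decides which of the plotted points turn red.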

Mr. Smith thinks to himself, "Hmm... I'm probably in this."  He then decides to see what he can make of the display.  At the bottom of the sidebar, he notices the "Speeding" scroll menu and decides to visualize just the speeding tickets.  He adjusts the hour range so that he's looking at speeding tickets during morning rush hour.  He finds the Lakeforest Mall, and immediately feels some sympathy--look at how many speeding tickets are given out around the Lakeforest Mall in the morning (compared to the surrounding area)!

Lakeforest Mall

The Lakeforest Mall is the centered pink and gray area, surrounded by a high concentration of traffic violation dots.

Observations

Still, Mr. Smith knows that sharing his experience with other people does not reduce the points on his license.  However, he does notice interesting patterns in the speeding tickets on the two main roads (the Eisenhower Memorial Highway and Frederick Road) from Clarksburg (where he lives) to the Lakeforest Mall (where he works).  He decides to look at weekdays only, so that everything he sees pertains to his commute.

Eisenhower Memorial Highway (red) has speeding violations in several clusters spread out along its length, whereas Frederick Road (yellow) seems to have its violations concentrated towards the bottom. (The Lakeforest Mall is just beyond the bottom of the screen.)

Mr. Smith notices that until late in his drive, far more speeding violations are given out on Eisenhower Memorial Highway than on Frederick Road.  He also notices that speeding tickets on these two roads are clustered together.  On Eisenhower, they seem to be around intersections and highway merges, and on Frederick, they seem to be in one spot near the bottom of the screen (probably a speed trap there!).  Mr. Smith feels much more confident with his new knowledge regarding which roads are high risk for speeders--and more importantly, where on those roads he needs to pump the brakes.

~~~~~~~~~~~~~~~~~

Another Scenario

OK, so although we've made Mr. Smith a more informed speeder, it's not clear we've done anything good for the world.  However, consider Mr. Smith's friend, Mr. White.

Mr. White is Mr. Smith's college roommate.  Mr. White just got a new job and is moving to Montgomery County, Maryland.  He has kids who are nearing driving age, and he is concerned about driver safety in his neighborhood.  He wants to gather more information before deciding where to buy a house.  Mr. Smith shows Mr. White the new app he found.

After examining the distribution of all violations in the county, Mr. White decides that he is particularly concerned about drunk driving--he doesn't want his kids learning to drive on roads with intoxicated drivers.  He decides to color all the traffic violations involving alcohol red so he can compare the distribution of alcohol-related incidents with the general distribution of traffic violations.

Alcohol-related traffic violations in red.

Drunk Driving

He is thankful to find that, though it might be hard to avoid violations in general, alcohol-related violations are clearly clustered around particular points--if he avoids those neighborhoods, he can likely reduce the amount of time his kids spend on the road with intoxicated drivers.  

Mr. White wants to do better, however.  He knows that his kids are in school on weekdays and will do most of their driving on weekends, so he decides to look at the same distribution for Friday-Sunday.  He sees the same clusters just as pronounced and feels very confident that he knows where most of the drunk driving takes place in his future home.  

Comparing Weekends to Weekdays

Also, he notices that the red violation count doesn't decrease as much as he expected when he cut out more than half of the week.  He decides to compare his weekend visualization to the same view for Monday-Thursday.

Though more violations occur Monday-Thursday (4 days), more alcohol-related violations occur Friday-Sunday (3 days), both proportionally and absolutely.

He notices that even though Monday-Thursday is a larger stretch by a day, Friday-Sunday has substantially more alcohol-related violations.  He becomes concerned that his children will be doing their driving when the largest number of intoxicated drivers are on the road!  To figure out what to do about this, he decides to investigate further: exactly when on the weekends are the drunk drivers out and about?

Midnight to 7:00am (end of hour 6) accounts for 382 of the 522 (73%) alcohol-related violations that occurred on weekends during June and July 2016 in Montgomery County, Maryland.  11pm to midnight accounts for another 43, or 8%.

Adjusting the hour filter, Mr. White figures out that 81% of weekend alcohol-related violations in the county occur between 11pm and 7am.  Mr. White has just discovered his children's future curfew!
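The percentages above follow directly from the counts quoted in the caption. A quick check, using only those three numbers (the variable names and the `pct` helper are just for illustration):

```python
# Counts quoted in the caption above (weekends, June and July 2016).
weekend_alcohol_total = 522
midnight_to_7am = 382
eleven_pm_to_midnight = 43


def pct(part, whole):
    """Share of `whole` represented by `part`, rounded to a whole percent."""
    return round(100 * part / whole)


early_share = pct(midnight_to_7am, weekend_alcohol_total)
late_share = pct(eleven_pm_to_midnight, weekend_alcohol_total)
combined_share = pct(midnight_to_7am + eleven_pm_to_midnight, weekend_alcohol_total)

print(early_share, late_share, combined_share)  # 73 8 81
```

So 425 of the 522 weekend alcohol-related violations fall in the eight hours from 11pm to 7am.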

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Data Observations

Hopefully it's clear how plotting traffic violations and allowing users to filter, color, and zoom can be informative about local areas, as well as general time and area distributions.  Notably, alcohol-related violations did not follow the general distribution of traffic violations in the county but rather were concentrated around hot spots.  Speeding violations, however, did:

Speeding violations more closely resemble the layout of violations in general than do alcohol-related incidents.

The app also includes filterable inputs for gender and race.  However, these inputs are far better for comparing other variables within a gender or race category than for comparing across multiple gender or race categories, as doing so is prone to confounders like population density and representation on the road.

Lastly, I included the day of the month input mostly just to test the commonly held belief that police give out more violations at the end of the month to fill their quotas.  Quickly comparing violation counts for days at the end of the month and days at the beginning, I came to no such conclusion!
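A rough version of that start-of-month versus end-of-month comparison could look like the sketch below. The `start_vs_end_counts` helper, the five-day window, and the sample dates are all assumptions made for illustration; they are not taken from the app or the dataset.

```python
from calendar import monthrange
from collections import Counter
from datetime import date


def start_vs_end_counts(dates, window=5):
    """Count violations in the first and last `window` days of each month."""
    counts = Counter()
    for d in dates:
        days_in_month = monthrange(d.year, d.month)[1]
        if d.day <= window:
            counts["start"] += 1
        elif d.day > days_in_month - window:
            counts["end"] += 1
    return counts["start"], counts["end"]


# Made-up violation dates, for illustration only.
sample_dates = [
    date(2016, 6, 1), date(2016, 6, 3), date(2016, 6, 15),
    date(2016, 6, 28), date(2016, 7, 2), date(2016, 7, 30), date(2016, 7, 31),
]

start, end = start_vs_end_counts(sample_dates)
print(start, end)
```

Measuring the end-of-month window from each month's actual length keeps June (30 days) and July (31 days) comparable.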

About Author

William Bartlett

Will Bartlett is a History of Science and Medicine Major from Yale University who recently took a leave of absence from medical school to explore data science. As an undergraduate, he studied the role of data in medicine...

Leave a Comment

Caroline Fernandes April 30, 2019
I liked the way you used these visualizations to explain regular activities. The story told by these graphs is quite interesting!
