From Shiny to the Crime Forecasting Challenge by the National Institute of Justice
I was reading the news when I came across the article above: a 70-year-old woman in Portland was raped in broad daylight, and the rapist went back to mowing the lawn after committing such a heinous crime.
Thomas, in and out of jail from 2008 to 2013, was on parole with minimum surveillance, as set by the judge. How is this surveillance level decided? Through a questionnaire of 100+ questions, which many judicial systems in the United States use to assess the further risk a criminal poses to society.
Thomas was out as a low-risk offender when he committed the rape. The reason the analytical system got it wrong is simple: Thomas gave his age as 19 in the questionnaire when he was actually 50, which reduced his assessed risk of committing another crime to low. I wanted to see whether there are more such patterns in crimes committed by the same person, and whether this analytical system works or is just leading us the wrong way.
I began gathering data; my first source was www.oregonstate.edu. I also wanted to create a way to visualize crimes in Portland and then use it to analyze which areas should be under heavier police patrol.
My data set finally looks like:
The one thing I was missing was coordinates. I found out that the police department in Portland uses a UTM coordinate system with 7-digit and 6-digit values. The Google Earth tools menu has a function that converts between latitude/longitude and Northing/Easting, so I resolved this issue by converting the coordinates to latitude and longitude.
What happened next was more painful. If you have been to Portland or seen it on a map, it looks like my plot: a river runs through the middle, and an H-shaped highway passes through the city, shown in grey.
The dots are all the crimes recorded in 2012 alone. When I began plotting them on Google Earth, the points appeared in Alberta, Canada rather than in Portland, USA. I believe the shift was introduced into the data set on purpose because of privacy concerns. I tried several fixes. Initially I thought it was a zone shift, but that hypothesis failed. I then tried several formulas until the Euclidean principle finally gave a positive result: I reduced each latitude and longitude by the recorded distance between Alberta and Portland. Result:
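For readers who want to try a similar correction, here is a minimal Python sketch of a constant-offset shift. It is not the exact formula I used: it assumes the offset can be estimated as the difference between the centroid of the misplaced points and a reference point in Portland. The reference coordinates, the sample points, and the function names are all hypothetical.

```python
# Approximate centre of Portland, OR (an assumed reference point).
PORTLAND_CENTER = (45.52, -122.68)


def centroid(points):
    """Return the mean (lat, lon) of a list of (lat, lon) pairs."""
    lat = sum(p[0] for p in points) / len(points)
    lon = sum(p[1] for p in points) / len(points)
    return lat, lon


def shift_points(points, target=PORTLAND_CENTER):
    """Translate all points so that their centroid lands on `target`.

    Every point is moved by the same offset, so the relative layout
    of the incidents is preserved.
    """
    c_lat, c_lon = centroid(points)
    d_lat = c_lat - target[0]
    d_lon = c_lon - target[1]
    return [(lat - d_lat, lon - d_lon) for lat, lon in points]


# Made-up points that landed far north-east of where they belong.
misplaced = [(53.90, -116.50), (53.94, -116.58), (53.98, -116.54)]
corrected = shift_points(misplaced)
print(centroid(corrected))
```

A plain subtraction like this only works if the whole data set was displaced by one constant offset; a wrong projection zone would distort the points differently.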
The aim was to create an application where everyone can visualize which areas are more or less affected by crime, so I used R Shiny to build an app that displays the crimes. It is much more interactive than what I had achieved so far.
The application allows you to move anywhere on the map, and the control panel lets you visualize crimes between specific dates. You can also choose which type of crime you would like to visualize. A link to the app is given here.
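The filtering behind such a control panel is simple. The sketch below shows the idea in Python rather than in the R code of the app itself; the record layout, the sample records, and the `filter_crimes` helper are assumptions made for illustration.

```python
from datetime import date

# Made-up records; the real data set has more columns.
crimes = [
    {"type": "Burglary", "date": date(2012, 4, 3)},
    {"type": "Shooting", "date": date(2012, 4, 17)},
    {"type": "Burglary", "date": date(2012, 7, 9)},
    {"type": "Accident", "date": date(2013, 1, 21)},
]


def filter_crimes(records, start, end, crime_type=None):
    """Keep records dated between start and end (inclusive).

    If crime_type is given, keep only records of that type.
    """
    return [
        r for r in records
        if start <= r["date"] <= end
        and (crime_type is None or r["type"] == crime_type)
    ]


in_2012 = filter_crimes(crimes, date(2012, 1, 1), date(2012, 12, 31))
burglaries_2012 = filter_crimes(
    crimes, date(2012, 1, 1), date(2012, 12, 31), crime_type="Burglary"
)
print(len(in_2012), len(burglaries_2012))
```

In the app, the two dates and the crime type would come from the control panel inputs instead of being hard-coded.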
By visualizing the crimes by date, I could see crimes repeating, and their patterns over the years, in the same locations.
I found that Portland has a total police force of 1,000 active on-duty cops and 200 reserves, plus 300 civilian agents. According to the Census Bureau, the population of Portland is 609,892.
Counting active and reserve officers, that makes roughly 1 cop for every 510 people, whereas the figure I found for NYC is 1 cop for 58 people. Another reason I found for the failing system is money: how long prisons can keep criminals is based not just on the seriousness of the crime but also on the amount of money it takes to keep them in detention.
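The Portland ratio can be checked with a few lines of Python. The numbers are the ones quoted above; the variable names are only illustrative.

```python
population = 609_892
active_officers = 1_000
reserve_officers = 200

# Residents per officer, counting active and reserve officers.
per_officer_all = population / (active_officers + reserve_officers)

# Residents per officer, counting active officers only.
per_officer_active = population / active_officers

print(round(per_officer_all))     # 508
print(round(per_officer_active))  # 610
```

The "1 cop for about 510 people" figure therefore corresponds to counting the reserves as well; with active officers only, the ratio is closer to 1 to 610.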
On average, $69 is spent on a single prisoner every day, so the cost of keeping someone in prison is very high. This adds up to $59 billion annually for all the states combined.
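As a rough sanity check on these two figures, the short Python sketch below converts the daily cost into a yearly cost per prisoner and works out how many prisoners the national total would imply. It assumes a 365-day year and that the $59 billion is spent entirely at the $69 daily rate, which is a simplification.

```python
cost_per_day = 69                 # dollars per prisoner per day
national_total = 59_000_000_000   # dollars per year, all states combined

# Yearly cost of keeping one person in prison.
cost_per_year = cost_per_day * 365
print(cost_per_year)  # 25185

# Number of prisoners implied if the whole total were spent at that rate.
implied_prisoners = national_total / cost_per_year
print(round(implied_prisoners / 1_000_000, 2), "million")
```

So a single prisoner costs about $25,000 a year, and the two figures together imply a prison population in the low millions.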
The failure of the current analytical system has also led to a rise in crime over the years.
This is crime in April 2012: accidents numbered about 1,000 a month and burglary cases about 1,800, while more serious crimes such as shootings and stabbings were as low as 10 to 20.
This is crime in August 2016:
The overall rise can now be seen clearly. What caught my attention was the rise in burglaries and shootings: the number of shooting cases almost doubled to 50, and burglary cases rose to 2,300 a month from 1,800 in April 2012.
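To put the burglary increase in relative terms, here is a small Python sketch using the two monthly figures quoted above. The `percent_change` helper is just an illustrative name.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


burglary_rise = percent_change(1800, 2300)
print(round(burglary_rise, 1))  # 27.8
```

That is a rise of a little over a quarter in monthly burglaries between April 2012 and August 2016. Because of the seasonal effects in the data, comparing an April figure with an August figure overstates or understates the underlying trend, so this is only a rough indication.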
There are a lot of seasonal patterns in the data: overall crime is highest in July every year and lowest in December and January. One reason is the weather, since crime slows down when it is cold. Yet the number of shooting cases peaks in October for some reason, and burglaries are highest in July and August. Overall, things in Portland get worse in the second half of every year.
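Seasonal patterns like these come from counting incidents per calendar month across all years. Below is a minimal Python sketch of that counting step; the incident dates are made up and the function name is hypothetical.

```python
from collections import Counter
from datetime import date

# Made-up incident dates spanning two years.
incidents = [
    date(2015, 7, 4), date(2015, 7, 19), date(2015, 12, 25),
    date(2016, 1, 10), date(2016, 7, 2), date(2016, 7, 30),
    date(2016, 10, 31),
]


def incidents_per_month(dates):
    """Count incidents by calendar month (1-12), pooling all years."""
    counts = Counter(d.month for d in dates)
    return {month: counts.get(month, 0) for month in range(1, 13)}


per_month = incidents_per_month(incidents)
peak_month = max(per_month, key=per_month.get)
print(per_month)
print("Peak month:", peak_month)
```

Running the same count separately for each crime type is what shows that different crimes can peak in different months.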
The analysis left me with some conclusions, yet crime in Portland remains unchanged today. I decided to take a stand and found a way to contribute to this issue: the NIJ launched a Crime Forecasting Challenge. The goals of the challenge are given below.
- Encourage "nontraditional" crime forecasting researchers to compete against more "traditional" crime forecasting researchers.
- Compare available crime forecasting methods.
- Improve place-based crime forecasting.
My next step is to use the kNN algorithm to train on my data set and build a predictive model that can help the police department concentrate its resources in the regions most likely to be active crime regions on a given day. The model could stay accurate for up to about 3 months, until the patterns in crime begin to change again; after that we would need a new data set to train a new model.
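To give an idea of what kNN does, here is a minimal pure-Python sketch, not the model I will finally build. It assumes each training example is a (latitude, longitude) pair labelled "high" or "low" for crime activity, uses plain Euclidean distance on the raw coordinates, and takes a majority vote among the k nearest examples. The training points and the `knn_predict` function are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote of its k nearest
    training examples. `train` is a list of ((lat, lon), label) pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up training data: two small clusters of locations.
train = [
    ((45.52, -122.68), "high"),
    ((45.53, -122.67), "high"),
    ((45.51, -122.69), "high"),
    ((45.60, -122.50), "low"),
    ((45.61, -122.52), "low"),
    ((45.59, -122.51), "low"),
]

print(knn_predict(train, (45.525, -122.675)))  # high
print(knn_predict(train, (45.60, -122.51)))    # low
```

A real model would likely add features such as the day of the week or the month, scale them so that no single feature dominates the distance, and choose k by validating on held-out data.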