NYC Data Science Academy| Blog

Shiny to Crime Forecasting Challenge by National Institute of Justice

Arjun Singh Yadav
Posted on Feb 6, 2017

portls

I was reading the news when I came across the article above: a 70-year-old woman in Portland was raped in broad daylight, and the rapist went back to mowing the lawn after committing such a heinous crime.

Thomas, in and out of jail from 2008 to 2013, was on parole with minimum surveillance as decided by the judge. How is this surveillance level decided? By asking a set of 100+ questions, on the basis of which many judicial systems in the United States assess the further risk a criminal poses to society.

Thomas was out, rated as low risk, when he committed the rape. The reason the analytical system got it wrong is simple: Thomas lied about his age in the questionnaire, claiming to be 19 when he was 50, which reduced his assessed risk of committing another crime to low. I wanted to see whether there are more such patterns in crimes committed by the same person, and whether this analytical system works or is just leading us the wrong way.

I began gathering data; my first source was www.oregonstate.edu. I also wanted to create a way to visualize the crimes in Portland and use it to analyze which areas could be placed under heavier police patrol.

My data set finally looks like this:

dataset

The one thing I was missing was coordinates. I found out that the police department in Portland uses a 7-digit and 6-digit UTM coordinate system. The Google Earth tools menu has a function that converts between latitude/longitude and Northing/Easting, so I resolved this issue by converting the coordinates to latitude and longitude.

What happened next was more painful. If you have been to Portland or seen it on a map, it looks like how my plot came out: a river runs through the middle, and an H-shaped highway passes through the city, shown in grey.

Sketch

The dots are all the crimes recorded in 2012 alone. When I began plotting them on Google Earth, the points appeared in Alberta, Canada rather than in Portland, USA. I believe the shift was introduced into the data set on purpose due to privacy concerns. I tried several fixes: initially I thought it was a zone shift, but that hypothesis failed. I then tried several formulas until the Euclidean principle finally showed a positive result: I reduced each latitude and longitude by the recorded distance between Alberta and Portland. Result:
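The correction described above amounts to subtracting one constant offset from every point. The following is a minimal sketch of that idea, not the author's actual code: the function names and the reference coordinates are hypothetical, chosen only to illustrate a shift from roughly Alberta to roughly Portland.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def estimate_offset(shifted_ref: Point, true_ref: Point) -> Point:
    """Difference between where a reference point was plotted and where it belongs."""
    return (shifted_ref[0] - true_ref[0], shifted_ref[1] - true_ref[1])


def correct_points(points: List[Point], offset: Point) -> List[Point]:
    """Subtract the same offset from every point (a plain translation)."""
    return [(lat - offset[0], lon - offset[1]) for lat, lon in points]


# Hypothetical values for illustration only, not taken from the data set.
shifted_ref = (53.52, -113.68)  # where a known landmark was plotted
true_ref = (45.52, -122.68)     # where that landmark actually is
offset = estimate_offset(shifted_ref, true_ref)

corrected = correct_points([(53.55, -113.60)], offset)
print(corrected)
```

A constant shift in degrees is only an approximation, so the corrected points should still be checked against recognizable features such as the river and the highways.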

corrections

The aim was to create an application where everyone can see which areas are more or less affected by crime, so I used R Shiny to build an app that displays the crimes. It is much more interactive than what I had achieved so far.

map1

The application lets you move anywhere on the map, and the control panel lets you visualize crimes between specific dates. You can also choose which type of crime you would like to visualize. A link to the app is given here.
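The app itself is built in R Shiny, but the filtering behind the control panel can be sketched in a few lines of Python. This is an illustration under assumptions: the `Incident` fields, the `filter_incidents` function and the sample records are all hypothetical, not the app's real data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Incident:
    category: str   # e.g. "Burglary"
    occurred: date  # date the incident was recorded
    lat: float
    lon: float


def filter_incidents(
    incidents: List[Incident],
    start: date,
    end: date,
    category: Optional[str] = None,
) -> List[Incident]:
    """Keep incidents inside [start, end], optionally restricted to one category."""
    return [
        i for i in incidents
        if start <= i.occurred <= end
        and (category is None or i.category == category)
    ]


# Hypothetical sample records for illustration only.
sample = [
    Incident("Burglary", date(2012, 4, 3), 45.52, -122.68),
    Incident("Accident", date(2012, 4, 9), 45.53, -122.66),
    Incident("Burglary", date(2012, 11, 20), 45.52, -122.68),
]

april_burglaries = filter_incidents(
    sample, date(2012, 4, 1), date(2012, 4, 30), category="Burglary"
)
print(len(april_burglaries))
```

The filtered list is what would then be handed to the map layer for plotting.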

By visualizing the crimes by date, I could see crimes repeating in the same location, and the patterns they follow, over the years.

map2

map3

I could not show all the pictures, but when you check the app, the dates move all the way to November 2012 on the same spot; this shows a pattern of similar crimes at the same location. There are thousands of clusters like this in Portland. After getting my first result, I moved on to analyze why the police fail to detect such patterns.

I found that Portland has a total police force of 1,000 active on-duty cops and 200 reserves, plus 300 civilian agents. According to the Census Bureau, the population of Portland is 609,892.

That makes 1 cop for roughly every 510 people, whereas in NYC there is 1 cop for every 58 people. Another reason I found for the failing system is money: how long prisons keep criminals locked up depends not only on the seriousness of the crime but also on the amount of money spent to keep them in detention.

On average, $69 is spent on a single prisoner every day, so the cost of keeping someone in prison is very high; this amounts to $59 billion annually for all the states combined.
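The staffing and cost figures quoted above can be checked with simple arithmetic. The sketch below uses only the numbers stated in the text; the variable names are my own, and the implied prisoner count is a back-of-the-envelope derivation rather than a figure from the source. The ratio of about 510 people per cop is approximately reproduced when reserve officers are counted alongside active ones.

```python
# Figures as quoted in the text.
population = 609_892
active_cops = 1_000
reserve_cops = 200

# People per officer, with and without reserves.
people_per_active = population / active_cops
people_per_active_and_reserve = population / (active_cops + reserve_cops)

# Detention cost per prisoner.
daily_cost = 69                       # dollars per prisoner per day
annual_cost = daily_cost * 365        # dollars per prisoner per year

# Prisoner count implied by the quoted national total (derived, not sourced).
national_total = 59_000_000_000       # dollars per year
implied_prisoners = national_total / annual_cost

print(round(people_per_active))              # 610
print(round(people_per_active_and_reserve))  # 508
print(annual_cost)                           # 25185
print(round(implied_prisoners / 1e6, 2))     # 2.34 (millions)
```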

The failure of the current analytical system has also led to a rise in crime over the years.

Crime Rate

The most frequent crime types are Accidents and Burglary, which are not that serious compared to crimes in other states, but if people do not change their mindset about this situation, the numbers will keep growing.

2012mar

This is crime in April 2012: Accidents numbered about 1,000 a month and Burglary cases about 1,800, while more serious crimes such as shootings and stabbings were as low as 10 to 20.

Crime in August 2016:

2016sep

The overall rise can now be seen clearly. What caught my attention was the rise in Burglary and Shooting: the number of shooting cases almost doubled to 50, and Burglary cases rose to 2,300 a month from 1,800 in April 2012.

There are a lot of seasonal patterns in the data: overall crime is highest in July every year and lowest in December and January. One reason is the weather; in cold weather, crime slows down. Yet the number of shooting cases peaks in October for some reason, and burglaries are highest in July and August. Overall, things in Portland get worse in the second half of every year.
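Seasonal patterns like these can be found by counting incidents per calendar month. The sketch below shows one simple way to do that; the function names and the sample dates are hypothetical and are not drawn from the Portland data.

```python
from collections import Counter
from datetime import date
from typing import Iterable


def monthly_counts(dates: Iterable[date]) -> Counter:
    """Count incidents per calendar month (1-12), pooled across years."""
    return Counter(d.month for d in dates)


def peak_month(dates: Iterable[date]) -> int:
    """Return the month number with the most incidents."""
    counts = monthly_counts(dates)
    return max(counts, key=counts.get)


# Hypothetical incident dates for illustration only.
incident_dates = [
    date(2012, 1, 5),
    date(2012, 7, 2), date(2012, 7, 15), date(2012, 7, 28),
    date(2013, 7, 4), date(2013, 12, 24),
    date(2013, 10, 9), date(2013, 10, 30),
]

counts = monthly_counts(incident_dates)
print(dict(counts))
print(peak_month(incident_dates))
```

Running the same count per crime type would separate, for example, the October shooting peak from the summer burglary peak.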

The analysis here led to some conclusions, yet there is no change in crime in Portland today. I decided to take a stand and found a way to contribute further to this issue: NIJ launched a Crime Forecasting Challenge. The goals of the challenge are given below.

  1. Encourage "nontraditional" crime forecasting researchers to compete against more "traditional" crime forecasting researchers.
  2. Compare available crime forecasting methods.
  3. Improve place-based crime forecasting.

My next step is to use the kNN algorithm to train on my data set and build a predictive model that can help the police department concentrate their resources in the regions most likely to be active crime regions on a given day. The model could remain efficient and accurate for a period of up to 3 months, until the patterns in crime begin to change again, at which point we would need a new data set to train a new model.
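To make the kNN idea concrete, here is a minimal sketch of a k-nearest-neighbors classifier in plain Python. It is an illustration under assumptions rather than the planned model: the choice of features (latitude and longitude only), the "active"/"quiet" labels and the training points are all hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, label)


def knn_predict(train: List[Sample], query: Sequence[float], k: int = 3) -> str:
    """Label the query by majority vote among its k nearest training points."""
    # Sort training samples by Euclidean distance to the query and keep k.
    neighbors = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (lat, lon) -> whether the area was active.
train = [
    ((45.52, -122.68), "active"),
    ((45.53, -122.67), "active"),
    ((45.52, -122.66), "active"),
    ((45.60, -122.80), "quiet"),
    ((45.61, -122.79), "quiet"),
    ((45.59, -122.81), "quiet"),
]

near_cluster = knn_predict(train, (45.525, -122.675), k=3)
far_from_cluster = knn_predict(train, (45.60, -122.79), k=3)
print(near_cluster, far_from_cluster)
```

A real model would likely add time features such as month or day of week, and would be retrained as the patterns drift, in line with the three-month horizon mentioned above.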


About Author

Arjun Singh Yadav

Arjun received his Bachelor's degree in Mechatronics Engineering from SRM University in India. Soon after, he competed in DARPA to build an autonomous vehicle to help the blind and disabled, where he used a Python-based algorithm to learn...