Data Analyzing Mass Shooting in US

Posted on Jul 20, 2016


Data show that the United States has more mass shootings than any other country in the world. From 1966 to 2012, nearly a third of the world's mass shootings took place in the U.S. More than 200 mass shooting killings have occurred since 2006. On June 12, 2016, the United States suffered the worst mass shooting in its modern history when 50 people were killed and 53 were injured after a gunman stormed a packed gay nightclub in Orlando. The frequent occurrence of mass shootings in the US is a disturbing trend that shows no sign of slowing down. Schools, movie theaters, shopping centers, and residential areas are among the places that have been targeted by shooters.


The purpose of this study is to gain insight into mass shootings by performing spatial and statistical analysis on mass shooting data. This data exploration project visualizes the occurrence of mass shootings in the United States from 1966 to 2016, with a closer look at recent mass shootings from 2014 and 2015. It also visualizes the injuries and fatalities that have occurred as a result of mass shootings.

This visualization project is divided into two parts. The first part analyzes mass shooting occurrences in the US from 1966 to 2016. The data used for this part was downloaded from the Stanford University Library Mass Shootings in America (MSA) database. The data set consists of 50 variables. The second part of this project focuses on recent mass shootings, from 2014 and 2015.

In the second part, a comparison study is performed on mass shooting incidents for the years 2014 and 2015. The data was collected from Gun Violence Archive, a nonprofit corporation formed in 2013 to provide free online public access to accurate information about gun-related violence in the United States. The states of Hawaii and Alaska were left out of this study for the sake of map aesthetics.


This visualization project will investigate the following questions regarding mass shootings in the US:

  • Which states had a higher number of mass shooting incidents since 1966 and how many people have been killed or injured so far?
  • What types of guns were more commonly used during these mass shooting incidents?
  • Which sex and race were most often involved in these shootings?
  • Which places have been targeted most often by the shooters, and what were the motives behind these mass shootings?
  • What new insights can be found by comparing mass shooting data for the years 2014 and 2015?

Part 1 - Data on Mass Shootings from 1966 to 2016

In this section, a study was conducted to compare the occurrence of mass shooting incidents by state and the number of injuries and fatalities resulting from these incidents. The data used for this study was collected from the Stanford University Library Mass Shootings in America (MSA) database. The Stanford University Library database defines a mass shooting as any incident with 3 or more shooting victims (not necessarily fatalities), not including the shooter, where the shooting is not identifiably gang, drug, or organized crime related.

1.1 Mass Shooting Incidents by State (1966 - 2016)

Mass shooting incidents were grouped by state and displayed in a choropleth map. Darker shades of color depict a higher number of mass shootings, and the red points mark the locations where the incidents occurred. The purpose of the choropleth map is to show the absolute count, depicting which states have had more mass shootings since 1966. The mass shooting data has not been normalized by population for this study.

Looking at the maps, it can be deduced that states like California, Texas, Florida and Georgia have comparatively higher numbers of mass shootings than other states. These higher numbers can be attributed, at least in part, to the larger populations of these states.

The choropleth maps below show the total number of deaths and injuries resulting from mass shooting incidents since 1966. The states that have higher numbers of mass shootings also have higher numbers of deaths and injuries. Interestingly, Washington State had a relatively high number of deaths even though it had a relatively low number of mass shootings.

For Colorado and New York, the number of mass shooting injuries was high, though the number of mass shooting incidents was relatively low. The numbers of deaths and injuries were not normalized by population for this study. Therefore, the higher numbers of deaths and injuries can be attributed in part to the fact that these states have larger populations.
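The per-state counts and casualty totals behind the maps in this section can be sketched as follows. This is a minimal illustration rather than the code used for the study: the record layout, the function name `summarize_by_state`, and the sample values are assumptions and do not come from the MSA data.

```python
from collections import defaultdict

# Hypothetical incident records: (state, deaths, injuries).
# The values are made up for illustration and are not taken from the MSA data.
incidents = [
    ("California", 4, 2),
    ("California", 1, 5),
    ("Texas", 3, 0),
    ("Washington", 6, 1),
]

def summarize_by_state(records):
    """Return absolute incident, death and injury counts per state."""
    summary = defaultdict(lambda: {"incidents": 0, "deaths": 0, "injuries": 0})
    for state, deaths, injuries in records:
        summary[state]["incidents"] += 1
        summary[state]["deaths"] += deaths
        summary[state]["injuries"] += injuries
    return dict(summary)

by_state = summarize_by_state(incidents)
```

Because these are absolute counts, a single incident with many victims can give a state a high death total even when its incident count is low, which is the pattern noted for Washington State.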


1.2 Mass Shooting based on different categories

Mass shooting incidents were also examined against several of the variables available in the data set. The variables taken into consideration were sex, race, type of gun used, the shooter's motive and location.

1.2.1 Mass Shooting based on type of gun used

A study was performed to investigate the types of guns most commonly used in mass shootings. The data was grouped by general gun type, divided into five classes. The bar-plot below shows how often each type of gun was used in mass shootings and the number of victims that resulted from these incidents. Looking at the bar-plot, it can be deduced that handguns were used more frequently than any other gun type. Moreover, handguns have been responsible for more deaths and injuries than any other type.


1.2.2 Mass Shooting classification by Sex

A study was performed to investigate which sex has been more responsible for carrying out these mass shootings. The data was grouped into 4 classes as shown in the bar-plot. Looking at the plot, it can be inferred that the shooters have predominantly been male. Males lead by a very large margin compared to the female and other classes.


1.2.3 Mass Shooting classification by Race

A study was done to investigate which race has been more responsible for carrying out these mass shootings. The data was grouped and only the top ten classes were selected. From the plot, it can be inferred that White Americans or European Americans have been responsible for most of the mass shootings, followed by Black Americans or African Americans. These two groups lead by a very wide margin compared to the other classes and have been responsible for more deaths and injuries.


1.2.4 Mass Shooting Classification by Type of Place

A study was done to investigate which types of locations have been targeted most often by shooters. The data was grouped by type of place and only the top ten classes were taken into consideration. The bar plot shows that residential homes and neighborhoods had more mass shootings than any other type of place. Schools were categorized into several subclasses. Had they been combined into a single group, the number of mass shootings at schools would have been closer to the number shown for residential areas.
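The effect of combining the school subclasses into a single group can be illustrated with a short sketch. The place-type labels, the `SCHOOL_SUBCLASSES` set and the `merge_schools` helper below are hypothetical; the actual category names in the MSA data may differ.

```python
from collections import Counter

# Hypothetical place-type labels, one per incident (illustrative only).
place_types = [
    "Residential home/Neighborhood",
    "Primary school",
    "Secondary school",
    "College/University",
    "Residential home/Neighborhood",
    "Retail/Wholesale/Services facility",
    "Residential home/Neighborhood",
]

# Assumed set of subclasses that should count as one "school" category.
SCHOOL_SUBCLASSES = {"Primary school", "Secondary school", "College/University"}

def merge_schools(label):
    """Map any school subclass to one combined label; leave other labels unchanged."""
    return "School (all levels)" if label in SCHOOL_SUBCLASSES else label

# Top ten classes before and after merging the school subclasses.
raw_top = Counter(place_types).most_common(10)
merged_top = Counter(merge_schools(p) for p in place_types).most_common(10)
```

In this toy sample each school subclass appears once on its own, but the combined school category ties with residential areas once the subclasses are merged.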



1.2.5 Mass Shooting Classification by Motive of the shooter

A study was done to investigate the reasons behind these horrific mass shooting incidents. In most cases the motive was unknown. The next most common category was mental illness of the shooter. There were also many cases in which domestic or social disputes led to a mass shooting.


Part 2 - Comparing Mass Shooting Incidents Data (2014 - 2015)

This section analyzes the mass shooting incidents that occurred in 2014 and 2015 in the United States. The data used for this study was downloaded from the Gun Violence Archive website. Gun Violence Archive defines a mass shooting as any incident where four or more people are wounded or killed.
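Part 1 and Part 2 rely on different definitions of a mass shooting, so their counts are not directly comparable. The sketch below encodes the two definitions as they are stated in this post. The function and parameter names are hypothetical, and the Gun Violence Archive rule is applied to whatever victim count is passed in, since its treatment of the shooter is not specified here.

```python
def meets_msa_definition(victims_excluding_shooter, gang_drug_or_organized_crime):
    """Stanford MSA rule as quoted in Part 1: 3 or more shooting victims,
    not counting the shooter, and not identifiably gang, drug, or
    organized crime related."""
    return victims_excluding_shooter >= 3 and not gang_drug_or_organized_crime

def meets_gva_definition(wounded_or_killed):
    """Gun Violence Archive rule as quoted in Part 2: four or more people
    wounded or killed."""
    return wounded_or_killed >= 4

# A hypothetical incident with exactly three victims and no gang link
# qualifies under the MSA rule but not under the GVA rule.
three_victims_msa = meets_msa_definition(3, False)
three_victims_gva = meets_gva_definition(3)
```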

2.1  Mass Shooting Deaths (2014 - 2015)

The locations of mass shootings are displayed on top of a choropleth map classified by the absolute number of gun deaths per state. Comparing the two maps, it can be seen that both years had significant numbers of mass shootings. According to the data, there were 277 separate mass shooting incidents in 2014 compared to 332 incidents in 2015.

There was a rise in mass shooting incidents and number of deaths in many states in 2015. The maps have not been normalized according to the population and represent the absolute number of deaths. In both years, states like California, Texas, Florida and Georgia had high numbers of deaths which can be attributed to their high population. While the number of deaths decreased for Florida in 2015, the numbers increased for Ohio and South Carolina.


2.2  Normalizing Mass Shooting Deaths with population data (2014-2015)

A study was conducted to investigate mass shooting deaths in the US after normalizing them by each state's population for that year. The population data for 2014 and 2015 for each state was obtained from the U.S. Census Bureau website and merged into the mass shooting data sets. The mass shooting deaths for each state were normalized by population and a choropleth map was produced. Although states like California and Florida had very high absolute numbers of mass shootings in both 2014 and 2015, their number of deaths per 100 million is not as high as in Georgia, Louisiana, South Carolina and other states with darker shades of blue.
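The normalization step amounts to joining deaths and population by state and dividing. A minimal sketch follows, using the per 100 million scale mentioned above. The function name `deaths_per_population` and the state names and figures are hypothetical and are not taken from the Gun Violence Archive or Census data.

```python
def deaths_per_population(deaths_by_state, population_by_state, per=100_000_000):
    """Return deaths per `per` residents for every state present in both inputs."""
    rates = {}
    for state, deaths in deaths_by_state.items():
        population = population_by_state.get(state)
        if not population:
            continue  # skip states with missing or zero population data
        rates[state] = deaths * per / population
    return rates

# Hypothetical inputs: a large state with more deaths and a small state with fewer.
deaths = {"Large State": 20, "Small State": 10, "No Data State": 3}
population = {"Large State": 40_000_000, "Small State": 5_000_000}

rates = deaths_per_population(deaths, population)
```

With these made-up numbers the large state has twice the deaths of the small one but only a quarter of its rate, which is the kind of reversal the normalized map brings out.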


2.3  Heat map of Mass shooting (2014 - 2015)

Heat maps were created to display the pattern of mass shootings for the years 2014 and 2015. The more mass shooting incident points accumulate in an area, the redder that area becomes. The incident point data is used to create an interpolated surface showing the density of occurrence. Each raster cell is assigned a density value and the entire layer is visualized using a color gradient.

The maps below show that red areas (spots) have formed mostly near big cities. The darkest red spot appears to be near Chicago (IL), where gun violence rates are very high. Similarly, there are red spots near Los Angeles (CA), Atlanta (GA) and places on the East Coast. The pattern is very similar in both heat maps, and the same areas are found to have a higher number of mass shooting incidents.
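The idea of turning incident points into a raster of density values can be sketched with a simple Gaussian kernel. This is an illustration only: the post does not say which tool or kernel produced the heat maps, and the function name, planar coordinates, cell size and bandwidth below are all assumptions. Real longitude and latitude values would normally be projected first.

```python
import math

def density_surface(points, x_range, y_range, cell_size, bandwidth):
    """Return a grid (list of rows) of unnormalized Gaussian kernel density
    values, evaluated at the centre of each raster cell."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    n_cols = math.ceil((x_max - x_min) / cell_size)
    n_rows = math.ceil((y_max - y_min) / cell_size)
    grid = []
    for row in range(n_rows):
        cy = y_min + (row + 0.5) * cell_size
        grid_row = []
        for col in range(n_cols):
            cx = x_min + (col + 0.5) * cell_size
            # Every point contributes; nearby points contribute the most.
            value = sum(
                math.exp(-((px - cx) ** 2 + (py - cy) ** 2) / (2 * bandwidth ** 2))
                for px, py in points
            )
            grid_row.append(value)
        grid.append(grid_row)
    return grid

# Hypothetical planar points: three clustered incidents and one isolated incident.
points = [(1.5, 1.5), (1.6, 1.4), (1.4, 1.6), (3.5, 3.5)]
grid = density_surface(points, (0, 4), (0, 4), cell_size=1.0, bandwidth=0.5)

# Locate the cell with the highest density as (row, col).
hottest = max(
    ((r, c) for r in range(len(grid)) for c in range(len(grid[0]))),
    key=lambda rc: grid[rc[0]][rc[1]],
)
```

Mapping each cell value onto a color gradient then gives the heat map: the cell covering the cluster gets the strongest color, while the cell with the isolated point stays much paler.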


2.4  Comparing Deaths and Injuries by state (2014 - 2015)

In this section, the numbers of deaths and injuries resulting from mass shootings are compared for each state. From the bar-plot, it can be deduced that deaths and injuries increased in 2015 compared to 2014. There was a slight increase in the number of casualties in states like California, Virginia, Illinois, and Washington State. Some states, however, had fewer mass shooting deaths and injuries than in the previous year. States like Texas, New York, Louisiana and Pennsylvania had fewer deaths and injuries in 2015 than in 2014.

Overall, deaths and injuries both increased from 2014 to 2015. There were a total of 363 deaths and 1317 injuries in 2015 compared to 268 deaths and 1104 injuries in 2014, so deaths rose by roughly a third and injuries by about a fifth.
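The size of the year-over-year change can be checked with a short calculation on the totals reported in this post. The helper name `percent_change` is hypothetical.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100

# Totals reported above for 2014 and 2015.
deaths_change = round(percent_change(268, 363), 1)      # 35.4
injuries_change = round(percent_change(1104, 1317), 1)  # 19.3
incidents_change = round(percent_change(277, 332), 1)   # 19.9
```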



To sum up, states like California, Florida, Georgia and Texas have very high numbers of mass shooting incidents, with correspondingly high numbers of fatalities and injuries. This can be attributed in part to the fact that these states have larger populations. However, when the 2014 and 2015 data are normalized by population, states like Georgia, Louisiana and South Carolina had more deaths relative to their population than California and Texas. The number of mass shootings rose from 277 in 2014 to 332 in 2015. States like California, Georgia, Illinois and Florida had high numbers of deaths and injuries in both years.

In the mass shooting incidents since 1966, handguns have been used more frequently than any other gun type. The shooters have been predominantly male, with most of them being White American or European American, followed by Black American or African American. Women very rarely engage in mass shootings. In most cases the motive was unknown. The next most common category was mental illness of the shooter. The majority of the mass shootings took place in residential neighborhoods, followed by schools and campuses.

Recent Years

In recent years, mass shootings in the United States appear to be on the rise, with President Obama describing them as having become routine for Americans. He has said time and again that Americans have become numb to these incidents and has reiterated that gun laws and background checks need to be reformed. The pattern of mass shootings in the US has no parallel in any other country in the world. This may be because it is so easy for people to get guns. Australia collected and destroyed hundreds of thousands of firearms through a mandatory buyback and passed strict gun laws after a mass shooting on April 28, 1996, which took the lives of 35 people.

The country has not had any such incident since then, which is a remarkable achievement. Isn't it time for America to take such a drastic step? How many more innocent lives will have to be sacrificed before we finally realize that guns make it easier for these maniacs to inflict harm on other people? Only time will tell...


About Author

Samriddhi Shakya

Samriddhi comes from a Remote Sensing and Geographic Information Systems (GIS) background. He has a Master’s degree in Geography from Auburn University and a Bachelor of Engineering degree in Geomatics from Kathmandu University. During his Masters at Auburn University,...