Data Study on the History of Gun Violence
The skills the author demonstrated here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
Gun violence is a polarizing subject in our country. It is nearly impossible to move through social circles without being expected to offer an opinion at some point. Having been in many conversations about various shootings over the years, I can usually tell where the conversation is headed: a seemingly endless back-and-forth about the 2nd Amendment, gun regulations, mental health, race, and, of course, getting rid of guns altogether.
I have often wondered what effect gun regulations really have on gun violence in our country, and I've been curious about the demographics of the shooters who are involved. Do they tend to be from poorer areas of the United States? Are the majority of them of a particular race? Is gun violence more prevalent in states whose majority voted for Donald Trump? Are most of them actually mentally ill, as the media would have you believe?
With the work I put into this application, I was able to get a better idea of how to find answers to some of these questions. You can see the full visualization here.
Origin of the Data Set
There are two main data sets used in this presentation: General Gun Violence and Mass Shootings.
The general gun violence data set consists of 2016 values aggregated by state: number of incidents per population, number of victims, median household income, number of regulations, and the vote percentages for Hillary Clinton and Donald Trump. The incident figures were constructed from a large data set found on Kaggle, and the rest was taken from census data and Wikipedia. The mass shootings data comes mainly from Kaggle and goes back to 1966, but I focus only on the five consecutive years from 2013 to 2017. This decision was made for simplicity.
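As a rough illustration of this kind of state-level aggregation, here is a minimal Python sketch. The app itself is a Shiny app, so this is not the code behind it. The state names, incident records, populations, and the per-100,000 scaling are all made-up assumptions chosen only to show the grouping step.

```python
from collections import defaultdict

# Hypothetical incident records: (state, number_of_victims).
# These are illustrative values, not rows from the Kaggle data set.
incidents = [
    ("Alpha", 2),
    ("Alpha", 0),
    ("Beta", 1),
    ("Alpha", 3),
    ("Beta", 4),
]

# Hypothetical state populations (the post takes these from census data).
population = {"Alpha": 200_000, "Beta": 100_000}

# Assumed scaling: incidents per 100,000 residents.
PER = 100_000


def aggregate_by_state(records, population):
    """Group incident records by state and compute a per-capita rate."""
    counts = defaultdict(int)
    victims = defaultdict(int)
    for state, n_victims in records:
        counts[state] += 1
        victims[state] += n_victims

    summary = {}
    for state, n_incidents in counts.items():
        summary[state] = {
            "incidents": n_incidents,
            "victims": victims[state],
            # Multiply before dividing to keep the arithmetic simple.
            "incidents_per_100k": n_incidents * PER / population[state],
        }
    return summary


summary = aggregate_by_state(incidents, population)
for state, row in sorted(summary.items()):
    print(state, row)
```

Normalizing by population matters here because raw incident counts would mostly reflect how many people live in a state rather than how often incidents occur.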
Results
The first tab illustrates the number of gun regulations and the number of incidents by state. The user can switch between the two views and see that states with a high number of regulations tend to show a lower number of incidents per capita.
Correlations
The correlation tab allows the user to visualize the correlation between different variables in the data set. The options were originally meant to be number of gun regulations, number of violent incidents, and median household income by state. However, I decided to add the vote percentage per state for both Hillary Clinton and Donald Trump for 2016 just to see if anything interesting showed up.
Generally, I think the plots show a negative correlation between the number of gun regulations and the number of incidents, and you can certainly see a negative correlation between median household income and the vote percentage for Donald Trump.
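For readers who want a number rather than a visual impression, a correlation coefficient is one way to summarize such a relationship. The sketch below computes a Pearson coefficient in plain Python. Using Pearson is an assumption on my part (the post does not say which measure, if any, the app computes), and the two lists are invented values, not figures from the data set.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-state values, for illustration only.
regulations = [10, 20, 30, 40, 50]
incidents_per_100k = [9.0, 7.5, 8.0, 4.0, 3.5]

r = pearson(regulations, incidents_per_100k)
print(f"r = {r:.3f}")  # negative for these made-up values
```

A coefficient like this only describes how two columns move together across states. It says nothing about whether regulations cause the difference in incidents.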
Mass Shooting Demographics
While looking through different data sets, I found one that focused on mass shootings going back to 1966. The United States' Congressional Research Service acknowledges that the definition of a mass shooting will vary.
In this data set, the minimum number of victims to qualify is three. Variables in this set include race, gender, and 'mental health issues.' I saw an opportunity to let the user compare five different years and see how the demographics changed. I chose to focus on a recent five-year range for simplicity, and also because there were gaps in the dates. As expected, most mass shootings in that period were perpetrated by one or two males. Also, as one might expect, the two races most often recorded for the shooters were white and black.
Mental health is a widely acknowledged factor in the discussions involving mass shootings. The mental health charts are constructed from a variable with three possibilities: yes, no, and unclear. In order to analyze the results from the mental health charts, however, one would need to know how these labels were assigned. The data set is lacking that information.
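To make the three-way labeling concrete, here is a small Python sketch of how per-year shares of a categorical variable could be tallied. The records and field names are hypothetical, and this is not the Shiny code behind the charts. It keeps "unclear" as its own category instead of folding it into "yes" or "no", since the data set does not say how the labels were assigned.

```python
from collections import Counter, defaultdict

# Hypothetical records; the field names and values are illustrative only.
records = [
    {"year": 2013, "mental_health": "Yes"},
    {"year": 2013, "mental_health": "No"},
    {"year": 2014, "mental_health": "yes"},
    {"year": 2014, "mental_health": "Unclear"},
    {"year": 2014, "mental_health": "unclear "},
    {"year": 2014, "mental_health": "no"},
]


def shares_by_year(records, field):
    """Return, per year, the fraction of records carrying each label."""
    counts = defaultdict(Counter)
    for rec in records:
        # Normalize case and whitespace so "Yes" and "yes " count together.
        label = rec[field].strip().lower()
        counts[rec["year"]][label] += 1

    shares = {}
    for year, counter in counts.items():
        total = sum(counter.values())
        shares[year] = {label: n / total for label, n in counter.items()}
    return shares


shares = shares_by_year(records, "mental_health")
for year in sorted(shares):
    print(year, shares[year])
```

Reporting the "unclear" share alongside the other two makes it visible how much of each year's chart rests on incidents where the label could not be determined.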
Data Tables
The data page allows the user to peruse the data sets used in the construction of the project. The mass shooting data set is particularly interesting, because it includes a summary for each mass shooting. Each table is searchable in more than one way, and you are encouraged to check them out.
What's Next?
Clearly, there is a lot of work that can still be done with this data to further investigate some of the original questions that led me to research this topic. It would be interesting to group the larger, more general gun violence data set by city, county, age group, or even congressional voting district. Doing this would have required more time spent cleaning multiple data sets so that their statistics could be matched.
Thank you for taking the time to read my post. I hope you found the Shiny app to be at least an inspiration for further investigation. If you have any questions or suggestions, please don't hesitate to reach out at:
https://github.com/CpInCpp/ShinyProject