United States Executions Web App

Gordon Fleetwood
Posted on Nov 3, 2015

Contributed by Gordon. Gordon took the NYC Data Science Academy 12-week full-time Data Science Bootcamp program from Sept 23 to Dec 18, 2015. This post is based on his second class project (due in the 4th week of the program).

The Problem: Interesting Data, Uninteresting Use

The Death Penalty Information Center provides news and resources about executions, both past and upcoming, in the United States. One of its best features is a database containing information on every person executed since 1977. The database has an interface that allows filtering by several categories, and a download option for obtaining a CSV of the filtered data. However, descriptive statistics and comparisons are either absent or hidden in a PDF.

My goal was to build an app that makes fuller use of the data, i.e., one that displays both the raw data and summary statistics in the same interactive environment. R Shiny is a great platform for building such an environment, which could grow into a web app for the general public.

Methodology

Unexpectedly, the data had a lot of missing values. Given the nature of the data, however, imputation was off the table, as was discarding any records. Cleaning was therefore limited to separating dates into month, day, and year, and splitting the column describing each execution's victims by sex. Ultimately, I didn't use the victim data, but it could be integrated in the future.
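As a rough illustration of the date-splitting step, here is a minimal base R sketch. The column name `Date`, the `%m/%d/%Y` format, and the sample rows are all assumptions made for the example, not taken from the actual database export.

```r
# Made-up rows for illustration only; the real column name and date
# format in the downloaded CSV may differ.
executions <- data.frame(
  Name = c("Person A", "Person B", "Person C"),
  Date = c("01/17/1977", "11/03/2015", NA),
  stringsAsFactors = FALSE
)

# Parse the text dates; entries that are missing stay NA rather than
# being imputed or dropped.
parsed <- as.Date(executions$Date, format = "%m/%d/%Y")

# Pull each component out into its own integer column.
executions$Month <- as.integer(format(parsed, "%m"))
executions$Day   <- as.integer(format(parsed, "%d"))
executions$Year  <- as.integer(format(parsed, "%Y"))
```

Rows with a missing date simply carry `NA` in the three new columns, which fits the decision above not to impute or discard anything.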

The App

The app consists of three main pages, the last of which is an "About" page.

The first page displays a map of the number of executions per state. I used dplyr to wrangle the data and the Plotly API to generate an interactive choropleth map.

And this is the code which produced this plot:

output$map = renderPlotly({
  # White borders between states
  l = list(color = toRGB("white"), width = 2)
  # Restrict the map to the U.S. with an Albers projection
  g = list(scope = 'usa', projection = list(type = 'albers usa'))

  plot_ly(state.executions, z = count, locations = State, type = 'choropleth',
          locationmode = 'USA-states', color = count, size = 10, colors = 'Purples',
          marker = list(line = l), colorbar = list(title = "Number of Executions"),
          filename = "r-docs/usa-choropleth") %>%
    layout(title = 'Executions In The U.S.A Since 1977<br>(Hover for Numbers By State)',
           geo = g)
})

This is adapted directly from the examples on Plotly's pages about using the service with R.
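The map code relies on a `state.executions` table with `State` and `count` columns. The wrangling that produces it is not shown in this post, so the following is only a sketch of how it might be built with dplyr; the sample rows are made up, and it assumes `State` holds two-letter codes, which is what `locationmode = 'USA-states'` expects.

```r
library(dplyr)

# Made-up rows for illustration; the real data has one row per execution.
executions <- data.frame(
  State = c("TX", "TX", "VA", "OK", "TX"),
  stringsAsFactors = FALSE
)

# Count executions per state, largest first.
state.executions <- executions %>%
  group_by(State) %>%
  summarise(count = n()) %>%
  arrange(desc(count))
```

Grouping by `Year` instead of `State` would presumably give a table shaped like the `year.executions` data used for the timeline.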

The second page allows the user to do some exploration of the data. A menu on the left allows one to choose which state's data to display, the year, the method of execution, and the race of the person executed. The filtered results are shown in a table in the first tab to the right of the menu.

What was most interesting here was that Shiny's data table is generated by JavaScript behind the scenes (via the DataTables library). That allowed me to customize it to my heart's content by removing pagination and filtering and adding a scroll bar, among other things.

output$table <- renderDataTable({
  # Keep only the columns shown in the table
  data = executions[, c(2, 3, 5, 8, 18, 6)]
  # Apply each filter unless the user left it on "All"
  if (input$st != "All") {
    data = data[data$State == input$st, ]
  }
  if (input$yr != "All") {
    data = data[data$Year == input$yr, ]
  }
  if (input$md != "All") {
    data = data[data$Method == input$md, ]
  }
  if (input$rc != "All") {
    data = data[data$Race == input$rc, ]
  }
  data
},
# DataTables options: no search box, no paging, fixed-height scroll area
options = list(searching = FALSE, pageLength = 10, lengthChange = FALSE, ordering = FALSE,
               scrollY = "310px", scrollCollapse = TRUE, paging = FALSE, info = FALSE)
)
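The four filters above all follow the same "skip if All, otherwise subset" pattern, which can be tried outside Shiny. The sketch below is not part of the app: `filter_choice` and `apply_filters` are hypothetical helpers, and the sample rows are made up. It uses `%in%` rather than `==` so that a missing value in a filtered column is treated as a non-match instead of producing an all-`NA` row, which may matter given how many missing values the data has.

```r
# Hypothetical helper: return `data` unchanged for "All", otherwise keep
# only the rows whose `column` matches `choice`. NA never matches.
filter_choice <- function(data, column, choice) {
  if (choice == "All") return(data)
  data[data[[column]] %in% choice, , drop = FALSE]
}

# Hypothetical helper: apply a named list of choices one column at a time.
apply_filters <- function(data, choices) {
  for (column in names(choices)) {
    data <- filter_choice(data, column, choices[[column]])
  }
  data
}

# Made-up rows for illustration only.
executions <- data.frame(
  State  = c("TX", "VA", "TX", NA),
  Year   = c(1999, 1999, 2005, 2005),
  Method = c("Lethal Injection", "Electrocution",
             "Lethal Injection", "Lethal Injection"),
  stringsAsFactors = FALSE
)

# Equivalent to choosing State = TX, Year = All, Method = Lethal Injection.
result <- apply_filters(
  executions,
  list(State = "TX", Year = "All", Method = "Lethal Injection")
)
```

Inside the app, the choices would come from `input$st`, `input$yr`, and so on, rather than from a hard-coded list.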

The second tab produces aggregated plots of statistics such as race, method of execution, age, and sex. This data can also be filtered by state.

Below lies the central component of this tab.

df2 = reactive({
  if (input$st == 'All') executions
  else filter(executions, State == input$st)
})

Using a reactive expression lets the app update and regenerate the graphs and charts whenever the user changes a selection.

The last tab is a simple timeline showing the number of executions per year from 1977 to the present day.

I generated this chart using ggplot2.

output$time.series = renderPlot({
  #plot_ly(year.executions, x = Year, y = count, name = "Executions Year On Year", filename="r-docs/basic-time-series")
  ggplot(data = year.executions, aes(x = Year, y = count)) +
    geom_line(colour = "darkgreen") +
    ylim(0, 50) +
    ggtitle('Executions Year On Year') +
    ylab('Number Executed') +
    theme_bw() +
    theme(plot.title = element_text(size = 20, face = "bold", vjust = 2),
          axis.title.x = element_text(face = "bold", vjust = -1),
          axis.title.y = element_text(face = "bold"))
})

Demo

You can explore the app here and see the code here.

Further Work

Given this base, there is room to make this app into a complete wrapper for the executions database. This would involve finding a way to fully integrate data about victims into the interface. There is also an opportunity to add even more data since the Death Penalty Information Center links to another data set which has executions since 1602. Extensive cleaning of this data would be necessary before integration, however. Finally, the ultimate goal would be full automation, with a script checking for updates to the database and updating the app if new data has been added.

About Author

Gordon Fleetwood

Gordon has a B.A. in Pure Mathematics and an M.A. in Applied Mathematics from CUNY Queens College. He briefly worked for an early-stage startup where he was involved in building an algorithm to analyze financial data. However,...