Data: Improper Handling and Chemical Spills
The skills I demoed here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
Do you know what's in your water?
Improper handling and storage of petroleum, hazardous substances/chemicals, or liquefied natural gas (LNG) can result in spills that threaten the environment or pose health and safety risks to nearby persons. Across New York State, there have been instances of petroleum or chemical spills that have contaminated groundwater, including some public water supplies. Places like Flint, Michigan show how much a community suffers when its government fails to protect the water supply.
To help struggling governments and to inform the citizenry, I created a Shiny app that visualizes all the spills that have occurred in New York State since the state began keeping records. The app aims to provide an intuitive interface for exploring the data and to offer rigorous analytical summaries of spills, Department of Environmental Conservation (DEC) responsiveness, and the institutions or companies tied to the most dumping of chemicals into the commons. All of it is organized by region, county, and type of material.
The interactive map encourages users to explore and investigate the chemical spills that have occurred in New York State. Each spill is drawn as a circle that grows with the logarithm of the spill size. Spills can be selected by the material involved, and linked facilities by the chemical stored. Spills can also be filtered by size and year of occurrence. Hovering over a facility or a spill brings up useful summary information. In addition, facilities can be grouped and toggled based on their status: closed, inactive, or active.
For a quick walkthrough, we're going to head over to the NYC area and select Hazardous Materials spills of all sizes and all chemical types from 2000 to 2020.
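The log scaling of the spill circles can be sketched in a few lines. This is only an illustration in Python, not the app's actual Shiny code: the function name spill_radius, the base radius, and the scale factor are assumptions, as is the choice of log10(1 + quantity) to keep zero-size spills drawable.

```python
import math


def spill_radius(quantity, base_radius=2.0, scale=3.0):
    """Return a marker radius that grows with the logarithm of spill size.

    base_radius and scale are hypothetical display constants, not values
    taken from the app.
    """
    # Clamp negatives to zero and use log10(1 + q) so a spill of size 0
    # still gets the base radius instead of hitting log(0).
    safe_quantity = max(float(quantity), 0.0)
    return base_radius + scale * math.log10(1.0 + safe_quantity)


if __name__ == "__main__":
    # Each 10x increase in spill size adds roughly the same amount of radius.
    for gallons in (0, 9, 99, 999):
        print(gallons, round(spill_radius(gallons), 2))
```

The point of the log scale is that a spill a thousand times larger gets a circle only a few times larger, so very large spills do not cover the rest of the map.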
The analysis tab summarizes total chemical spills by volume per DEC region, county, and material family. Of particular significance for policymakers, the "Spill Sources" tab breaks down the total spills for the selected set of counties by source. Areas of New York State with a lot of industrial activity tend to have a higher percentage of their spills coming from Industrial and Storage sources than rural regions, where most spills come from Vehicles or Municipal sources.
The density, distribution, chemical type, and source of spills are given on the Analysis tab of the app. Below is a graphic generated from a particular user selection:
Each chemical type is presented by its bulk spill volume over the user-selected time period and location. In this case it depicts our selection from the interactive UI: all Hazardous Chemical spills of all sizes in NYC from 2000 to 2020. We notice low volumes of Ferric Chloride spills throughout the time period, while at higher volumes Sodium Hypochlorite begins to dominate.
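The per-source breakdown described above comes down to a grouped sum followed by a percentage. Here is a minimal sketch in Python, assuming each record carries a county, a source category, and a spilled quantity. The sample records, the field layout, and the function name source_shares are hypothetical, and the app may well count incidents rather than sum volumes.

```python
from collections import defaultdict

# Hypothetical records: (county, source, quantity spilled).
SAMPLE_SPILLS = [
    ("Kings", "Industrial", 500.0),
    ("Kings", "Storage", 300.0),
    ("Kings", "Vehicle", 200.0),
    ("Delaware", "Vehicle", 60.0),
    ("Delaware", "Municipal", 40.0),
]


def source_shares(records, counties):
    """Return each source's percentage of total spill volume in `counties`."""
    totals = defaultdict(float)
    for county, source, quantity in records:
        if county in counties:
            totals[source] += quantity
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # nothing selected, so there is nothing to break down
    return {source: 100.0 * qty / grand_total for source, qty in totals.items()}


if __name__ == "__main__":
    print(source_shares(SAMPLE_SPILLS, {"Kings"}))
    print(source_shares(SAMPLE_SPILLS, {"Delaware"}))
```

In this made-up sample the industrial county is dominated by Industrial and Storage sources and the rural one by Vehicle and Municipal sources, which is the kind of contrast the tab is meant to surface.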
This tab shows the "Case Lag" per DEC region and county. Case Lag is the amount of time from when a spill is reported to the DEC regional office to when the case is closed. Closing a case entails processing the spill, organizing a cleanup, and administering the reopening of the contaminated site. The chart below, for example, shows the mean case lag (in red) for the NYC region over a period of 30 years. The number of individual spill incidents is shown in purple. We notice a sharp peak around September 11, 2001, as the city bureaucracy was dealing with the physical and chemical fallout of that event.
The statistical insight available from this tool is limited. Deeper analysis is an avenue for future work, but the dataset itself would need to be enriched with company financial data, NYS administrative data, and similar sources to produce truly actionable insights. What this tool CAN do is serve the public understanding of what the NYS Department of Environmental Conservation does and the caseload it deals with.
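Case Lag as defined above is a date difference averaged over a group of cases. The sketch below shows one way to compute it in Python. It is not the app's code: the sample dates, the function name mean_case_lag_by_year, the choice to group by the year a spill was reported, and the decision to skip still-open cases are all assumptions.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical cases: (date reported, date closed); None means still open.
SAMPLE_CASES = [
    (date(2001, 1, 1), date(2001, 1, 31)),
    (date(2001, 9, 12), date(2001, 12, 11)),
    (date(2002, 3, 1), date(2002, 3, 11)),
    (date(2002, 6, 1), None),
]


def mean_case_lag_by_year(cases):
    """Map report year -> (mean lag in days, number of closed cases)."""
    lags = defaultdict(list)
    for reported, closed in cases:
        if closed is None:
            continue  # an open case has no lag yet
        lags[reported.year].append((closed - reported).days)
    return {year: (mean(days), len(days)) for year, days in sorted(lags.items())}


if __name__ == "__main__":
    for year, (lag, count) in mean_case_lag_by_year(SAMPLE_CASES).items():
        print(year, lag, count)
```

Note that the count returned here covers closed cases only, whereas the purple series in the chart counts individual spill incidents, so the two would differ whenever cases remain open.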