Watt's the Point: Data Tracking Building Energy in the US
The skills the author demonstrated here can be learned by taking the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
We don't know enough about how buildings use energy. Granted, we know some things: on a large scale, the US Department of Energy's Energy Information Administration releases estimates of energy consumption at both the federal and state level, while at a smaller scale, energy modeling can estimate the energy consumption of any individual building. And yet, given that buildings consume almost 75% of all electricity generated in the US and account for nearly 40% of all greenhouse gas (GHG) emissions in the country, we know far too little about how buildings consume energy.
About a decade ago, progressive cities and states in the US began passing legislation that requires the owners of large buildings to measure and report building energy consumption on an annual basis. In most cases, cities then compile and release these data on open data portals. My project analyzed some of this publicly available data, exploring trends in building energy consumption and building a proof-of-concept to see if building energy data from across the country could be easily compiled into a central, easy-to-use repository.
Enter the Building Energy Dashboard:
Ideally, buildings in the United States would be very energy efficient. Wasted energy is, in essence, wasted money. Efficient buildings don't waste energy. Having energy-efficient buildings would save money for businesses and households alike, and reduce GHG emissions. And yet, buildings in the United States are not very efficient! Why are our buildings inefficient? It's complicated! Buildings range in size, climate, use, and occupancy. It's really, really difficult to create a common standard of energy consumption for buildings to meet.
About 20 cities currently have regulations requiring building energy data transparency on the books, and roughly half of these cities have reached a point where they are releasing usable open data sets. Some cities have already used their energy transparency data to create awesome visualization tools (shoutout to Philadelphia's Mayor's Office of Sustainability for their excellent Building Benchmarking tool). However, most cities do not take the extra step to visualize their results, and instead release CSV files on their open data portals.
For my project, I attempted a proof-of-concept to see if data from multiple cities could be brought together to create a centralized tool for energy transparency data. My project involved downloading, cleaning, and combining data from San Francisco, New York City, and Washington, DC.
Does Location Influence Building Efficiency? Is Location a Proxy for Other Building Characteristics?
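To make the "cleaning and combining" step concrete, here is a minimal Python sketch of one way to map differently named city columns onto a shared schema. The post does not show the app's actual code, so everything here is an assumption for illustration: the column headers in `COLUMN_MAPS`, the `tidy_rows` helper, the year, and the sample rows are all hypothetical and do not come from the real city files.

```python
import csv
import io

# Hypothetical headers: each city's real export uses its own column names.
COLUMN_MAPS = {
    "San Francisco": {"zip": "Postal Code", "sqft": "Floor Area", "kbtu": "Site Energy (kBtu)"},
    "New York City": {"zip": "Zip", "sqft": "Gross Floor Area (ft2)", "kbtu": "Site Energy Use (kBtu)"},
}

def tidy_rows(city, csv_text, year):
    """Yield one dict per building, using the same keys for every city."""
    mapping = COLUMN_MAPS[city]
    for raw in csv.DictReader(io.StringIO(csv_text)):
        try:
            row = {
                "city": city,
                "year": year,
                "zip": raw[mapping["zip"]].strip(),
                "sqft": float(raw[mapping["sqft"]]),
                "kbtu": float(raw[mapping["kbtu"]]),
            }
        except (KeyError, TypeError, ValueError, AttributeError):
            continue  # skip rows with missing or non-numeric fields
        yield row

# Made-up sample data, not real benchmarking records.
sf_csv = "Postal Code,Floor Area,Site Energy (kBtu)\n94103,50000,4000000\n94105,,1200000\n"
nyc_csv = "Zip,Gross Floor Area (ft2),Site Energy Use (kBtu)\n10001,120000,9600000\n"

combined = list(tidy_rows("San Francisco", sf_csv, 2015))
combined += list(tidy_rows("New York City", nyc_csv, 2015))
```

The second San Francisco row has no floor area, so it is dropped rather than guessed at. A real pipeline might log such rows instead of silently skipping them.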
After combining energy data from each city into a tidy format, I used leaflet to plot building energy data onto a map of each city. Building location was not always available, so I aggregated buildings by ZIP Code. Not having building location data did limit the analysis somewhat, but it still allowed me to see if different neighborhoods had noticeable differences in energy consumption. Each ZIP Code is represented on the map by a circle whose color represents how much energy an average building in the area used, and whose size represents the number of buildings that reported in that ZIP Code during a given year.
Location itself probably does not determine whether a building is energy efficient. More likely, buildings in a given neighborhood tend to cluster in their age, size, and use, all of which tend to influence building energy consumption. So, while the mapped portion of my tool was interesting to look through, it was most useful as a starting point to identify possible trends, which I would explore in another area of my app.
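The ZIP Code aggregation described above can be sketched in a few lines of Python. This is an illustrative sketch, not the app's code: `summarize_by_zip` is a hypothetical helper, and the sample values are made up. For each ZIP Code it returns the two quantities the map circles encode, the number of reporting buildings (circle size) and the average energy use (circle color).

```python
from collections import defaultdict
from statistics import mean

def summarize_by_zip(buildings):
    """Group (zip_code, energy_kbtu) pairs; return {zip: (building_count, mean_energy)}."""
    grouped = defaultdict(list)
    for zip_code, energy_kbtu in buildings:
        grouped[zip_code].append(energy_kbtu)
    return {z: (len(values), mean(values)) for z, values in grouped.items()}

# Made-up sample records for a single reporting year.
sample = [("10001", 900.0), ("10001", 1100.0), ("20001", 500.0)]
summary = summarize_by_zip(sample)
```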
Can we identify building efficiency trends on a city-wide level?
Each year, an increasingly large number of cities are releasing data on building energy consumption. This growth of open data is generally positive. However, it also creates a need for user-friendly tools to help people explore, understand, and identify trends within large data sets.
Using the second part of my app, the simply named Data Explorer, users can visualize data themselves using custom parameters.
One issue I encountered when plotting these data (about 100,000 observations in total, across the 5 years of data collected from 3 cities) was that the data appeared "clumped." That is, while the data spanned a large set of values, the majority of observations fell within a much smaller range. When thinking about buildings, this is intuitive: while there are some very large buildings (airports and skyscrapers) and big energy users (manufacturing facilities and data centers), most buildings fall into a small range of sizes and uses. To help identify trends among this clumped data, I added the ability to overlay a trendline for each city displayed in the Data Explorer:
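The post does not say which fitting method the app uses for its trendlines. As one simple possibility, here is a Python sketch that assumes a straight-line ordinary least-squares fit computed separately for each city. The `fit_trendline` helper, the placeholder city names, and the sample points are hypothetical.

```python
def fit_trendline(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up (x, y) points per city, e.g. floor area vs. energy use in arbitrary units.
points_by_city = {
    "City A": [(1, 3), (2, 5), (3, 7), (4, 9)],
    "City B": [(1, 2), (2, 2), (3, 2), (4, 2)],
}

# One fitted line per city, so each city can get its own overlay.
trendlines = {
    city: fit_trendline(*zip(*points)) for city, points in points_by_city.items()
}
```

When most points are clumped and a few are extreme, a straight-line fit can be pulled toward the outliers. Fitting on log-scaled axes or using a smoother are common alternatives.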
What can we tell from this? I have my ideas, but more importantly, the tool is set up for any user to come in and draw their own conclusions from the data.
Some Other Features
Not all users will be interested in energy consumption data on a national scale. In fact, I anticipate that most users of a tool like this would be interested in getting information about one particular city (whether as a resident, prospective investor, or policy maker). To this end, a user can use the "City Explorer" to get a high-level overview of energy disclosure and use within a given city:
This city-specific view can provide a variety of insights, but for this blog, I'm going to focus on the violin plot in the bottom right-hand corner.
How do different property types use energy?
Using a violin plot, we can quickly see how buildings that have different uses (essentially, different property types) consume energy differently. A violin plot is similar to the more common box-and-whisker plot, but it also shows the shape of the distribution. On this plot, each building type is represented by a different color. The y-axis of this plot is energy use per square foot: the energy a building uses divided by its floor area. Normalizing by size in this way allows us to compare buildings of different sizes.
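Energy use per square foot is simple arithmetic, sketched here in Python. The function name and the sample figures are made up for illustration. They show how two buildings of very different sizes can land at the same point on the y-axis.

```python
def energy_use_per_sqft(site_energy_kbtu, floor_area_sqft):
    """Return annual energy use divided by floor area, in kBtu per square foot."""
    if floor_area_sqft <= 0:
        raise ValueError("floor area must be positive")
    return site_energy_kbtu / floor_area_sqft

# Made-up buildings: one small, one five times larger.
small_building = energy_use_per_sqft(2_500_000, 20_000)
large_building = energy_use_per_sqft(12_500_000, 100_000)
```

Both buildings come out at the same energy use per square foot even though the larger one uses five times as much energy in total, which is why the normalized value is the fairer comparison.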
Let's focus on the red Education property type on the left. In a violin plot, the width of a shape represents how much data falls within a given range being measured. In this case, the red Education plot is very wide towards the bottom of the shape, at an energy use per square foot value (the y-axis) of about 125 kBtu/ft². This part of the shape being very wide means that most educational buildings have similar energy use per square foot, so they are grouped together in the wide part of the plot. By contrast, the yellow Hospital shape shows that hospitals have greater variance in how much energy they use, as the yellow Hospital shape is long and narrow.
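To see why width tracks how many buildings share a value, here is a rough Python sketch that counts values in fixed-width bins. Real violin plots typically use a smoothed kernel density estimate, so this histogram is only a stand-in. The `bin_counts` helper and both sample lists are made up and are not taken from the city data.

```python
from collections import Counter

def bin_counts(values, bin_width):
    """Count values per bin [k * w, (k + 1) * w), keyed by each bin's lower edge."""
    return Counter(int(v // bin_width) * bin_width for v in values)

# Made-up energy use per square foot values (kBtu/ft2).
education = [110, 118, 121, 124, 126, 129, 131, 180]  # tightly clustered
hospital = [90, 150, 210, 270, 330, 390, 450, 510]    # widely spread

education_bins = bin_counts(education, 50)
hospital_bins = bin_counts(hospital, 50)
```

In this toy data, seven of the eight education values fall in the 100 to 150 bin, which would draw as one wide bulge. The hospital values land one per bin, which would draw as a long, narrow shape.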
What's Next?
Moving forward, this tool could be a starting point for an aggregated visualization platform for all 20+ cities that have passed energy benchmarking regulations across the country. Quite simply, buildings are complicated. Improving building efficiency is going to be a long process, requiring many actors and localized knowledge to drive actionable insights. In some cases, great research on building efficiency data has already been done. However, as more cities release energy data in the coming years, it will be increasingly important for end users to have an accessible platform to view building energy data. By enabling users to see when, where, and how our buildings use energy, this tool can help people understand whether and why their buildings are energy efficient.