Watt's the Point: Data Tracking Building Energy in the US
We don't know enough about how buildings use energy. Granted, we know some things: on a large scale, the US Department of Energy's Energy Information Administration releases estimates of energy consumption at both the federal and state level, while at a smaller scale, energy modeling can estimate the energy consumption of any individual building. And yet, given that buildings consume almost 75% of all electricity generated in the US and account for nearly 40% of all greenhouse gas (GHG) emissions in the country, we know far too little about how buildings consume energy.
About a decade ago, progressive cities and states in the US began passing legislation that requires the owners of large buildings to measure and report building energy consumption on an annual basis. Cities then, in most cases, compile and release these data on open data portals. My project analyzed some of this publicly available data, exploring trends in building energy consumption and building a proof of concept to see if building energy data from across the country could be easily compiled into a central, easy-to-use repository.
Enter the Building Energy Dashboard:
Ideally, buildings in the United States would be very energy efficient. Wasted energy is, in essence, wasted money, and efficient buildings don't waste energy. Having energy-efficient buildings would save money for businesses and households alike, and reduce GHG emissions. And yet, buildings in the United States are not very efficient! Why are our buildings inefficient? It's complicated! Buildings vary in size, climate, use, and occupancy. It's really, really difficult to create a common standard of energy consumption for buildings to meet.
About 20 cities currently have regulations requiring building energy data transparency on the books, and roughly half of these cities have reached a point where they are releasing usable open data sets. Some cities have already used their energy transparency data to create awesome visualization tools (shoutout to Philadelphia's Mayor's Office of Sustainability for their excellent Building Benchmarking tool). However, most cities do not take the extra step to visualize their results, and instead release CSV files on their open data portals.
For my project, I attempted a proof of concept to see if data from multiple cities could be brought together to create a centralized tool for energy transparency data. My project involved downloading, cleaning, and combining data from San Francisco, New York City, and Washington, DC.
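To give a feel for the cleaning and combining step, here is a minimal sketch in Python using only the standard library. It is not the code behind the dashboard. The column names in `COLUMN_MAPS` are hypothetical stand-ins (each city's real export uses its own headers), and the sample CSV text is made up for illustration.

```python
import csv
import io

# Hypothetical per-city column names; the real portal exports use different headers.
COLUMN_MAPS = {
    "San Francisco": {"zip": "Postal Code", "sqft": "Floor Area", "eui": "Site EUI"},
    "New York City": {"zip": "Zip", "sqft": "Gross Floor Area", "eui": "Site EUI"},
    "Washington, DC": {"zip": "ZIP", "sqft": "Reported GFA", "eui": "SiteEUI"},
}


def to_float(value):
    """Return a float, or None for blanks and non-numeric placeholders."""
    try:
        return float(str(value).replace(",", "").strip())
    except ValueError:
        return None


def tidy_rows(city, year, csv_text):
    """Map one city's CSV export onto a shared schema, dropping unusable rows."""
    columns = COLUMN_MAPS[city]
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        sqft = to_float(raw.get(columns["sqft"]))
        eui = to_float(raw.get(columns["eui"]))
        if sqft is None or eui is None:
            continue  # skip rows with missing or text-only measurements
        rows.append({
            "city": city,
            "year": year,
            "zip": (raw.get(columns["zip"]) or "").strip()[:5],
            "sqft": sqft,
            "eui": eui,  # energy use per square foot
        })
    return rows


# Made-up sample export: the second row has no usable energy value.
sample = 'ZIP,Reported GFA,SiteEUI\n20001,"120,000",85.2\n20002,50000,Not Available\n'
combined = tidy_rows("Washington, DC", 2015, sample)
```

Running `tidy_rows` once per city and year and concatenating the lists gives one long table with one row per building per year, which is the shape the rest of the analysis assumes.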
Does Location Influence Building Efficiency? Is Location a Proxy for Other Building Characteristics?
After combining energy data from each city into a tidy format, I used leaflet to plot building energy data onto a map of each city. Building location was not always available, so I aggregated buildings by ZIP Code. Not having building location data did limit the analysis somewhat, but it still allowed me to see if different neighborhoods had noticeable differences in energy consumption. Each ZIP Code is represented on the map by a circle whose color represents how much energy an average building in the area used, and whose size represents the number of buildings that reported in that ZIP Code during a given year.
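The ZIP Code aggregation behind those circles can be sketched roughly as follows. This is an illustration rather than the app's code: it assumes each building record carries `city`, `zip`, `year`, and `eui` (energy use per square foot) fields, uses the mean as the "average building" measure, and runs on made-up records.

```python
from collections import defaultdict


def summarize_by_zip(buildings):
    """Group records by (city, zip, year); return mean EUI and building count."""
    groups = defaultdict(list)
    for b in buildings:
        groups[(b["city"], b["zip"], b["year"])].append(b["eui"])
    return {
        key: {"mean_eui": sum(values) / len(values), "n_buildings": len(values)}
        for key, values in groups.items()
    }


# Made-up records for illustration only.
buildings = [
    {"city": "Washington, DC", "zip": "20001", "year": 2015, "eui": 80.0},
    {"city": "Washington, DC", "zip": "20001", "year": 2015, "eui": 100.0},
    {"city": "Washington, DC", "zip": "20002", "year": 2015, "eui": 60.0},
]
summary = summarize_by_zip(buildings)
```

In a map like the one described, `mean_eui` would drive each circle's color and `n_buildings` its size.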
It is likely that location does not cause buildings to be energy efficient or not. More likely is that buildings in a given neighborhood tend to cluster in their age, size, and use, all of which tend to influence building energy consumption. So, while the mapped portion of my tool was interesting to look through, it was most useful as a starting point to identify possible trends, which I would explore in another area of my app.
Can we identify building efficiency trends on a city-wide level?
Each year, more cities release data on building energy consumption. This growth of open data is generally positive. However, it also creates a need for user-friendly tools to help people explore, understand, and identify trends within large data sets.
Using the second part of my app, the simply named Data Explorer, users can visualize data themselves using custom parameters.


A scatterplot of building size vs energy intensity. Each point is a building from either DC or San Francisco.
One issue I encountered when plotting these data (about 100,000 observations in total, across the 5 years of data collected from 3 cities) was that the data appeared "clumped." That is, while the data spanned a large range of values, the majority of observations fell within a much smaller range. When thinking about buildings, this is intuitive: while there are some very large buildings (airports and skyscrapers) and big energy users (manufacturing facilities and data centers), most buildings fall into a small range of sizes and uses. To help identify trends among these clumped data, I added the ability to overlay a trendline for each city displayed in the Data Explorer.
What can we tell from this? I have my ideas, but more importantly, the tool is set up for any user to come in and draw their own conclusions from the data.
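For readers curious how a per-city trendline like this could be computed, here is a minimal sketch. It assumes a straight-line, ordinary least squares fit of energy intensity against building size; the trendline in the app may well use a different smoother. The field names and sample records are made up for illustration.

```python
from collections import defaultdict


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("need at least two distinct x values to fit a line")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def trendlines_by_city(records):
    """Fit one line per city: building size (sqft) vs energy intensity (eui)."""
    by_city = defaultdict(list)
    for r in records:
        by_city[r["city"]].append((r["sqft"], r["eui"]))
    return {city: fit_line(points) for city, points in by_city.items()}


# Made-up records for illustration only.
records = [
    {"city": "Washington, DC", "sqft": 100_000, "eui": 80.0},
    {"city": "Washington, DC", "sqft": 200_000, "eui": 90.0},
    {"city": "Washington, DC", "sqft": 300_000, "eui": 100.0},
    {"city": "San Francisco", "sqft": 50_000, "eui": 60.0},
    {"city": "San Francisco", "sqft": 150_000, "eui": 40.0},
]
lines = trendlines_by_city(records)
```

Because a handful of very large buildings can dominate a fit like this, fitting on a log scale or trimming extreme values are reasonable variations to try on clumped data.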
Some Other Features
Not all users will be interested in energy consumption data on a national scale. In fact, I anticipate that most users of a tool like this would be interested in getting information about one particular city (whether as a resident, prospective investor, or policymaker). To this end, a user can use the "City Explorer" to get a high-level overview of energy disclosure and use within a given city:


A City-Level View of Washington, DC
This city-specific view can provide a variety of insights, but for this blog, I'm going to focus on the violin plot in the bottom right-hand corner.


A violin plot for Washington, DC. Hospitals have the greatest variety in energy intensity, while offices and educational buildings have more uniform energy consumption.
How do different property types use energy?
Using a violin plot, we can quickly see how buildings that have different uses (essentially, different property types) consume energy differently. A violin plot is similar to the more common box-and-whisker plot. On this plot, each building type is represented by a different color. The y-axis is energy use per square foot of floor area, which allows us to compare buildings of different sizes.
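As a quick illustration of why the per-square-foot measure matters, here is a small sketch of the calculation: annual energy use divided by floor area. The function name and the building figures are made up for illustration.

```python
def energy_use_intensity(annual_kbtu, floor_area_sqft):
    """Energy use per square foot (kBtu/ft2): annual energy divided by floor area."""
    if floor_area_sqft <= 0:
        raise ValueError("floor area must be positive")
    return annual_kbtu / floor_area_sqft


# Made-up buildings: the office uses far more energy in total...
small_school = energy_use_intensity(annual_kbtu=6_250_000, floor_area_sqft=50_000)
large_office = energy_use_intensity(annual_kbtu=40_000_000, floor_area_sqft=500_000)

# ...but less per square foot once size is accounted for.
office_is_less_intense = large_office < small_school
```

In this example the larger building is the bigger consumer overall but the less intensive one per square foot, which is the kind of comparison the y-axis makes possible.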
Let's focus on the red Education property type on the left. In a violin plot, the width of a shape represents how much data falls within a given range being measured. In this case, the red Education plot is very wide towards the bottom of the shape, at an energy use per square foot value (the y-axis) of about 125 kBtu/ft². Because the shape is so wide there, we can tell that most educational buildings have similar energy use per square foot: they are grouped together in the wide part of the plot. By contrast, the yellow Hospital shape is long and narrow, which shows that hospitals vary much more in how much energy they use.
What's Next?
Moving forward, this tool could be a starting point for an aggregated visualization platform for all 20+ cities across the country that have passed energy benchmarking regulations. Quite simply, buildings are complicated. Improving building efficiency is going to be a long process, requiring many actors and localized knowledge to drive actionable insights. In some cases, great research on building efficiency data has already been done. However, as more cities release energy data in the coming years, it will be increasingly important for end users to have an accessible platform to view building energy data. By enabling users to see when, where, and how our buildings use energy, this tool can help people understand whether their buildings are energy efficient, and why.


Cities that have passed building benchmarking regulations as of 12/2016. Image from the Institute for Market Transformation, a non-profit based in Washington, DC.