NYC Data Science Academy| Blog

R Shiny Global Power Plant Tracker

Gary Simmons
Posted on Oct 20, 2022

Introduction

The question of how energy should be produced has been hotly contested across US administrations. Some groups want to maintain traditional fossil fuel sources like oil, gas, and coal, while others push for renewable sources like wind and solar. There are also those who have considered going nuclear. Arguments center on which option is more environmentally friendly, which is more cost effective, and other questions. For this project I focus on one question: "Which type of power plant produces more energy?"

To that end, I created the Global Power Plant Shiny App, which surveys a country's energy production over the five-year period from 2013 to 2017. It also displays a count of plants per type and a map of the global distribution of plants identified by type. The data source is the Global Power Plant Database from the World Resources Institute, dated June 2021. The app is divided into two pages. The first is a country-based page displaying three figures: one showing total energy produced over the five-year span, another illustrating how each power plant type's production fluctuates over time, and a third counting the number of plants of each type. The second is a type-based page, where users can see the locations of power plants of a certain type all over the world.

Using this app, I answer the question above by comparing wind versus oil for the United States. With the figures mentioned above, I want to answer the following:

  • Which power plant type produced the most energy?
  • How does energy production fluctuate over time?
  • How many plants of each type are built?
  • Where are the wind and oil plants located in the United States?

For added speculation, I investigated where oil and wind plants are located in other countries. Even though energy efficiency alone should not be the sole factor in choosing a power plant type, these five questions could influence where a country should focus its efforts in producing energy.

Methods

The diagram above explains the process of how each figure is made once the user makes their selection.

I. Data Clean Up

After downloading the dataset, I replaced NaN values with zeros. I then removed all power plants that reported no energy data, neither actual nor estimated.
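The clean-up step can be sketched as follows. This is a minimal illustration in plain Python, not the app's R code: the column prefixes `generation_gwh_` and `estimated_generation_gwh_` are assumptions about the database layout, and the rows are made up.

```python
import math

# Column prefixes are assumptions about the database layout, not taken from the app.
ENERGY_PREFIXES = ("generation_gwh_", "estimated_generation_gwh_")


def is_missing(value):
    """Return True for None or a floating-point NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def clean(plants):
    """Zero-fill missing energy values, then drop plants with no energy data."""
    cleaned = []
    for plant in plants:
        row = dict(plant)
        energy_keys = [key for key in row if key.startswith(ENERGY_PREFIXES)]
        for key in energy_keys:
            if is_missing(row[key]):
                row[key] = 0.0
        # Keep the plant only if at least one energy value is non-zero.
        if any(row[key] != 0 for key in energy_keys):
            cleaned.append(row)
    return cleaned


# Made-up rows for illustration only.
plants = [
    {"name": "A", "primary_fuel": "Wind",
     "generation_gwh_2013": float("nan"), "estimated_generation_gwh_2013": 12.0},
    {"name": "B", "primary_fuel": "Oil",
     "generation_gwh_2013": None, "estimated_generation_gwh_2013": float("nan")},
]
cleaned = clean(plants)
```

Note that under this approach a plant that genuinely reported zero output in every column would be dropped as well, since zeros and missing values become indistinguishable after the fill.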

II. Preparing cleaned data for figures

Method sections II.a-II.c require one to pick a country and filter the power plants belonging to that country. This new data frame can then be grouped by power plant type.

II.a) Energy Accumulated between 2013-2017

The dataset has estimated energy production for each power plant from 2013 to 2017 and actual energy production from 2013 to 2019. Even after removing non-reporting power plants, some power plants had only estimated data, others had only actual data, and the rest provided both. To keep things simple, I looked at 2013-2017 data, and for each year I added the actual value and the estimated value for each type and divided the sum by two. If a power plant provided both actual and estimated values, the result is the mean of the two, which differs from the actual value by some factor. If the power plant provided only actual or only estimated data, the result is off by a factor of 0.5, because the missing value was filled with zero. (The reader may recall a favorite STEM factor/magnitude joke at this point.)
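To make the factor of 0.5 concrete, here is a small sketch of the averaging in plain Python (the app itself is written in R). The function name and the numbers are hypothetical; missing values are assumed to have already been replaced with zero.

```python
def blended_energy(actual, estimated):
    """Average the actual and estimated values, as described above."""
    return (actual + estimated) / 2


# Made-up values for illustration; a missing value is represented by 0.0.
cases = {
    "both": (100.0, 80.0),
    "actual_only": (100.0, 0.0),
    "estimated_only": (0.0, 80.0),
}
blended = {label: blended_energy(a, e) for label, (a, e) in cases.items()}
# blended == {"both": 90.0, "actual_only": 50.0, "estimated_only": 40.0}
```

When only one value is present, the result is exactly half of it, which is why totals from this calculation are best compared within a single country rather than across countries.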

In hindsight, all these estimated recalculations could have been avoided by choosing one column and inserting the other column's value, on condition that the chosen column's value is null. Nevertheless, this app uses the calculation previously mentioned. This app should be used for analyzing the power plant energy production within a selected country and not for comparing energy values with other countries.
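The alternative described here amounts to a coalesce: take the chosen column's value, and fall back to the other column only when the chosen one is null. A minimal sketch in plain Python, with a hypothetical helper name and made-up values, assuming missing values are still represented as `None` rather than zero:

```python
def coalesce(preferred, fallback):
    """Return the preferred value unless it is missing (None), else the fallback."""
    return fallback if preferred is None else preferred


# (actual, estimated) pairs; made-up values for illustration.
readings = [(100.0, 80.0), (100.0, None), (None, 80.0)]
energies = [coalesce(actual, estimated) for actual, estimated in readings]
# energies == [100.0, 100.0, 80.0]
```

Unlike the averaging approach, a plant with only one reported value keeps that value in full.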

For this figure I sum the newly calculated energy values for each type grouping in each year column. I then sum the year columns together to get the total energy produced over the five-year period for each power plant type in the selected country.
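The two-stage sum can be sketched like this in plain Python. The data layout (an `energy` dictionary keyed by year on each plant) and the function names are assumptions made for the illustration; only `primary_fuel` is a column name mentioned in this post.

```python
from collections import defaultdict


def energy_by_type_and_year(plants, years):
    """Stage 1: sum energy per fuel type within each year."""
    sums = defaultdict(lambda: defaultdict(float))
    for plant in plants:
        for year in years:
            sums[plant["primary_fuel"]][year] += plant["energy"][year]
    return sums


def total_by_type(sums):
    """Stage 2: add the year columns together for each fuel type."""
    return {fuel: sum(by_year.values()) for fuel, by_year in sums.items()}


# Made-up plants and values, two years only to keep the example short.
plants = [
    {"primary_fuel": "Wind", "energy": {2013: 1.0, 2014: 2.0}},
    {"primary_fuel": "Wind", "energy": {2013: 3.0, 2014: 4.0}},
    {"primary_fuel": "Oil", "energy": {2013: 0.5, 2014: 0.5}},
]
sums = energy_by_type_and_year(plants, years=[2013, 2014])
totals = total_by_type(sums)
# totals == {"Wind": 10.0, "Oil": 1.0}
```

The intermediate per-year sums from stage 1 are the same values that the line plot in the next section displays.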

II.b) Energy Produced Over The Years per Type

To acquire this figure, I followed the same procedure as Methods II.a, except that instead of adding the values over the five-year period, I transformed the grouped data frame to a regular data frame, added year and type labels, and transposed the data frame so that years are columns and types are rows. After the transposing and labeling are done, I use a ggplotly line plot to display how energy production evolves over the five-year time span. The major benefit of ggplotly line plots is that one can click on lines in the legend to hide or show certain curves on the plot. For this wind versus oil comparison, I toggled off all other plant type curves and left only the oil and wind curves on.

II.c) Number of Plants Per Type

To get the number of plants per type, I simply count the number of plants of each type from the grouped data frame.
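As a sketch, the count is a one-line tally over the `primary_fuel` column. This is plain Python with made-up rows, not the app's R code:

```python
from collections import Counter


def plants_per_type(plants):
    """Count how many plants there are of each primary fuel type."""
    return Counter(plant["primary_fuel"] for plant in plants)


# Made-up rows for illustration only.
plants = [
    {"primary_fuel": "Wind"},
    {"primary_fuel": "Wind"},
    {"primary_fuel": "Oil"},
    {"primary_fuel": "Solar"},
]
counts = plants_per_type(plants)
# counts["Wind"] == 2, counts["Oil"] == 1, counts["Solar"] == 1
```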

II.d) Power Plant Distributions by Fuel Type

This data set included the latitudes and longitudes of each plant. After removing plants that did not contribute data, I filtered by type and plotted the coordinates with a leaflet map with the following parameters: popup = ~primary_fuel, label = ~country_long, clusterOptions = markerClusterOptions(). The last parameter is important because it gathers all the power plants into clusters. Users hover over the cluster and see a highlighted area that encompasses the power plants that cluster represents.

Results

I. Energy Accumulated between 2013-2017

 

The energy accumulated by the different United States power plant types ranges from a few GWh generated by storage to several TWh generated by gas. In the oil versus wind comparison, wind outperforms oil by half an order of magnitude (i.e., wind production ~ oil production x 10^0.5).

II. Energy Produced Over The Years per Type

With all other power plant types hidden via the ggplotly legend, it can be seen that wind energy production increased steadily over the five-year period of interest. Oil, on the other hand, experienced a 60% dip in energy production between 2014 and 2016, but then tripled within a year after 2016. Nevertheless, wind outperforms oil by over an order of magnitude and a half (i.e., wind ~ oil x 10^1.5).
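Comparisons written as "x 10^n" can be converted to plain multiplicative factors, and back, with a base-10 logarithm. The sketch below is plain Python with hypothetical helper names; the 1000 and 100 inputs are placeholders, not values from the app.

```python
import math


def orders_of_magnitude(a, b):
    """How many powers of ten separate a from b."""
    return math.log10(a / b)


def factor_from_orders(orders):
    """Convert a number of orders of magnitude into a plain factor."""
    return 10 ** orders


half_order = factor_from_orders(0.5)          # about 3.16
order_and_half = factor_from_orders(1.5)      # about 31.6
example = orders_of_magnitude(1000.0, 100.0)  # placeholder inputs, one order apart
```

So half an order of magnitude corresponds to a factor of roughly 3, and an order and a half to a factor of roughly 32.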

III. Number of Plants Per Type

In the United States, petcoke has fewer than 10 power plants. Solar is far ahead with over 10^3.5, or just above 3100, plants. In the oil versus wind comparison, wind slightly outnumbers oil, but both have about 1000 plants.

IV. Powerplant Distributions by Fuel Type

In the United States, most oil plants are found in the Great Plains and the Mid-Atlantic regions.

The three largest clusters of wind plants can be found around the Oklahoma Panhandle (hovering over the grouped marker shows that this region includes North Texas, almost all of Oklahoma and Nebraska, East Colorado, and East New Mexico), the Great Plains, and the Great Lakes.

Amongst global clusters, most oil clusters are found in the Americas.

There are high clusters of wind plants in the United States, Brazil, India, and China, but the largest wind cluster is found in Europe.

Discussion

The plant counts show that although there are more wind plants than oil plants, they are close in number at about 1000 plants each. However, within the five-year period wind produced about 10 times more energy than oil did. It should be noted that oil energy production dropped by 60% over the two years prior to 2016 and then tripled within a year after 2016, which might be correlated with a particular political shift in the United States government. It should also be noted that wind energy increased steadily without much turbulence.

Looking at the United States map page, about half of the oil plants are in rural areas and half are in urban areas. Companies, including power plants, still run on funding from the areas they serve and on their budgets. As a thought experiment, let's say these power plant companies either fully shut down to 0% productivity or fully spool up to 100% productivity based on finances. Even if half the oil plants went down, or conversely the number of fully operational oil plants doubled, that would change oil output by only a factor of two, which does not explain why oil underperformed wind by a factor of 10. When hovering over the three wind cluster markers, the highlighted areas encompass states from across the political spectrum, suggesting bipartisan support.

Globally, plant locations correlate geographically with where one can find the resource. I was expecting more oil plants to be found in the Middle East, though. There are larger clusters of plants where North Americans and Europeans are still hunting for untapped resources. The low number of oil plants in the Middle East could also be due to the removal of plants that did not provide actual or estimated energy production numbers. Wind, on the other hand, appears to be ubiquitous, and first-world countries, the two most populous countries, and Brazil are taking advantage of that fact.

Average Annual Wind Speeds of the United States at 30 m according to NREL

However, how ubiquitous is wind? There are areas in the country where wind is so abundant that tornadoes are a concern and other areas where one cannot fly a kite. The power generated by wind is proportional to the wind velocity cubed multiplied by the area of the rotor and the air density. For simplicity, we'll use the air density at sea level, which is about 1.24 kg/m^3 according to the Engineering Toolbox. The average wind turbine blade is 116 ft, or approximately 35.36 meters, according to Utility Dive. The figure in Results II indicates that all United States wind plants together generated 210k +/- 50k MWh per year on average between 2013 and 2017. With about 1000 wind plants, this means that each plant generates 210 +/- 50 MWh per year on average, which is roughly 24 +/- 6 kW per plant. Knowing the power each plant generates, the air density, and the average blade size, one can calculate that each plant would need the wind to be blowing at roughly 2.14 m/s, or 4.79 miles per hour. According to the NREL map above showing average annual wind speeds at an elevation of 30 m, wind turbine energy would do best in the Midwest and in the yellowish areas east and west of the American Heartland.
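The arithmetic above can be retraced with a short calculation. This sketch is plain Python and rests on assumptions: it takes the standard wind power relation P = 0.5 x air density x swept area x v^3, uses the blade length as the rotor radius, and applies no efficiency factor. Those choices reproduce the figures quoted in the text, but they are an inference about the calculation rather than a confirmed description of it.

```python
import math


def required_wind_speed(power_w, blade_length_m, air_density):
    """Solve P = 0.5 * rho * A * v**3 for v, with A the area swept by the blades."""
    swept_area = math.pi * blade_length_m ** 2
    return (2 * power_w / (air_density * swept_area)) ** (1 / 3)


# Inputs quoted in the text.
annual_energy_mwh = 210.0         # average energy per plant per year
air_density = 1.24                # kg/m^3, value used in the text
blade_length_m = 116 * 0.3048     # 116 ft converted to meters

# Convert annual energy to average power in watts.
hours_per_year = 365 * 24
power_w = annual_energy_mwh * 1e6 / hours_per_year

speed_ms = required_wind_speed(power_w, blade_length_m, air_density)
speed_mph = speed_ms / 0.44704    # 1 mph = 0.44704 m/s

print(f"average power: {power_w / 1000:.1f} kW")
print(f"required wind speed: {speed_ms:.2f} m/s ({speed_mph:.2f} mph)")
```

Because speed enters as a cube, the result is not very sensitive to the inputs: doubling the power per plant raises the required wind speed by only about 26%.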

Conclusion and Future Works

Worldwide there is a larger number of wind plants than oil plants. In the United States, though, there are about the same number of wind-based power plants as oil-based power plants. In both the time-based line plot and the total accumulated energy bar chart, wind energy production outperforms oil energy production by a factor of ten. As a reminder, this blog is only a comparison of wind and oil in the United States, but the app can be used for different countries and different power plant fuel types.

If I were to continue this project, I would consider adding a boxplot, like the one above, describing the average energy accumulated over the five year investigation period. Also, this study focused on wind versus oil; however, visitors to the app might want to apply the same analytical thought process to coal versus solar.

I originally thought the similar number of wind and oil power plants was due to party oscillation between presidential administrations. However, further conversations with those who reviewed my work revealed that although wind is more ideal than oil, oil is more practical than wind. Those concerned about climate change have expressed that fossil fuels, like coal, are more affordable and mobile than renewable energy in both Europe and the United States, produce fewer climate change problems than electric cars, and kill fewer airborne animals than wind. In fact, nuclear energy is cleaner and more cost efficient than solar or wind.

With that said, energy output is not the only factor in deciding how to fund energy production. Funding sources and worker welfare are also major factors in deciding what plants are built, promoted, demoted, and destroyed. This is because leaders (e.g., politicians and CEOs) depend on the health and productivity of their working class and the backing of their supporters. Thus, when wondering why a region or country is funding a certain energy worldview, look at all the factors: numerical, environmental, and social.

The Global Power Plant Tracker app is published on https://ggsglobalpowerplanttracker.shinyapps.io/globalpowerplantshinyapp/

References

  • Featured Image (Coal Power Plant): Photo 8853899 / Powerplant © Danicek | Dreamstime.com
  • Global Power Plant Database: https://datasets.wri.org/dataset/globalpowerplantdatabase
  • Power of the wind: Cube of Wind Speed: http://xn--drmstrre-64ad.dk/wp-content/wind/miller/windpower%20web/en/tour/wres/enrspeed.htm 
  • Engineering Toolbox: US Standard Atmosphere vs Altitude: https://www.engineeringtoolbox.com/standard-atmosphere-d_604.html 
  • Utility Dive: "Wind turbine blade sizes and transport: A guide": https://www.utilitydive.com/spons/wind-turbine-blade-sizes-and-transport-a-guide/623444/#:~:text=On%20average%20wind%20turbine%20blades,blades%20approaching%20200%20feet%20long.
  • U.S. Average Annual Wind Speed at 30 Meters: https://windexchange.energy.gov/maps-data/325
  • Germany Is Dismantling a Wind Farm To Make Way for a Coal Mine: https://oilprice.com/Latest-Energy-News/World-News/Germany-Is-Dismantling-A-Wind-Farm-To-Make-Way-For-A-Coal-Mine.html 
  • Advantages and Challenges of Wind Energy: https://www.energy.gov/eere/wind/advantages-and-challenges-wind-energy 
  • Electric Cars and the Coal that Runs Them: https://www.washingtonpost.com/world/electric-cars-and-the-coal-that-runs-them/2015/11/23/74869240-734b-11e5-ba14-318f8e87a2fc_story.html 
  • Wind Farms Kill 10-20 Times More Previously Thought: https://windmillskill.com/blog/windfarms-kill-10-20-times-more-previously-thought
  • 3 Reasons Why Nuclear is Clean And Sustainable: https://www.energy.gov/ne/articles/3-reasons-why-nuclear-clean-and-sustainable

Github Link

  • https://github.com/GGSimmons1992/globalPowerPlantShinyApp

About Author

Gary Simmons

Open-minded and tenacious data scientist and machine learning programmer familiar with large dataset analysis, Angular user interface enhancement, .NET Core REST API problem solving, and relational database management. My Applied Physics BS, Physics MS, and software development background...

