Data Study on Electric Vehicles

Posted on Nov 28, 2016
Contributed by Wann-Jiun Ma. He is currently attending the NYC Data Science Academy Online Data Science Bootcamp program. This post is based on his second class project - Data Analysis and Visualization with Shiny.


Data suggests that electric vehicles (EVs) will become widespread in the near future, so it is interesting to examine the impact of large-scale integration of EVs on power grids. EV owners need to charge their vehicles daily to ensure there is enough power for the drive to work, which may place a huge electricity demand on the power distribution company.

It is interesting to see what may happen to power grids if millions of people charge their EVs during the same time period. If power plants cannot provide enough electricity to meet the demand of so many EVs at once, can people schedule their charging so that power plants do not hit their capacity limits? One main reason people consider buying an EV is to save on fuel costs. But do we end up paying more to charge an EV? Can we design a charging strategy that lowers the charging cost? There are many interesting questions to answer. Let's start our analysis!

Data Analysis and Visualization with Shiny

First, we summarize the locations of charging stations in the United States to get the big picture. The data is downloaded from Open Charge Map, which provides an API for users to consume and contribute data. We use this API to retrieve charging location information, and we agree to Open Charge Map's license terms. The data format is JSON, and we parse it to extract location information including latitude, longitude, address, etc. After data wrangling, we store the information in CSV format for later use. All code can be found at
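The parsing step can be sketched roughly as follows. This is a minimal illustration in Python rather than the project's own code: the sample records are made up, and the field names (`AddressInfo`, `Latitude`, `Longitude`, and so on) are assumptions modeled on the shape of an Open Charge Map response, so they should be checked against the API documentation.

```python
import csv
import io
import json

# Hypothetical sample shaped like an Open Charge Map response. The field
# names are assumptions and should be verified against the API documentation.
SAMPLE_JSON = """
[
  {"AddressInfo": {"Title": "Station A", "AddressLine1": "1 Main St",
                   "Town": "Austin", "StateOrProvince": "TX",
                   "Latitude": 30.2672, "Longitude": -97.7431}},
  {"AddressInfo": {"Title": "Station B", "AddressLine1": null,
                   "Town": "New York", "StateOrProvince": "NY",
                   "Latitude": 40.7128, "Longitude": -74.0060}},
  {"AddressInfo": null}
]
"""

FIELDS = ["title", "address", "town", "state", "latitude", "longitude"]


def extract_locations(records):
    """Return one flat row per station that has usable coordinates."""
    rows = []
    for record in records:
        info = record.get("AddressInfo") or {}
        lat, lon = info.get("Latitude"), info.get("Longitude")
        if lat is None or lon is None:
            continue  # a station without coordinates cannot be mapped
        rows.append({
            "title": info.get("Title") or "",
            "address": info.get("AddressLine1") or "",
            "town": info.get("Town") or "",
            "state": info.get("StateOrProvince") or "",
            "latitude": lat,
            "longitude": lon,
        })
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_locations(json.loads(SAMPLE_JSON))
csv_text = to_csv(rows)
print(csv_text)
```

Records with missing coordinates are dropped here because they cannot be placed on a map; whether to drop or repair them is a design choice that depends on the data.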

The interactive map is built with leaflet. The map groups charging stations by location. To find the address of a particular charging station, we can simply zoom in on the map. Within renderLeaflet, we also plot the number of stations in major cities.


It looks like LA has about twice as many charging stations as NYC! Austin (TX) also has more charging stations than NYC! For a much smaller city, residents of Austin seem to like EVs much more!

Impact on Power Load

Now, let us examine the impact of large-scale integration of EVs on the total power load. We download the New York State real-time load data from NYISO's website. According to Wikipedia: "An independent system operator (ISO) is an organization formed at the recommendation of the Federal Energy Regulatory Commission. In the areas where an ISO is established, it coordinates, controls and monitors the operation of the electrical power system, usually within a single US State, but sometimes encompassing multiple states." NYISO governs the operation of the electrical power system in New York State.


We build an interactive plot using Shiny to show the aggregation of the EV load and the base load. The base load is the power load without any EV charging. The time interval is 5 minutes, so there are 288 intervals in a day. We can use the Shiny sliderInput to select the number of EVs that need to be charged in NYC. Each EV requires about 36 kW of power to charge. To design a smart charging strategy, people should charge their EVs when the electricity price is low.
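The aggregation itself is simple arithmetic, sketched below in Python. The sketch assumes, purely for illustration, that the base load is given in MW, that every EV charges at the full 36 kW, and that all EVs charge during the same window; the flat 5000 MW base load is made up and stands in for the NYISO data.

```python
INTERVALS_PER_DAY = 24 * 60 // 5  # 288 five-minute intervals
EV_CHARGING_KW = 36.0             # per-EV charging power quoted in the post


def aggregate_load(base_load_mw, n_evs, start, end):
    """Add a flat EV charging block to the base load.

    Assumes (for illustration only) that all n_evs charge at full power
    during intervals start..end-1 and draw nothing otherwise.
    """
    if len(base_load_mw) != INTERVALS_PER_DAY:
        raise ValueError("expected one base-load value per 5-minute interval")
    ev_mw = n_evs * EV_CHARGING_KW / 1000.0  # convert kW to MW
    return [
        load + (ev_mw if start <= k < end else 0.0)
        for k, load in enumerate(base_load_mw)
    ]


# Made-up flat base load; real values would come from the NYISO load data.
base = [5000.0] * INTERVALS_PER_DAY
# Intervals 216..239 cover 6:00 PM to 8:00 PM.
total = aggregate_load(base, n_evs=100_000, start=216, end=240)
print(max(total))
```

In the Shiny app, `n_evs` would correspond to the value chosen with the slider.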

Price Data

So, let us visualize the real-time price data. We plot the real-time electricity price as a function of time of day (in 5-minute intervals). The real-time price data is downloaded in JSON format, and we parse it to extract the price as a function of time. After parsing, we store the information in CSV format for later use.


Generally speaking, the electricity price is relatively low from midnight to 3:00 AM, which may be the best time to charge your EV. There are two price peaks between the 200th and 250th time intervals (roughly 4:40 PM to 8:50 PM), which indicates that the power load peaks in the evening, and so does the real-time price. Thus, you probably don't want to charge your EV right when you come home after work!
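To relate interval indices to clock times and to locate a cheap charging window, a small helper like the one below can be used. It is a sketch under stated assumptions: intervals are counted from 0 at midnight, and the price series is made up to mimic a cheap night and an expensive evening, not taken from the real data.

```python
def interval_to_clock(k, minutes_per_interval=5):
    """Return the start time of interval k (0-based from midnight) as 'HH:MM'."""
    minutes = k * minutes_per_interval
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


def cheapest_window(prices, width):
    """Return the start index of the contiguous window with the lowest total price."""
    if not 0 < width <= len(prices):
        raise ValueError("width must be between 1 and len(prices)")
    window_sum = sum(prices[:width])
    best_start, best_sum = 0, window_sum
    for start in range(1, len(prices) - width + 1):
        # Slide the window one step: add the new value, drop the old one.
        window_sum += prices[start + width - 1] - prices[start - 1]
        if window_sum < best_sum:
            best_start, best_sum = start, window_sum
    return best_start


# Made-up prices: cheap until 3:00 AM, with an evening peak.
prices = [20] * 36 + [30] * 164 + [80] * 51 + [30] * 37
start = cheapest_window(prices, width=12)  # a one-hour charging window
peak = prices.index(max(prices))
print(interval_to_clock(start), interval_to_clock(peak))
```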

Using the interactive plot, we can visualize the aggregation of the EV charging load and the base load for different numbers of EVs, along with the variation of the real-time electricity price. Clearly, we should charge our EVs when the price is low, i.e., around midnight. Can a utility company design a pricing strategy that incentivizes customers to charge their EVs during a certain period of time? In the following, we propose a machine learning algorithm that schedules EV charging to reduce charging cost and improve the efficiency of power operation.

Charging Control of EVs Using Online Learning

Our idea is to use the flexible load capability offered by EVs owned by residential customers. Large-scale integration of EVs may impose a significant burden on the grid, leading to effects such as the creation of new peaks, peak load amplification and voltage deviations. To cope with these issues, many algorithms have been proposed to schedule the charging of EVs. In our formulation, we model the distribution power company and every EV customer as individual decision makers, each of whom wishes to optimize their own utility function. For the distribution power company, the payoff is maximized if the total load profile over a day is valley-filling.


A non-valley-filling total load profile (left) may overload the power plants at peak hours. On the other hand, a valley-filling load profile (right) does not overload power plants at peak hours. The distribution power company does not want to overload power plants, so it does not want a non-valley-filling total load profile. For the EV customer, on the other hand, the utility function is maximized if the cost to charge the EV over a day is minimized. By designing a suitable pricing policy, the distribution company aims to ensure that the aggregate charging profile adopted by the customers is valley-filling.

Distributed Charging Control Algorithm

Our distributed charging control algorithm is based on an online learning and online convex optimization framework. The online learning framework has become very popular in the online convex optimization and machine learning communities. Within this framework, we use a regret minimization algorithm. It uses regret as the performance measure and provides an iterative way for every decision maker to update its policy such that, at convergence, the policy is optimal in a suitably defined sense.
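For reference, the standard notion of regret in online convex optimization can be written as follows. The symbols are notation introduced here for illustration, not taken from the post: $f_t$ is the cost function revealed on day $t$, $x_t$ is the decision played on that day, $\mathcal{X}$ is the feasible set, and $T$ is the number of days.

```latex
\mathrm{Regret}_T \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x \in \mathcal{X}} \sum_{t=1}^{T} f_t(x)
```

That is, regret compares the accumulated cost of the decisions actually made with the cost of the best single fixed decision in hindsight. An algorithm is usually called no-regret when $\mathrm{Regret}_T$ grows sublinearly in $T$, so that the average regret $\mathrm{Regret}_T / T$ goes to zero.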

In particular, the objective of the distribution company is to achieve a total load profile that is valley-filling while ensuring that both the inflexible base load and the schedulable EVs are supplied with the required amount of energy. Thus, it wishes to obtain the aggregated charging profile that solves the optimization problem described below.
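To make the idea of valley-filling concrete, here is a minimal Python sketch of the classic water-filling computation. It is not the formulation from our paper: it assumes that only the total EV load summed over the day is constrained, and it ignores per-EV charging limits and deadlines. Under those assumptions, filling the valleys up to a common level gives the flattest possible total load, for example in the sense of the smallest sum of squared total load. The function name `valley_fill` and the numbers are hypothetical.

```python
def valley_fill(base_load, ev_total, tol=1e-9):
    """Return the aggregate EV load per interval that fills the valleys.

    Finds a level L with sum(max(0, L - b)) == ev_total by bisection,
    then charges max(0, L - b) in each interval with base load b.
    """
    if ev_total < 0:
        raise ValueError("ev_total must be nonnegative")
    # The filled amount is 0 at lo and at least ev_total at hi.
    lo, hi = min(base_load), max(base_load) + ev_total
    while hi - lo > tol:
        level = (lo + hi) / 2
        filled = sum(max(0.0, level - b) for b in base_load)
        if filled < ev_total:
            lo = level
        else:
            hi = level
    return [max(0.0, hi - b) for b in base_load]


# Made-up base load with a valley in the middle.
base = [5.0, 3.0, 1.0, 3.0, 5.0]
ev = valley_fill(base, ev_total=4.0)
total = [b + e for b, e in zip(base, ev)]
print([round(v, 3) for v in total])
```

The peaks at both ends receive no EV load, while the three middle intervals are raised to the same level.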


To incentivize the customers to choose charging profiles that in aggregate minimize the cost, the distribution company designs suitable pricing profiles for the energy being supplied to the EVs. Every EV customer fixes the charging schedule at the beginning of the day based on the information about its own constraints and any information provided by the distribution company.

Charging Cost

A price-sensitive EV customer seeks to minimize the total cost of charging by suitably shaping its charging schedule. Thus each customer wants to solve the optimization problem described below.
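As a simplified illustration of the customer's problem, suppose the price in each interval is known and fixed, and the only constraints are a required total amount of charge and a maximum charge per interval. These are assumptions made for this sketch; in our setting the price depends on what everyone does, which is why a learning algorithm is needed. Under these assumptions the cheapest schedule is found greedily by filling the cheapest intervals first. The function name and numbers are hypothetical.

```python
def cheapest_schedule(prices, energy_needed, max_per_interval):
    """Minimize sum(price * charge) subject to
    0 <= charge <= max_per_interval and sum(charge) == energy_needed."""
    if energy_needed < 0 or energy_needed > max_per_interval * len(prices):
        raise ValueError("the required energy cannot be delivered in one day")
    schedule = [0.0] * len(prices)
    remaining = energy_needed
    # Visit intervals from cheapest to most expensive.
    for k in sorted(range(len(prices)), key=lambda i: prices[i]):
        if remaining <= 0:
            break
        schedule[k] = min(max_per_interval, remaining)
        remaining -= schedule[k]
    return schedule


prices = [30.0, 10.0, 20.0, 50.0]
schedule = cheapest_schedule(prices, energy_needed=5.0, max_per_interval=3.0)
cost = sum(p * c for p, c in zip(prices, schedule))
print(schedule, cost)
```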


We adopt the optimistic mirror descent (OMD) algorithm to generate the charging profile update which minimizes the regret. On each day, the regret minimization algorithm generates the charging profile update without knowing the current objective function (and its gradient). Specifically, the OMD algorithm iteratively applies the updates as follows.
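The flavor of such an update can be illustrated with the Euclidean special case of optimistic mirror descent, sketched below in Python. This is not the exact update from our paper. The regularizer, step size `eta`, feasible set, and the choice of using yesterday's gradient as today's guess are all assumptions made for this sketch, and the cost is taken to be linear with made-up prices. On each day the schedule `x` is played using only a guess of the gradient; the true gradient is used afterwards to update the internal state `y`.

```python
def project(v, total, cap, iters=60):
    """Euclidean projection of v onto {x : 0 <= x_i <= cap, sum(x) == total}.

    The projection has the form clip(v_i - tau, 0, cap); tau is found by
    bisection because the clipped sum is nonincreasing in tau.
    """
    if not 0 <= total <= cap * len(v):
        raise ValueError("feasible set is empty")
    lo, hi = min(v) - cap, max(v)
    for _ in range(iters):
        tau = (lo + hi) / 2
        s = sum(min(cap, max(0.0, x - tau)) for x in v)
        if s > total:
            lo = tau
        else:
            hi = tau
    tau = (lo + hi) / 2
    return [min(cap, max(0.0, x - tau)) for x in v]


def step(y, direction, eta, total, cap):
    """One projected step from y against the given gradient (or gradient guess)."""
    return project([yi - eta * d for yi, d in zip(y, direction)], total, cap)


# Made-up problem: 4 intervals, linear cost <prices, x>.
prices = [3.0, 1.0, 2.0, 4.0]
total, cap, eta = 2.0, 1.5, 0.1

y = [total / len(prices)] * len(prices)  # internal state
guess = [0.0] * len(prices)              # gradient guess for the first day
for day in range(30):
    x = step(y, guess, eta, total, cap)  # play x without seeing today's gradient
    grad = prices                        # gradient revealed after the day
    y = step(y, grad, eta, total, cap)   # update the state with the true gradient
    guess = grad                         # optimistic guess for tomorrow
print([round(v, 3) for v in x])
```

Every played schedule stays feasible because of the projection, and with these fixed prices the schedule settles on charging as much as allowed in the cheapest interval and the remainder in the second cheapest.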


By implementing our charging control algorithm, we can show that the regrets of both the distribution company and the customers converge to zero. The following figure summarizes the workflow of our algorithm. Assume that the distribution company and its customers want to schedule EV charging on Wednesday, then


We consider a toy example with 20 EVs. The figure below shows that, by appropriately scheduling the EV charging, a valley-filling load profile is obtained.



We have presented a novel framework for distributed charging control of EVs using online learning, with Shiny for interactive data visualization. The proposed algorithm can be implemented without low-latency two-way communication between the distribution company and the EV customers, which fits the current communication infrastructure and protocols used in the smart grid. Readers interested in the math behind our algorithm can find the details in our journal paper on my website.

About Author

Wann-Jiun Ma

Wann-Jiun Ma (PhD Electrical Engineering) is a Postdoctoral Associate at Duke University. His research is focused on mathematical modeling, algorithm design, and software/experiment implementation for large-scale systems such as wireless sensor networks and energy analytics. After having exposed...