Energy Savings in
the Data Center by Smart Controls on Load-Balancing
Daniel Nissen nisse079
Senior Citizen Education Program
College of Continuing and Professional Studies
University of Minnesota
Major: Undeclared
BBE2201
April 24, 2018
This
paper reviews the literature on data center energy savings from redirecting
services between data centers in different geographical locations. The studies reviewed discuss the decision-making process for allocating computational load to data centers, and the resulting energy savings.
Executive Summary
Introduction
Data centers use energy for computing, cooling, lighting, and other activities. They consume around 70 billion kWh per year, or around 2% of total electricity use in the US. That energy use stabilized from 2010 to 2014 as technology in the form of new chips, new storage devices, and virtual machines was introduced. The large companies in this space have introduced renewable energy in several forms, but still consume considerable amounts of electricity from the grid. The studies reviewed in this paper focus on using more renewable energy through more intelligent assignment of computational tasks to data centers, and on predicting how much energy from renewable sources will be available to do computational work in data centers.
Discussion
Many cloud service providers operate multiple data centers in separate geographical locations and can move work between them nearly at the speed of light. By choosing carefully, an operator can allocate more work to data centers with available renewable energy and less to data centers running on brown electricity (electricity typically generated from coal or natural gas). Data centers are usually overprovisioned with servers, so some of them can be put to sleep, using little energy. When needed, they can be quickly awakened and sent the data required to process requests. The studies reviewed here start by defining algorithms that decide how much renewable energy is likely to be available at each data center, and then which tasks to assign to which data center. Other studies examine how to predict wind and solar power availability; those forecasts can feed the allocation models in the first group of studies.
Future Predictions
Data center management will become more intelligent and complex as more non-dispatchable energy is integrated into the grid and installed by data center operators. The cost savings can be considerable and visible. Greenhouse gas reduction is also increasingly driving decisions about which cloud provider to use.
Introduction and Background
The cloud computing revolution has led to data centers that use large amounts of energy, about 2% of all electricity use in the US (Sverdlik, 2016; Lawrence Berkeley National Laboratory, 2016). That energy use stabilized from 2010 to 2014 (the latest data) at about 70 billion kWh, with improved efficiency balancing the large growth in computational capacity. Only a small part of this energy is actually used by the chips to do computing; much of the rest goes to cooling and power distribution. Only about 14% of the capacity of most servers is actually used for computation (E. Source Companies LLC, 2016). That percentage is rising as load-balancing algorithms improve and as virtual machines and containers consolidate work onto fewer servers.
Data center operators are implementing many measures to reduce energy use. All energy used by the server computers is eventually converted to heat, so any reduction in server energy use is reflected in a reduction in cooling costs. For every 1 kWh the servers need, about 0.6 kWh is needed to cool them (E. Source Companies LLC, 2016).
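As a rough illustration of that ratio (the monthly figure below is hypothetical, not from the source), the cooling overhead and total draw for a given server load follow directly:

```python
# Rough arithmetic for the ~0.6 kWh of cooling per 1 kWh of server load
# cited above (E. Source Companies LLC, 2016); the load figure is made up.
server_kwh = 1000.0    # assumed monthly server energy use
cooling_ratio = 0.6    # cooling kWh required per server kWh
cooling_kwh = server_kwh * cooling_ratio
total_kwh = server_kwh + cooling_kwh
print(cooling_kwh, total_kwh)  # 600.0 1600.0
```

Because cooling scales with server load, each 1 kWh saved at the servers saves about 1.6 kWh overall.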
Google and Apple have both announced that they have
Power Purchase Agreements or own their own renewable energy sources to power
their needs for electricity (Google, Inc., n. d.; Sverdlik, Y, 2013). But this does not mean they are always using
renewable energy, only that the total capacity of renewables they have
available matches or exceeds their requirements over time, maybe over a month
or more. Due to the non-dispatchable
nature of renewable energy sources, they are using grid power at times that comes
from other sources. The research I am
reviewing here leads to more use of renewable sources by moving work to
locations more likely to have currently available renewable energy capacity.
The stabilization of the energy use by data centers
in an era of increasing computational load is driven by improved
efficiency. A review of some commonly used techniques that are well known in the field is worthwhile here. A server in a data center uses energy at a relatively constant rate while it is powered up and running, whether or not it is actually performing users' work.
Servers can be put to sleep when not needed, which significantly reduces
the energy usage. Disk drives can be
spun down and up, and there are studies of how to position data on the drives
to allow more time in the spun down state, significantly reducing energy
use. There is much literature on
consolidating applications on fewer, busier servers, using virtual machines and
containers to allow multiple application systems to coexist in a single hardware
box. Another way to reduce energy use is to control the temperature of the server components more precisely and allow the ambient temperature around the servers to rise. Yet another is to use alternative cooling methods, such as locating the data center in a cold climate or using geothermal cooling. All of these techniques reduce overall energy use in the data center, and many are now standard practice (E. Source Companies LLC, 2016).
The cost of energy for data centers varies widely due to different energy sources and to the availability of natural cooling in colder climates. The marginal cost of energy may be significantly higher than the base-load cost if that energy comes from the grid instead of from local renewable sources (solar, geothermal, or wind) (E. Source Companies LLC, 2016).
Availability of renewable energy varies significantly between locations due to the amount of installed capacity and the weather, even over relatively short distances between data centers (Toosi, Qu, Assunção, & Buyya, 2017).
Discussion of the Issue
One goal in reducing our carbon footprint is to maximize the use of renewable energy and minimize the use of fossil fuel energy. The other major driver in examining energy use is cost.
The research I’m focusing on discusses various
methods for moving workloads between geographical areas to increase the
percentage of power used that comes from renewable sources. Some of this research explores the use of
forecasting methods to predict the availability of wind and/or solar power in
each data center location over the next time period. Such forecasts would allow a cluster of overprovisioned data centers to decide which servers to put to sleep and which to use for the computational load over the next period. Another approach monitors the availability of renewable power and periodically adjusts the load to match it, in a reactive manner. The models in this research are based either on simulations or on actual data captured from data centers. None of these papers describes a production deployment, though one validates its approach on a research testbed.
Predicting the availability of wind power is, for the most part, predicting the wind speed in a particular region. Predicting the availability of solar power means predicting the cloud cover in a region, as well as knowing sunrise and sunset times there.
Using a reactive algorithm requires access to a
source of renewable power availability (in watts) and knowledge of how many
watts are needed to handle a particular load.
Khosravi and Buyya (2017) use a Gaussian Mixture Model (GMM) to predict short-term availability of renewable energy. The model uses current and previous availability figures for renewable energy to predict future availability. The researchers did not implement a test system; instead, they rely on historical data from the National Renewable Energy Laboratory (NREL) and workload demand from Amazon Web Services (AWS) to train their model. The GMM was able to predict up to 15 minutes ahead, with a 98% chance of falling within ±10% of the actual values.
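A rough sketch of this kind of short-term GMM predictor follows; the synthetic data, single-lag structure, and component count are all assumptions for illustration, not details from the paper. A mixture is fitted over (current, next) availability pairs, and the next value is predicted as the mixture's conditional mean:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-in for NREL-style availability data: a slowly varying
# renewable supply signal sampled every 15 minutes (values are fabricated).
t = np.arange(500)
supply = 50 + 30 * np.sin(2 * np.pi * t / 96) + rng.normal(0, 3, t.size)

# Pair each observation with its successor and fit a 2-D mixture.
pairs = np.column_stack([supply[:-1], supply[1:]])
gmm = GaussianMixture(n_components=4, random_state=0).fit(pairs)

def predict_next(x):
    """Conditional mean E[next | current = x] under the fitted mixture."""
    total_w, est = 0.0, 0.0
    for w, mu, cov in zip(gmm.weights_, gmm.means_, gmm.covariances_):
        var = cov[0, 0]  # component's marginal variance on the current value
        dens = np.exp(-0.5 * (x - mu[0]) ** 2 / var) / np.sqrt(2 * np.pi * var)
        cond_mean = mu[1] + cov[1, 0] / var * (x - mu[0])
        total_w += w * dens
        est += w * dens * cond_mean
    return est / total_w

print(round(predict_next(supply[-1]), 1))  # forecast for the next interval
```

A real predictor would use more lags and tune the number of components; the conditioning step, however, is the core of turning a fitted GMM into a forecast.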
Toosi, A. N., Qu, C.,
Assunção, M. D., & Buyya, R. (2017) describe another reactive model for
load-balancing between data centers.
These researchers built a model for the allocation of workload to data
centers with available renewable power, and used it in a real environment (Grid'5000
in France), as well as a simulation using traffic traces for English
Wikipedia. The model is an online algorithm, meaning it has no knowledge of future demand or of future renewable energy availability. They introduce a
global load balancer that routes requests to data centers where local
load-balancers assign requests to individual servers. Auto-scalers control whether to put servers
to sleep or wake them up. The Green Load Balancing (GreenLB) policy is fairly straightforward: considering the available renewable energy, the price of brown energy at each data center, and the load thresholds each data center can handle, it assigns work to the lowest-cost data center, treating renewable energy as free. The policy always keeps at least one server available at each data center, so it may use some brown energy for that server even while other data centers still have renewable energy available. Compared with the usual round-robin algorithm and another optimization technique, GreenLB reduced cost by 22% and 8%, and brown energy use by 17% and 8%, respectively.
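A minimal sketch of this kind of dispatch decision might look like the following; the site data, field names, and prices are hypothetical, and the actual GreenLB policy additionally manages auto-scaling and keeps one server awake per site:

```python
# Hypothetical sketch of a GreenLB-style routing decision: price only the
# brown remainder of each site's load, treating renewable watts as free.
def pick_site(sites, request_watts):
    """Route a request to the cheapest site that has capacity for it."""
    best, best_cost = None, float("inf")
    for s in sites:
        if s["load_watts"] + request_watts > s["capacity_watts"]:
            continue  # site is above its load threshold
        brown = max(0.0, s["load_watts"] + request_watts - s["renewable_watts"])
        cost = brown * s["brown_price"]  # renewable portion costs nothing
        if cost < best_cost:
            best, best_cost = s, cost
    return best

sites = [
    {"name": "A", "load_watts": 400, "capacity_watts": 1000,
     "renewable_watts": 800, "brown_price": 0.12},
    {"name": "B", "load_watts": 300, "capacity_watts": 1000,
     "renewable_watts": 100, "brown_price": 0.08},
]
print(pick_site(sites, 50)["name"])  # prints "A": fully covered by renewables
```

Site A wins here despite its higher brown price because its renewable supply covers the whole load, which is exactly the behavior the policy is after.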
The major issue with
these algorithms is how to optimize the prediction model. Abedinia, O., Amjady, N., & Ghadimi, N.
(2017) focus on predicting solar energy availability with a neural network
approach. They start by identifying the
output of their algorithm, which is the predicted generation level of the
photovoltaic cells. The candidate inputs
include solar radiation, temperature and photovoltaic (PV) generation each hour
for the last 24 hours. This is too many attributes for the neural network, so they use a two-stage feature-selection method focused on removing redundant and irrelevant inputs. A cascade of three neural networks then processes the inputs into the output. The first neural network extracts a mapping
function from the inputs, giving the resulting weights and output variable
forecast as input to the second neural network. The second neural network does
basically the same thing as the first, but with more precision, as its input is
more precise. The third neural network is similar. The authors note that the cascade could be continued, but computational resources are limited and the differences being detected narrow at each stage, so they stopped at three networks. These networks need to be trained, and the study used the Levenberg-Marquardt learning algorithm. Shark Smell Optimization (SSO) is the novel algorithm these authors introduce, based on how sharks find their prey: the algorithm seeks increasingly dense odors, in this case increasingly consistent forecasts. A major weakness of neural networks is that training can get trapped at local minima, and SSO is designed to avoid that. The end result of the cascade is a significant reduction in error (normalized mean absolute percentage error and normalized root mean square error) compared with nine other prediction methods.
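The cascade structure itself (though not the paper's Levenberg-Marquardt training or SSO tuning, which this sketch does not attempt) can be illustrated on fabricated data: each later stage receives the previous stage's forecast as an additional input.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
# Fabricated stand-ins for the selected inputs (e.g. recent radiation,
# temperature, and PV readings) and the next-hour PV output target.
X = rng.normal(size=(400, 6))
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=400)

# Stage 1 learns a mapping from the inputs to the target.
stage1 = MLPRegressor(hidden_layer_sizes=(16,), max_iter=3000,
                      random_state=0).fit(X, y)
f1 = stage1.predict(X)

# Stage 2 repeats the task with stage 1's forecast as an extra, more
# informative input; further stages would chain on in the same way.
stage2 = MLPRegressor(hidden_layer_sizes=(16,), max_iter=3000,
                      random_state=0).fit(np.column_stack([X, f1]), y)
f2 = stage2.predict(np.column_stack([X, f1]))
print(float(np.mean(np.abs(y - f2))))  # cascade's mean absolute error
```

The refinement comes from later stages fitting on inputs that already embed a forecast, which is the idea the paper pushes to three stages.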
Cheng, Liu, Bourgeois, Wu, and Haupt (2017) studied wind speed prediction and found that the mean absolute error of 0-to-3-hour forecasts could be reduced by 30-40% by integrating anemometer data from the wind turbines into the prediction model. Wind turbines typically already include anemometers so they can adapt to conditions, such as shutting down when the wind speed is too high. The researchers started with a numerical weather prediction model, the Real-Time Four Dimensional Data Assimilation System (NCAR-ATEC RTFDDA or RTFDDA), and added the current wind speed and a calculated direction to it. This was a short (6-day) study, and it needs expansion and further validation.
While the studies cited here are not definitive demonstrations of exactly how to build new systems that balance data center loads to use renewable power whenever possible, together they show progress toward that goal. Taken together, they suggest that we can save significant energy and reduce costs by optimizing the use of servers in data centers with available renewable energy on a 10-to-15-minute cycle.
Future Predictions
We will need to keep getting smarter in order to manage a grid with renewable, non-dispatchable energy and to reduce the cost of data center computation. The technology and processes that allow work to be moved between and within data centers are well developed but can be continuously tuned and improved.
More data center operators are building renewable energy into their
plans for future implementations. The
grid is also getting more renewable energy content all the time. Future data center servers will continue to
do more computation with less power.
Overall, of course, data centers will probably continue to use more
energy, as we become more dependent on cloud services. These optimization techniques are unlikely to
improve faster than the growth of computational requirements.
Bibliography
Abedinia, O., Amjady, N., & Ghadimi, N. (2017). Solar energy
forecasting based on hybrid neural network and improved metaheuristic
algorithm. Computational Intelligence, 34(1), 241-260. doi:10.1111/coin.12145
Cheng, W. Y., Liu, Y., Bourgeois, A. J., Wu, Y., & Haupt, S.
E. (2017). Short-term wind forecast of a data assimilation/weather forecasting
system with wind turbine anemometer measurement assimilation. Renewable Energy,
107, 340-351. doi:10.1016/j.renene.2017.02.014
E. Source
Companies LLC. (2016, January 21). Managing Energy Costs in Data Centers. Retrieved
April 17, 2018, from https://ugi.bizenergyadvisor.com/data-centers
Google, Inc. (n.d.). Renewable energy – Data Centers – Google.
Retrieved April 16, 2018, from
https://www.google.com/about/datacenters/renewable/
Khosravi, A., & Buyya, R. (2017). Short-Term Prediction
Model to Maximize Renewable Energy Usage in Cloud Data Centers. Sustainable
Cloud and Energy Services, 203-218. doi:10.1007/978-3-319-62238-5_8
Lawrence
Berkeley National Laboratory. (2016, June). United States Data Center Energy
Usage Report. Retrieved April 16, 2018, from https://eta.lbl.gov/publications/united-states-data-center-energy
Sverdlik, Y. (2013, March 21). Apple reaches 100% renewable
energy across all data centers. Retrieved April 16, 2018, from
http://www.datacenterdynamics.com/content-tracks/design-build/apple-reaches-100-renewable-energy-across-all-data-centers/74708.fullarticle
Sverdlik,
Y. (2016, June 27). Here's How Much Energy All US Data Centers Consume.
Retrieved April 16, 2018, from http://www.datacenterknowledge.com/archives/2016/06/27/heres-how-much-energy-all-us-data-centers-consume
Toosi, A. N., Qu, C., Assunção, M. D., & Buyya, R. (2017).
Renewable-aware geographical load balancing of web applications for sustainable
data centers. Journal of Network and Computer Applications, 83, 155-168. doi:10.1016/j.jnca.2017.01.036