Energies, Vol. 19, Pages 1046: Reinforcement Learning Methods for the Stochastic Optimal Control of an Industrial Power-to-Heat System

Energies doi: 10.3390/en19041046

Authors:
Eric Pilling
Martin Bähr
Ralf Wunderlich

The optimal control of sustainable energy supply systems, including renewable energies and energy storage, plays a central role in the decarbonization of industrial systems. However, renewable sources make energy generation fluctuate, so these complex systems need a suitable control strategy to ensure a reliable energy supply. In this paper, we consider an electrified power-to-heat system which is designed to supply heat in the form of superheated steam for industrial processes. The system consists of a high-temperature heat pump for heat supply, a wind turbine for power generation, a sensible thermal energy storage for storing excess heat, and a steam generator for providing steam. If the system’s energy demand cannot be covered by electricity from the wind turbine, additional electricity must be purchased from the power grid. For this system, we investigate the cost-optimal operation, aiming to minimize the cost of grid electricity through a control that depends on the available wind power and the amount of stored thermal energy. This is a decision-making problem under uncertainty regarding the future prices for electricity from the grid and the future generation of wind power. The resulting stochastic optimal control problem is treated as a finite-horizon Markov decision process for a multi-dimensional controlled state process. We first consider the classical backward recursion technique for solving the associated dynamic programming equation for the value function and compute the optimal decision rule. Since that approach suffers from the curse of dimensionality, we also apply reinforcement learning techniques, namely Q-learning, which can provide a good approximate solution to the optimization problem within a reasonable time.
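The two solution techniques named in the abstract can be illustrated on a deliberately small toy model. The following Python sketch is not the authors' model: the discrete storage levels, the two-state wind chain, the deterministic price path, the unit conversion between electricity and heat, and all numerical values are assumptions chosen only to keep the example short. It shows backward recursion for a finite-horizon Markov decision process and a simple tabular Q-learning variant that learns from simulated transitions alone.

```python
import numpy as np

# Toy model; every number below is an illustrative assumption, not from the paper.
T = 24                                  # hourly decision steps (finite horizon)
S_MAX = 4                               # storage capacity in heat units
ACTIONS = np.array([0, 1, 2])           # heat-pump output per step (value == index)
DEMAND = 1                              # constant heat demand per step
WIND_POWER = np.array([0, 2])           # electricity in wind state 0 (calm) / 1 (windy)
P_WIND = np.array([[0.8, 0.2],          # Markov transition matrix of the wind state
                   [0.3, 0.7]])
PRICE = 30.0 + 10.0 * np.sin(2 * np.pi * np.arange(T) / T)  # assumed price path


def step_cost(t, w, a):
    """Grid electricity cost: only the part not covered by wind is bought."""
    return PRICE[t] * max(a - WIND_POWER[w], 0)


def feasible(s, a):
    """An action is feasible if the storage level stays within [0, S_MAX]."""
    return 0 <= s + a - DEMAND <= S_MAX


def backward_recursion():
    """Solve the dynamic programming equation backward in time."""
    V = np.zeros((T + 1, S_MAX + 1, 2))             # terminal value V[T] = 0
    policy = np.zeros((T, S_MAX + 1, 2), dtype=int)
    for t in range(T - 1, -1, -1):
        for s in range(S_MAX + 1):
            for w in range(2):
                best_q, best_a = np.inf, -1
                for a in ACTIONS:
                    if not feasible(s, a):
                        continue
                    s_next = s + a - DEMAND
                    # Running cost plus expected cost-to-go over the next wind state.
                    q = step_cost(t, w, a) + P_WIND[w] @ V[t + 1, s_next]
                    if q < best_q:
                        best_q, best_a = q, a
                V[t, s, w], policy[t, s, w] = best_q, best_a
    return V, policy


def q_learning(episodes=5000, eps=0.2, seed=0):
    """Tabular finite-horizon Q-learning using sampled transitions only."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((T + 1, S_MAX + 1, 2, len(ACTIONS)))
    # Infeasible actions get infinite cost so that min/argmin never selects them.
    for s in range(S_MAX + 1):
        for a in ACTIONS:
            if not feasible(s, a):
                Q[:T, s, :, a] = np.inf
    visits = np.zeros_like(Q)
    for _ in range(episodes):
        s, w = int(rng.integers(S_MAX + 1)), int(rng.integers(2))  # random start
        for t in range(T):
            allowed = [int(a) for a in ACTIONS if feasible(s, a)]
            if rng.random() < eps:
                a = allowed[rng.integers(len(allowed))]   # explore
            else:
                a = int(np.argmin(Q[t, s, w]))            # exploit
            s_next = s + a - DEMAND
            w_next = int(rng.random() < P_WIND[w, 1])     # sample next wind state
            target = step_cost(t, w, a) + Q[t + 1, s_next, w_next].min()
            visits[t, s, w, a] += 1
            # Sample-average step size 1 / (number of visits).
            Q[t, s, w, a] += (target - Q[t, s, w, a]) / visits[t, s, w, a]
            s, w = s_next, w_next
    return Q
```

In this toy setting, `backward_recursion()` returns the exact value function and decision rule, and `q_learning()` returns a table whose minimum over actions, `Q[t, s, w].min()`, estimates the same value. The state space here has only 10 states per time step, so the exact recursion is cheap. The curse of dimensionality mentioned in the abstract arises once further state variables such as a stochastic price or finer storage grids are added, which is where sampling-based methods become attractive.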
