Efficient optimal power flow learning: A deep reinforcement learning with physics-driven critic model

Impact Factor 5.0, CAS Zone 2 (Engineering & Technology), JCR Q1 (Engineering, Electrical & Electronic)

International Journal of Electrical Power & Energy Systems, Pub Date: 2025-03-27, DOI: 10.1016/j.ijepes.2025.110621

Ahmed Sayed, Khaled Al Jaafari, Xian Zhang, Hatem Zeineldin, Ahmed Al-Durra, Guibin Wang, Ehab Elsaadany

Volume 167, Article 110621. Journal Article, not open access. Source: https://www.sciencedirect.com/science/article/pii/S0142061525001723

Citations: 0

Abstract

The transition to decarbonized energy systems presents significant operational challenges due to increased uncertainties and complex dynamics. Deep reinforcement learning (DRL) has emerged as a powerful tool for optimizing power system operations. However, most existing DRL approaches rely on approximated data-driven critic networks, requiring numerous risky interactions to explore the environment and often facing estimation errors. To address these limitations, this paper proposes an efficient DRL algorithm with a physics-driven critic model, namely a differentiable holomorphic embedding load flow model (D-HELM). This approach enables accurate policy gradient computation through a differentiable loss function based on system states of realized uncertainties, simplifying both the replay buffer and the learning process. By leveraging continuation power flow principles, D-HELM ensures operable, feasible solutions while accelerating gradient steps through simple matrix operations. Simulation results across various test systems demonstrate the computational superiority of the proposed approach, outperforming state-of-the-art DRL algorithms during training and model-based solvers in online operations. This work represents a potential breakthrough in real-time energy system operations, with extensions to security-constrained decision-making, voltage control, unit commitment, and multi-energy systems.
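The central idea in the abstract, replacing a learned critic with a differentiable physics model so that the policy gradient follows from the chain rule, can be illustrated with a deliberately tiny sketch. This is not the paper's D-HELM and not its algorithm: the lossless two-bus system, the quadratic costs `c1` and `c2`, the penalty weight `rho`, the linear policy `g1 = w * demand + b`, and both function names are assumptions made only for illustration.

```python
"""Toy sketch: training a dispatch policy through a differentiable physics loss.

Assumed setup (hypothetical, not from the paper): a lossless two-bus system.
Generator 1 sits at bus 1; generator 2 (slack) and the load sit at bus 2,
so the line flow equals the output of generator 1.
"""


def physics_loss_and_grad(g1, demand, c1=1.0, c2=2.0, line_limit=10.0, rho=5.0):
    """Return (loss, d loss / d g1) from the closed-form physics model."""
    g2 = demand - g1                      # power balance fixes the slack unit
    overload = max(0.0, g1 - line_limit)  # line flow equals g1 in this toy
    loss = c1 * g1 ** 2 + c2 * g2 ** 2 + rho * overload ** 2
    # Analytic derivative of the loss with respect to the action g1.
    grad = 2.0 * c1 * g1 - 2.0 * c2 * g2 + 2.0 * rho * overload
    return loss, grad


def train_policy(demands, steps=2000, lr=0.1):
    """Fit the linear policy g1 = w * demand + b by gradient descent.

    No learned critic and no stored rewards are used: each update only needs
    the sampled demand realizations and the physics gradient above.
    """
    w, b = 0.0, 0.0
    n = len(demands)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for d in demands:
            g1 = w * d + b                       # actor: demand -> setpoint
            _, dl_dg1 = physics_loss_and_grad(g1, d)
            grad_w += dl_dg1 * d                 # chain rule: d g1 / d w = d
            grad_b += dl_dg1                     # chain rule: d g1 / d b = 1
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


if __name__ == "__main__":
    w, b = train_policy([0.5, 0.75, 1.0, 1.25, 1.5])
    print(f"learned policy: g1 = {w:.4f} * demand + {b:.4f}")
```

With the default costs and a non-binding line limit, the cost-minimizing dispatch is `g1 = c2 / (c1 + c2) * demand`, so the learned slope should approach 2/3 and the intercept should approach 0. A tighter `line_limit` or larger `rho` would activate the penalty term and would likely need a smaller `lr`.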

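Source journal

International Journal of Electrical Power & Energy Systems (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 12.10

Self-citation rate: 17.30%

Articles published: 1022

Review time: 51 days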

About the journal: The journal covers theoretical developments in electrical power and energy systems and their applications. The coverage embraces: generation and network planning; reliability; long- and short-term operation; expert systems; neural networks; object-oriented systems; system control centres; database and information systems; stock and parameter estimation; system security and adequacy; network theory, modelling and computation; small and large system dynamics; dynamic model identification; on-line control including load and switching control; protection; distribution systems; energy economics; impact of non-conventional systems; and man-machine interfaces. As well as original research papers, the journal publishes short contributions, book reviews and conference reports. All papers are peer-reviewed by at least two referees.