A robust safe reinforcement learning-based operation method for hybrid electric-hydrogen energy system risk-based dispatch considering dynamic efficiency characteristics of electrolysers

Impact Factor: 9.0 | CAS Region 1 (Engineering & Technology) | Q1 ENERGY & FUELS
Jianbing Feng, Zhouyang Ren, Wenyuan Li
DOI: 10.1016/j.renene.2025.123761
Journal: Renewable Energy, Volume 254, Article 123761
Published: 2025-06-14
Citations: 0

Abstract

Hybrid electric-hydrogen energy systems hold transformative potential in achieving significant green energy transitions by leveraging complementary storage and flexibility. To safeguard operation against the variability of large-scale renewable generation, this paper formulates a risk-based dispatch for such systems that explicitly models the dynamic efficiency of electrolyzers. We propose a robust Soft Actor-Critic algorithm grounded in deep reinforcement learning to solve the resulting nonconvex, nonlinear, stochastic scheduling problem online, without resorting to simplifying approximations. A robust constrained Markov decision process framework is developed, which interprets constraint violations as an exploratory cost and uses the conditional value at risk of that cost to enforce a risk-averse policy. A novel second-order Bellman operator efficiently estimates this risk metric, while a primal-dual optimization scheme ensures maximum-entropy learning under safety constraints. Case studies on modified IEEE-118 and South Carolina 500-bus systems demonstrate that our approach converges 35.5 % faster and maintains superior constraint satisfaction compared to state-of-the-art deep reinforcement learning methods. Against traditional optimization-based methods, it reduces expected overloads by 21.9 %, peak overloads by 43.8 %, and improves overall computational efficiency by 99.994 %.
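The risk-averse policy described above is driven by the conditional value at risk (CVaR) of the constraint-violation cost. As a rough illustration of that risk measure only (the function name and the plain empirical estimator are assumptions here; the paper estimates it with a second-order Bellman operator), a minimal sketch:

```python
# Hedged sketch: empirical VaR/CVaR of an episodic constraint-violation
# cost, the risk measure used to enforce a risk-averse dispatch policy.
# Names and the sample-based estimator are illustrative, not the authors'
# implementation.

def var_cvar(costs, alpha=0.9):
    """Empirical value-at-risk and conditional value-at-risk.

    alpha is the confidence level: CVaR_alpha is the mean cost over the
    worst (1 - alpha) tail of the cost distribution.
    """
    xs = sorted(costs)
    n = len(xs)
    k = min(int(alpha * n), n - 1)  # index of the alpha-quantile (VaR)
    var = xs[k]
    tail = xs[k:]                   # worst (1 - alpha) fraction of samples
    cvar = sum(tail) / len(tail)
    return var, cvar

# e.g. costs 1..100 at alpha = 0.9: VaR is the 90th-percentile cost,
# CVaR the average of the worst ten samples
costs = list(range(1, 101))
var, cvar = var_cvar(costs, alpha=0.9)
```

Penalizing CVaR rather than the expected cost is what makes the learned policy averse to rare but severe violations, such as the peak overloads reported in the case studies.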
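The abstract's "primal-dual optimization scheme ensures maximum-entropy learning under safety constraints" corresponds, in generic constrained-RL terms, to ascending a Lagrange multiplier on the risk constraint while the actor maximizes the penalized objective. A toy sketch of the dual step, under assumed names and a simple projected-ascent rule that is not the paper's exact scheme:

```python
# Hedged sketch of a primal-dual safety update: the multiplier lam rises
# while the estimated risk (e.g. CVaR of constraint cost) exceeds a budget
# d, and decays toward zero once the policy is safe. The actor would then
# maximize reward - lam * cost plus an entropy bonus. The update rule and
# constants are placeholder assumptions.

def dual_update(lam, risk_estimate, budget, lr=0.05):
    """One projected gradient-ascent step on the dual variable."""
    return max(0.0, lam + lr * (risk_estimate - budget))

# toy trajectory: risk starts above the budget, then falls below it,
# so the multiplier first grows and is eventually projected back to zero
lam = 0.0
for risk in [1.5, 1.2, 0.8, 0.4]:
    lam = dual_update(lam, risk, budget=1.0)
```

The projection `max(0.0, ...)` keeps the multiplier nonnegative, so the safety penalty vanishes once the constraint is comfortably satisfied.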
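The "dynamic efficiency characteristics of electrolysers" in the title is what makes the dispatch problem nonconvex: conversion efficiency varies with part-load rather than being a constant. Purely as an illustration (the quadratic shape and its coefficients are placeholders, not the paper's model; only the hydrogen higher heating value of about 39.4 kWh/kg is a physical constant):

```python
# Illustrative only: a load-dependent electrolyser efficiency curve and the
# resulting hydrogen output. Coefficients are placeholder assumptions.

def efficiency(load_frac):
    """Conversion efficiency as a function of part-load in (0, 1].

    Shaped so efficiency peaks at mid load and drops toward the extremes,
    which is the kind of nonlinearity that breaks convex dispatch models.
    """
    return 0.60 + 0.30 * load_frac - 0.30 * load_frac ** 2

def hydrogen_rate(power_mw, rated_mw, hhv_mwh_per_kg=0.0394):
    """Hydrogen production rate (kg/h) for a given electrical input."""
    eta = efficiency(power_mw / rated_mw)
    return power_mw * eta / hhv_mwh_per_kg
```

Because `efficiency` multiplies the decision variable `power_mw`, the hydrogen output is a nonconvex function of dispatch, which is why the paper solves the problem with reinforcement learning instead of relying on linearized approximations.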
Source journal: Renewable Energy (Engineering & Technology - Energy & Fuels)
CiteScore: 18.40
Self-citation rate: 9.20%
Annual publications: 1955
Review time: 6.6 months
Journal description: Renewable Energy journal is dedicated to advancing knowledge and disseminating insights on various topics and technologies within renewable energy systems and components. Our mission is to support researchers, engineers, economists, manufacturers, NGOs, associations, and societies in staying updated on new developments in their respective fields and applying alternative energy solutions to current practices. As an international, multidisciplinary journal in renewable energy engineering and research, we strive to be a premier peer-reviewed platform and a trusted source of original research and reviews in the field of renewable energy.