A robust safe reinforcement learning-based operation method for hybrid electric-hydrogen energy system risk-based dispatch considering dynamic efficiency characteristics of electrolysers

Impact Factor: 9.0 | CAS Region 1 (Engineering & Technology) | Q1 ENERGY & FUELS
Jianbing Feng, Zhouyang Ren, Wenyuan Li
DOI: 10.1016/j.renene.2025.123761
Journal: Renewable Energy, Volume 254, Article 123761
Published: 2025-06-14
Citations: 0

Abstract

Hybrid electric-hydrogen energy systems hold transformative potential in achieving significant green energy transitions by leveraging complementary storage and flexibility. To safeguard operation against the variability of large-scale renewable generation, this paper formulates a risk-based dispatch for such systems that explicitly models the dynamic efficiency of electrolyzers. We propose a robust Soft Actor-Critic algorithm grounded in deep reinforcement learning to solve the resulting nonconvex, nonlinear, stochastic scheduling problem online, without resorting to simplifying approximations. A robust constrained Markov decision process framework is developed, which interprets constraint violations as an exploratory cost and uses the conditional value at risk of that cost to enforce a risk-averse policy. A novel second-order Bellman operator efficiently estimates this risk metric, while a primal-dual optimization scheme ensures maximum-entropy learning under safety constraints. Case studies on modified IEEE-118 and South Carolina 500-bus systems demonstrate that our approach converges 35.5 % faster and maintains superior constraint satisfaction compared to state-of-the-art deep reinforcement learning methods. Against traditional optimization-based methods, it reduces expected overloads by 21.9 %, peak overloads by 43.8 %, and improves overall computational efficiency by 99.994 %.
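The risk-averse policy described above is driven by the conditional value at risk (CVaR) of the constraint-violation cost. As a rough illustration of that risk measure only (the function name and the plain empirical estimator are assumptions here; the paper estimates it with a second-order Bellman operator), a minimal sketch:

```python
# Hedged sketch: empirical VaR/CVaR of an episodic constraint-violation
# cost, the risk measure used to enforce a risk-averse dispatch policy.
# Names and the sample-based estimator are illustrative, not the authors'
# implementation.

def var_cvar(costs, alpha=0.9):
    """Empirical value-at-risk and conditional value-at-risk.

    alpha is the confidence level: CVaR_alpha is the mean cost over the
    worst (1 - alpha) tail of the cost distribution.
    """
    xs = sorted(costs)
    n = len(xs)
    k = min(int(alpha * n), n - 1)  # index of the alpha-quantile (VaR)
    var = xs[k]
    tail = xs[k:]                   # worst (1 - alpha) fraction of samples
    cvar = sum(tail) / len(tail)
    return var, cvar

# e.g. costs 1..100 at alpha = 0.9: VaR is the 90th-percentile cost,
# CVaR the average of the worst ten samples
costs = list(range(1, 101))
var, cvar = var_cvar(costs, alpha=0.9)
```

Penalizing CVaR rather than the expected cost is what makes the learned policy averse to rare but severe violations, such as the peak overloads reported in the case studies.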
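The abstract's "primal-dual optimization scheme ensures maximum-entropy learning under safety constraints" corresponds, in generic constrained-RL terms, to ascending a Lagrange multiplier on the risk constraint while the actor maximizes the penalized objective. A toy sketch of the dual step, under assumed names and a simple projected-ascent rule that is not the paper's exact scheme:

```python
# Hedged sketch of a primal-dual safety update: the multiplier lam rises
# while the estimated risk (e.g. CVaR of constraint cost) exceeds a budget
# d, and decays toward zero once the policy is safe. The actor would then
# maximize reward - lam * cost plus an entropy bonus. The update rule and
# constants are placeholder assumptions.

def dual_update(lam, risk_estimate, budget, lr=0.05):
    """One projected gradient-ascent step on the dual variable."""
    return max(0.0, lam + lr * (risk_estimate - budget))

# toy trajectory: risk starts above the budget, then falls below it,
# so the multiplier first grows and is eventually projected back to zero
lam = 0.0
for risk in [1.5, 1.2, 0.8, 0.4]:
    lam = dual_update(lam, risk, budget=1.0)
```

The projection `max(0.0, ...)` keeps the multiplier nonnegative, so the safety penalty vanishes once the constraint is comfortably satisfied.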
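The "dynamic efficiency characteristics of electrolysers" in the title is what makes the dispatch problem nonconvex: conversion efficiency varies with part-load rather than being a constant. Purely as an illustration (the quadratic shape and its coefficients are placeholders, not the paper's model; only the hydrogen higher heating value of about 39.4 kWh/kg is a physical constant):

```python
# Illustrative only: a load-dependent electrolyser efficiency curve and the
# resulting hydrogen output. Coefficients are placeholder assumptions.

def efficiency(load_frac):
    """Conversion efficiency as a function of part-load in (0, 1].

    Shaped so efficiency peaks at mid load and drops toward the extremes,
    which is the kind of nonlinearity that breaks convex dispatch models.
    """
    return 0.60 + 0.30 * load_frac - 0.30 * load_frac ** 2

def hydrogen_rate(power_mw, rated_mw, hhv_mwh_per_kg=0.0394):
    """Hydrogen production rate (kg/h) for a given electrical input."""
    eta = efficiency(power_mw / rated_mw)
    return power_mw * eta / hhv_mwh_per_kg
```

Because `efficiency` multiplies the decision variable `power_mw`, the hydrogen output is a nonconvex function of dispatch, which is why the paper solves the problem with reinforcement learning instead of relying on linearized approximations.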
Source journal: Renewable Energy (Engineering & Technology - Energy & Fuels)
CiteScore: 18.40
Self-citation rate: 9.20%
Annual publications: 1955
Review time: 6.6 months
Journal description: Renewable Energy journal is dedicated to advancing knowledge and disseminating insights on various topics and technologies within renewable energy systems and components. Our mission is to support researchers, engineers, economists, manufacturers, NGOs, associations, and societies in staying updated on new developments in their respective fields and applying alternative energy solutions to current practices. As an international, multidisciplinary journal in renewable energy engineering and research, we strive to be a premier peer-reviewed platform and a trusted source of original research and reviews in the field of renewable energy.