Comprehensive assessment of deep reinforcement learning approaches for economic dispatch in nuclear-driven microgrids

IF 4.9 | Zone 3, Computer Science | JCR Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Computers & Electrical Engineering | Pub Date: 2025-06-23 | DOI: 10.1016/j.compeleceng.2025.110528

Athanasios Ioannis Arvanitidis, Paul Talbot, Nikolaos Gatsis, Miltiadis Alamaniotis

Volume 126, Article 110528. Source: https://www.sciencedirect.com/science/article/pii/S0045790625004719

Citations: 0

Abstract



As the electrical grid integrates more variable renewable energy sources such as wind and solar, distributed and flexible systems that can absorb this added variability become critical. Nuclear-driven microgrids are a promising solution: they offer stable generation that complements intermittent renewables and supports grid reliability and operating efficiency. This paper proposes a recurrent deep reinforcement learning framework for optimal economic dispatch in a nuclear-powered microgrid that integrates renewable energy sources, small modular reactors, battery storage systems, and balance-of-plant dynamics. A three-agent control architecture is developed in which demand and renewable energy agents act as forecasters and a reinforcement learning-based dispatch agent performs real-time energy allocation. A nonlinear programming formulation is first used to generate an optimal baseline for benchmarking. The proposed dispatch controller, based on Proximal Policy Optimization enhanced with Long Short-Term Memory networks, exploits temporal correlations in the input time series to improve policy robustness under uncertainty. Comparative analysis against established deep reinforcement learning methods, including Proximal Policy Optimization with a feedforward architecture, Soft Actor-Critic, and Twin Delayed Deep Deterministic Policy Gradient, demonstrates the superior performance of the proposed controller. Numerical results indicate that it achieves a 0.39% cost reduction relative to the nonlinear programming benchmark and outperforms the other learning-based methods by generating additional revenue of up to 0.35%. All reinforcement learning controllers compute dispatch actions in less than 0.3 s, a computational speedup of more than three orders of magnitude over the nonlinear programming baseline. These findings highlight the applicability of the reinforcement learning controllers to real-time operation and control of nuclear-integrated microgrids under volatile operating conditions.
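The abstract names Proximal Policy Optimization as the basis of the dispatch controller but gives no implementation details. As a rough illustration only, the sketch below shows the standard PPO clipped surrogate objective from the general reinforcement learning literature, evaluated per sample in plain Python. It is not the authors' code: the function names, the clipping value of 0.2 (a commonly used default), and the sample values are all assumptions made for this example.

```python
import math


def ppo_clipped_objective(log_prob_new, log_prob_old, advantage, clip_eps=0.2):
    """Per-sample PPO clipped surrogate objective (a quantity to maximize).

    Hypothetical helper for illustration; clip_eps=0.2 is an assumed default,
    not a value reported in the abstract.
    """
    # Probability ratio between the updated policy and the data-collecting policy.
    ratio = math.exp(log_prob_new - log_prob_old)
    # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


def mean_objective(samples, clip_eps=0.2):
    """Average the surrogate over (log_prob_new, log_prob_old, advantage) tuples."""
    total = sum(ppo_clipped_objective(n, o, a, clip_eps) for n, o, a in samples)
    return total / len(samples)


if __name__ == "__main__":
    # Made-up samples: an unchanged policy, and a ratio of 1.5 with a positive advantage.
    batch = [(0.0, 0.0, 2.0), (math.log(1.5), 0.0, 1.0)]
    print(mean_objective(batch))
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is the usual reason PPO is considered stable enough for control tasks. In a recurrent variant such as the one described above, the log-probabilities would come from a policy network with an LSTM layer; that part is outside this sketch.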


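Source journal

Computers & Electrical Engineering (Engineering and Technology: Engineering, Electrical & Electronic)

CiteScore: 9.20

Self-citation rate: 7.00%

Articles published: 661

Review time: 47 days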

Journal description: The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency. Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.