A Benders-Combined Safe Reinforcement Learning Framework for Risk-Averse Dispatch Considering Frequency Security Constraints
Authors: Jianbing Feng; Zhouyang Ren; Chen Li; Wenyuan Li
Journal: IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 72, no. 8, pp. 1063-1067 (impact factor 4.9; JCR Q2, Engineering, Electrical & Electronic)
Publication date: 2025-07-02
DOI: 10.1109/TCSII.2025.3584894
URL: https://ieeexplore.ieee.org/document/11062567/
Citations: 0
Abstract
Risk-averse dispatch considering frequency security constraints (FSC-RD) mitigates power supply-demand imbalance risks and frequency instability hazards. To effectively address the highly complex, multi-task coupled FSC-RD, this brief proposes a Benders-combined constrained Markov decision process (BC-CMDP) framework, which integrates logic-based Benders decomposition and safe reinforcement learning. A natural policy gradient primal-dual optimization is developed to handle the nonconvex policy optimization within the BC-CMDP. The global non-asymptotic convergence of the BC-CMDP framework is rigorously proven. The proposed framework is validated on the IEEE 118-bus system.
About the journal:
TCAS II publishes brief papers in the field specified by the theory, analysis, design, and practical implementations of circuits, and the application of circuit techniques to systems and to signal processing. Included is the whole spectrum from basic scientific theory to industrial applications. The field of interest covered includes:
Circuits: Analog, Digital and Mixed Signal Circuits and Systems
Nonlinear Circuits and Systems, Integrated Sensors, MEMS and Systems on Chip, Nanoscale Circuits and Systems, Optoelectronic Circuits and Systems, Power Electronics and Systems
Software for Analog-and-Logic Circuits and Systems
Control aspects of Circuits and Systems.