Stable Reinforcement Learning for Optimal Frequency Control: A Distributed Averaging-Based Integral Approach

IEEE Open Journal of Control Systems, vol. 1, pp. 194-209. Pub Date: 2022-08-29. DOI: 10.1109/OJCSYS.2022.3202202

Yan Jiang, Wenqi Cui, Baosen Zhang, Jorge Cortés


Citations: 6

Abstract

Frequency control plays a pivotal role in reliable power system operations. It is conventionally performed in a hierarchical way that first rapidly stabilizes the frequency deviations and then slowly recovers the nominal frequency. However, as the generation mix shifts from synchronous generators to renewable resources, power systems experience larger and faster frequency fluctuations due to the loss of inertia, which adversely impacts frequency stability. This has motivated active research in algorithms that jointly address frequency degradation and economic efficiency on a fast timescale, among which the distributed averaging-based integral (DAI) control is a notable one that sets controllable power injections directly proportional to the integrals of frequency deviation and economic inefficiency signals. Nevertheless, DAI does not typically consider the transient performance of the system following power disturbances and has been restricted to quadratic operational cost functions. This paper aims to leverage nonlinear optimal controllers to simultaneously achieve optimal transient frequency control and find the most economic power dispatch for frequency restoration. To this end, we integrate reinforcement learning (RL) into the classic DAI, which results in RL-DAI control. Specifically, we use RL to learn a neural network-based control policy mapping from the integral variables of DAI to the controllable power injections, which provides optimal transient frequency control, while DAI inherently ensures frequency restoration and optimal economic dispatch. Compared to existing methods, we provide provable guarantees on the stability of the learned controllers and extend the set of allowable cost functions to a much larger class. Simulations on the 39-bus New England system illustrate our results.
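The abstract describes classic DAI only in words. As a rough illustration, and not the paper's model or code, the sketch below simulates one common DAI formulation from the literature on a linearized three-bus swing model with quadratic costs C_i(u_i) = 0.5 * a_i * u_i^2. The network, every parameter value, the forward-Euler integration, and the names `simulate_dai` and `complete_graph_laplacian` are assumptions made for illustration. The RL-DAI controller itself, which replaces the proportional map with a learned neural network policy, is not implemented here.

```python
import numpy as np


def complete_graph_laplacian(n, weight):
    """Weighted Laplacian of a complete graph on n nodes (hypothetical test network)."""
    return weight * (n * np.eye(n) - np.ones((n, n)))


def simulate_dai(p, a, M, D, B, C, k, dt=0.002, steps=20000):
    """Forward-Euler simulation of linearized swing dynamics under classic DAI.

    Assumed model (illustrative, not taken from the paper):
        d(theta)/dt     = omega
        M * d(omega)/dt = -D * omega - B @ theta + p + u
        k * d(u)/dt     = -omega - a * (C @ (a * u))
    Here u is itself the integral state, i.e. the proportional gain is 1.
    """
    n = len(p)
    theta = np.zeros(n)  # voltage angle deviations
    omega = np.zeros(n)  # frequency deviations
    u = np.zeros(n)      # controllable power injections
    for _ in range(steps):
        marginal = a * u  # marginal costs of the quadratic cost functions
        d_theta = omega
        d_omega = (-D * omega - B @ theta + p + u) / M
        # Integrate frequency deviation plus the marginal-cost disagreement
        # (the "economic inefficiency" signal) exchanged with neighbors.
        d_u = (-omega - a * (C @ marginal)) / k
        theta = theta + dt * d_theta
        omega = omega + dt * d_omega
        u = u + dt * d_u
    return omega, u


n = 3
p = np.array([-0.6, 0.1, 0.2])  # assumed step power disturbance
a = np.array([1.0, 2.0, 4.0])   # assumed quadratic cost coefficients
omega, u = simulate_dai(
    p, a,
    M=np.ones(n), D=np.ones(n),
    B=complete_graph_laplacian(n, 5.0),  # linearized power-flow coupling
    C=complete_graph_laplacian(n, 1.0),  # communication graph
    k=np.ones(n),
)
print("frequency deviations:", omega)
print("power injections:", u)
print("marginal costs:", a * u)
```

In this setup the frequency deviations should decay toward zero while the marginal costs `a * u` approach a common value, which is the economic dispatch condition for quadratic costs. The transient along the way is not optimized in any sense, which is the gap the abstract says RL-DAI targets.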

