A TD3 Algorithm Based Reinforcement Learning Controller for DC-DC Switching Converters

Jian Ye, Huanyu Guo, Sen Mei, Yingjie Hu, Xinan Zhang
Published in: 2023 International Conference on Power Energy Systems and Applications (ICoPESA)
Publication date: 2023-02-24
DOI: https://doi.org/10.1109/ICoPESA56898.2023.10141314
Citations: 0

Abstract

Various linear and nonlinear controllers have been developed to improve the dynamic performance of DC-DC converters. Most of them can be designed only from a mathematical model of the converter, but the inherent nonlinear and time-varying characteristics of DC-DC switching converters make precise modeling difficult, so model-based control design is complex and its control performance is limited. To overcome this problem, this paper proposes a reinforcement learning (RL) controller based on the twin-delayed deep deterministic policy gradient (TD3) algorithm. The controller does not need a model of the switching converter. The converter is treated as a black box: a Markov decision process is constructed in which the agent interacts with the black-box converter inside the control system, the policy approximation function (a policy neural network) is trained through this interaction, and the trained policy outputs the optimal control action. The RL controller is built on an actor-critic architecture, and a TD3 algorithm with higher learning efficiency is proposed to improve its control performance. The proposed TD3-based RL controller is compared with a traditional PI controller. Simulation results show that the RL controller has better dynamic performance during converter start-up and under load step changes.
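
The two ideas in the abstract, a converter treated as a black-box Markov decision process and a TD3 learning target, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the buck topology, the component values, the observation (voltage error, output voltage), the reward (negative squared error), and the names `BuckConverterEnv` and `td3_target`. The target follows the standard TD3 formulation (clipped double-Q learning with target policy smoothing); network training and the delayed actor update are omitted.

```python
import random
from dataclasses import dataclass


@dataclass
class BuckConverterEnv:
    """Hypothetical averaged buck-converter model standing in for the black box.

    The agent never reads these parameters; it only sees observations and rewards.
    All values below are assumed for illustration.
    """
    v_in: float = 24.0    # input voltage [V]
    v_ref: float = 12.0   # output-voltage reference [V]
    L: float = 1e-3       # inductance [H]
    C: float = 100e-6     # capacitance [F]
    R: float = 10.0       # load resistance [ohm]
    dt: float = 1e-5      # integration step [s]
    i_L: float = 0.0      # inductor current state [A]
    v_out: float = 0.0    # output voltage state [V]

    def step(self, duty):
        """Apply a duty cycle (the action) and return (observation, reward)."""
        duty = min(max(duty, 0.0), 1.0)
        # Forward-Euler step of the averaged model; both derivatives use the old state.
        di = (duty * self.v_in - self.v_out) / self.L
        dv = (self.i_L - self.v_out / self.R) / self.C
        self.i_L += di * self.dt
        self.v_out += dv * self.dt
        error = self.v_ref - self.v_out
        return (error, self.v_out), -error * error


def td3_target(reward, next_obs, done, actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, a_low=0.0, a_high=1.0):
    """TD3 critic target: r + gamma * (1 - done) * min(Q1', Q2') at a smoothed action."""
    # Target policy smoothing: clipped Gaussian noise added to the target action.
    noise = random.gauss(0.0, noise_std)
    noise = min(max(noise, -noise_clip), noise_clip)
    next_action = min(max(actor_target(next_obs) + noise, a_low), a_high)
    # Clipped double-Q: take the smaller of the two target critics.
    q_min = min(critic1_target(next_obs, next_action),
                critic2_target(next_obs, next_action))
    return reward + gamma * (0.0 if done else 1.0) * q_min
```

In a full agent, `actor_target` and the two critic targets would be slowly updated copies of neural networks, and the actor would be updated less often than the critics (the "delayed" part of TD3). Here they are plain callables so the target arithmetic can be checked in isolation.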