Distributed Optimal Consensus Problem of Input Constrained Nonlinear Discrete-Time MASs: A Mode-Free Reinforcement Learning Approach

Impact factor 9.4 · Zone 1 (Computer Science) · JCR Q1 (Automation & Control Systems)
Shuxing Xuan; Hongjing Liang; Shihao Huang; Tieshan Li; Jiayue Sun
DOI: 10.1109/TCYB.2025.3562390
Journal: IEEE Transactions on Cybernetics, vol. 55, no. 6, pp. 2910-2923
Publication date: 2025-04-29
Publication type: Journal Article
URL: https://ieeexplore.ieee.org/document/10980068/
Citations: 0

Abstract

Distributed Optimal Consensus Problem of Input Constrained Nonlinear Discrete-Time MASs: A Mode-Free Reinforcement Learning Approach
In this article, a model-free reinforcement learning (RL) approach is proposed for solving the optimal consensus control problem of nonlinear discrete-time multiagent systems with input constraints. To address the challenge of solving the coupled discrete Hamilton-Jacobi-Bellman (HJB) equation, an RL approach based on the actor-critic framework is proposed for optimal consensus control. A well-defined cost function is designed, and the actor and critic networks are updated through online learning to obtain the optimal controllers. Furthermore, the actuator's performance is often limited by physical constraints. To address such actuator constraints, a gradual transition control (GTC) method is proposed, and update-free and update-weak policies are introduced to further optimize network performance. Additionally, in real-world distributed systems, the actor-critic networks deployed in each agent rely on data from neighboring agents, which raises the issue of distributed synchronization. To address this challenge, a synchronization blocking method is designed, which introduces additional control signals for each agent. Finally, two simulations under different scenarios are presented to verify the effectiveness of the proposed approach.
Source journal
IEEE Transactions on Cybernetics
Categories: Computer Science, Artificial Intelligence; Computer Science, Cybernetics
CiteScore: 25.40
Self-citation rate: 11.00%
Articles published: 1869
About the journal: The scope of the IEEE Transactions on Cybernetics includes computational approaches to the field of cybernetics. Specifically, the Transactions welcomes papers on communication and control across machines or among machines, humans, and organizations. The scope includes such areas as computational intelligence, computer vision, neural networks, genetic algorithms, machine learning, fuzzy systems, cognitive systems, decision making, and robotics, to the extent that they contribute to the theme of cybernetics or demonstrate an application of cybernetics principles.