Distributed Optimal Consensus Problem of Input Constrained Nonlinear Discrete-Time MASs: A Mode-Free Reinforcement Learning Approach

IF 9.4 | Zone 1, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Cybernetics | Pub Date: 2025-04-29 | DOI: 10.1109/TCYB.2025.3562390

Shuxing Xuan;Hongjing Liang;Shihao Huang;Tieshan Li;Jiayue Sun

Journal Article. IEEE Transactions on Cybernetics, vol. 55, no. 6, pp. 2910-2923. https://ieeexplore.ieee.org/document/10980068/

Citations: 0

Abstract

This article proposes a model-free reinforcement learning (RL) approach to the optimal consensus control problem for nonlinear discrete-time multiagent systems (MASs) with input constraints. To address the challenge of solving the coupled discrete Hamilton-Jacobi-Bellman (HJB) equation, the approach is built on an actor-critic framework: a well-defined cost function is designed, and the actor and critic networks are updated through online learning to obtain the optimal controllers. Actuator performance is often limited by physical constraints. To handle such constraints, a gradual transition control (GTC) method is proposed, and update-free and update-weak policies are introduced to further optimize network performance. In addition, in real-world distributed systems the actor-critic networks deployed on each agent rely on data from neighboring agents, which raises the issue of distributed synchronization. A synchronization blocking method is therefore designed, which provides additional control signals for each agent. Finally, two simulations under different scenarios verify the effectiveness of the proposed approach.
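The abstract does not give the system model, the cost function, or the details of GTC, so the following is only a minimal sketch of the problem setting, not the authors' method. It assumes scalar agent states, a hypothetical single-integrator update x_i(k+1) = x_i(k) + u_i(k), the commonly used local neighborhood error e_i = sum_j a_ij (x_i - x_j), and a tanh squashing as one simple way to keep each input inside [-u_max, u_max]. The fixed feedback gain stands in for a learned actor; no actor-critic training is performed. All function and parameter names are illustrative.

```python
import math


def local_consensus_errors(states, adjacency):
    """Local neighborhood error e_i = sum_j a_ij * (x_i - x_j) for scalar states."""
    n = len(states)
    return [
        sum(adjacency[i][j] * (states[i] - states[j]) for j in range(n))
        for i in range(n)
    ]


def bounded_control(raw_action, u_max):
    """Squash an unconstrained action into [-u_max, u_max] (assumed tanh bound)."""
    return u_max * math.tanh(raw_action)


def simulate(states, adjacency, gain=0.3, u_max=0.5, steps=200):
    """Run the assumed dynamics x_i(k+1) = x_i(k) + u_i(k).

    Each agent uses only its own neighborhood error, so the update is
    distributed. The fixed gain is a placeholder for a learned policy.
    """
    x = list(states)
    for _ in range(steps):
        errors = local_consensus_errors(x, adjacency)
        controls = [bounded_control(-gain * e, u_max) for e in errors]
        x = [xi + ui for xi, ui in zip(x, controls)]
    return x


if __name__ == "__main__":
    # Undirected path graph with three agents: 0 - 1 - 2.
    adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    final = simulate([2.0, 0.0, -1.0], adjacency)
    print("final states:", final)
    print("spread:", max(final) - min(final))
```

With a connected graph and a small enough gain, every update is a convex combination of an agent's own state and its neighbors' states, so the states contract toward a common value while each input stays within the bound. Because of the saturation, that common value is generally not the average of the initial states.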

Source journal

IEEE Transactions on Cybernetics (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, CYBERNETICS)

CiteScore: 25.40
Self-citation rate: 11.00%
Articles published: 1869

About the journal: The scope of the IEEE Transactions on Cybernetics includes computational approaches to the field of cybernetics. Specifically, the transactions welcomes papers on communication and control across machines or among machines, humans, and organizations. The scope includes such areas as computational intelligence, computer vision, neural networks, genetic algorithms, machine learning, fuzzy systems, cognitive systems, decision making, and robotics, to the extent that they contribute to the theme of cybernetics or demonstrate an application of cybernetics principles.