Prescribed-time reinforcement learning formation control of nonlinear MASs with an unknown dynamic leader

IF 3.4 | Zone 2, Mathematics | Q1 MATHEMATICS, APPLIED

Applied Mathematics and Computation Pub Date : 2025-09-07 DOI:10.1016/j.amc.2025.129711

Benxin Zhao , Yuan-Xin Li , Zhongsheng Hou

Applied Mathematics and Computation, Volume 510, Article 129711. Journal Article. https://www.sciencedirect.com/science/article/pii/S0096300325004370

Citations: 0

Abstract

This paper investigates the prescribed-time optimal formation control problem for nonlinear multi-agent systems (MASs) with an unknown dynamic leader. The agents must not only form a predefined formation pattern but also track the leader's trajectory within a prescribed time. A hierarchical control framework, consisting of a communication layer and a tracking control layer, is established to address this problem. In the communication layer, a distributed prescribed-time observer is designed to estimate the leader's information, driving the observation error to zero within a prescribed time. In particular, the leader's uncertainties are handled by constructing a novel adaptive law. Using the estimates, a novel transformation relationship and a prescribed-time adjustment function are constructed to guarantee that the formation tracking error converges to a predefined accuracy within a prescribed time. Subsequently, a reinforcement learning (RL) algorithm based on fuzzy logic systems (FLSs) is devised to optimize system performance. Based on Lyapunov stability theory, it is shown that the formation errors $\xi_{i,1} = x_{i,1} - \hat{y}_{0,i} - \eta_{i,d}$ remain confined within the interval $\left(\tan\left(-\frac{\pi}{2\mu_{2}}\right), \tan\left(\frac{\pi}{2\mu_{2}}\right)\right)$, while all signals of the closed-loop system are bounded. Finally, the advantages of the devised algorithm are demonstrated by a representative example.
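To make two ideas in the abstract concrete, here is a minimal Python sketch. It is not the authors' observer or controller. The first function simply evaluates the half-width of the error interval stated above, assuming μ₂ > 1 so that the bound is finite (the abstract does not state this condition). The second function illustrates the general prescribed-time idea with a generic scalar error dynamic, de/dt = -(k + c/(T - t)) e, whose time-varying gain forces the error to zero as t approaches T regardless of the initial error. The names `error_bound` and `prescribed_time_error`, and the parameters `k` and `c`, are hypothetical and chosen only for illustration.

```python
import math


def error_bound(mu2: float) -> float:
    """Half-width tan(pi / (2 * mu2)) of the symmetric interval in the abstract.

    Assumption (not stated in the abstract): mu2 > 1, so pi / (2 * mu2) < pi / 2
    and the tangent is finite.
    """
    if mu2 <= 1.0:
        raise ValueError("this sketch assumes mu2 > 1 so the bound is finite")
    return math.tan(math.pi / (2.0 * mu2))


def prescribed_time_error(e0: float, t: float, T: float,
                          k: float = 1.0, c: float = 2.0) -> float:
    """Closed-form solution of de/dt = -(k + c / (T - t)) * e on [0, T).

    Generic illustration only: the factor ((T - t) / T) ** c vanishes at t = T,
    so the settling time is T for every initial error e0.
    """
    if not 0.0 <= t < T:
        raise ValueError("t must lie in [0, T)")
    return e0 * math.exp(-k * t) * ((T - t) / T) ** c


if __name__ == "__main__":
    # Larger mu2 gives a tighter error interval.
    for mu2 in (2.0, 4.0, 10.0):
        print(f"mu2={mu2}: |xi| < {error_bound(mu2):.4f}")
    # The error shrinks toward zero as t approaches the prescribed time T.
    T = 2.0
    for t in (0.0, 1.0, 1.9, 1.999):
        print(f"t={t}: e={prescribed_time_error(5.0, t, T):.3e}")
```

Because the tangent is odd, the interval is symmetric about zero, and its half-width decreases monotonically as μ₂ grows, which is why μ₂ acts as an accuracy parameter in this reading.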


Source journal

Applied Mathematics and Computation (Mathematics: Applied Mathematics)

CiteScore: 7.90

Self-citation rate: 10.00%

Articles published: 755

Review time: 36 days

About the journal: Applied Mathematics and Computation addresses work at the interface between applied mathematics, numerical computation, and applications of systems-oriented ideas to the physical, biological, social, and behavioral sciences, and emphasizes papers of a computational nature focusing on new algorithms, their analysis, and numerical results. In addition to research papers, Applied Mathematics and Computation publishes review articles and single-topic issues.