Zhai Peiyu, Qin Kaiyu, Yue Jiangfeng, Lin Boxian, Li Weihao, Shi Mengji
Applied Intelligence, vol. 55, no. 15. Published 2025-10-10. DOI: 10.1007/s10489-025-06913-4
https://link.springer.com/article/10.1007/s10489-025-06913-4
Impact factor: 3.5. JCR: Q2 (Computer Science, Artificial Intelligence).
Reinforcement learning-based optimal bipartite formation tracking for uncertain networked agents via enhanced adaptive policy iteration
Optimal bipartite formation tracking of uncertain networked agent systems (NASs) is an active research topic with applications in many fields, and there is a pressing need for strategies that optimize system performance while ensuring efficiency and stability. To this end, this paper proposes a reinforcement learning-based optimal control scheme built on an Enhanced Adaptive Policy Iteration (EAPI) algorithm with an adaptive termination mechanism. The scheme enables follower agents to achieve bipartite formation tracking of the leader while optimizing the performance index. First, the optimal bipartite formation tracking control problem is defined, and the Bellman form of the optimal value function and control law is derived from the coupled Hamilton-Jacobi-Bellman (HJB) equations. Then, the EAPI algorithm with an adaptive termination mechanism is introduced; by setting a termination threshold, it avoids redundant iterations, reducing computational cost and running time without degrading control performance. The stability, convergence, and optimality of the EAPI algorithm are then analyzed, and the optimal control law is approximated and solved within the reinforcement learning framework. Finally, numerical results are presented to verify the effectiveness of the proposed EAPI-based optimal bipartite formation tracking scheme for NASs.
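The abstract does not give the details of EAPI, so the following is only a minimal sketch of the general idea of policy iteration with a threshold-based stopping rule. It is not the authors' algorithm. It assumes a single agent with known scalar linear dynamics and a quadratic cost, and the function name, parameters, and stopping rule are all hypothetical choices made for illustration.

```python
def policy_iteration_scalar_lqr(a, b, q, r, k0, threshold=1e-8, max_iters=100):
    """Policy iteration for x[t+1] = a*x[t] + b*u[t] with stage cost q*x^2 + r*u^2.

    The policy is the linear feedback u = -k*x and the value function is p*x^2.
    Illustrative assumption: iteration stops as soon as p changes by less than
    `threshold` between two successive policy evaluations.
    Returns (p, k, number_of_policy_evaluations).
    """
    if abs(a - b * k0) >= 1.0:
        raise ValueError("initial gain k0 must stabilize the closed loop")

    k = k0
    p_prev = None
    for iteration in range(1, max_iters + 1):
        # Policy evaluation: solve p = q + r*k^2 + (a - b*k)^2 * p for p.
        closed_loop = a - b * k
        p = (q + r * k * k) / (1.0 - closed_loop * closed_loop)

        # Threshold-based termination: stop once the value estimate settles.
        if p_prev is not None and abs(p - p_prev) < threshold:
            return p, k, iteration

        # Policy improvement: minimize r*u^2 + p*(a*x + b*u)^2 over u = -k*x.
        k = (a * b * p) / (r + b * b * p)
        p_prev = p

    return p, k, max_iters


if __name__ == "__main__":
    p, k, iters = policy_iteration_scalar_lqr(a=1.2, b=1.0, q=1.0, r=1.0, k0=1.0)
    print(f"value coefficient p={p:.6f}, gain k={k:.6f}, evaluations={iters}")
```

In this toy setting the threshold trades a small loss in accuracy for fewer policy evaluations, which is the kind of saving the abstract attributes to the adaptive termination mechanism. The multi-agent, uncertain, bipartite setting of the paper is not modeled here.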
About the journal:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.