An invulnerable leader–follower collision-free unmanned aerial vehicle flocking system with attention-based Multi-Agent Reinforcement Learning

Yunxiao Guo, Dan Xu, Chang Wang, Jinxi Li, Han Long

Engineering Applications of Artificial Intelligence, Volume 160, Article 111797. DOI: 10.1016/j.engappai.2025.111797. Published 2025-08-25.
Citations: 0
Abstract
Deep reinforcement learning has proved useful for the flocking control of Unmanned Aerial Vehicle (UAV) swarms with the leader–follower topology. However, it remains unclear how to fully utilize the spatial information among the followers to alleviate the problems of sparse reward and slow policy convergence. In this article, we propose a novel multi-agent reinforcement learning-based fixed-wing UAV flocking approach, named Attention-Based Cucker–Smale (ABCS) Flocking, that learns collision-free leader–follower flocking by exploiting the information shared among followers through an attention mechanism. Specifically, we design an explainable Cucker–Smale criterion-based flocking reward, named the ABCS reward, to associate the followers with high flocking efficiency. Then, a leader-guided attention mechanism is proposed that transfers the difference between leader and follower into attention weights, allowing each follower to selectively utilize the other followers' information. As a result, we prove that an optimal state can be achieved in which each follower keeps an optimal distance from the other followers when the ABCS reward is maximized. In addition, we prove that the ABCS reward is bounded, which can be used as an indicator of learning convergence. To improve invulnerability, we propose a leader selection method based on ABCS Flocking that effectively selects a new leader when the old leader is destroyed. Finally, we demonstrate the effectiveness of ABCS Flocking over the Multi-Agent Deep Deterministic Policy Gradient approach with various reward functions and various numbers of followers in obstacle and leader-destroyed scenarios. The code is published on GitHub at https://github.com/YunxiaoGuo/ABCS-Flocking.
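The abstract names two ingredients without giving their formulas: the classical Cucker–Smale interaction on which the ABCS reward is built, and a leader-guided attention weighting. The sketch below shows the *standard* Cucker–Smale velocity-alignment dynamics (a well-known model, not the paper's exact reward) plus a hypothetical attention weighting derived from leader–follower state differences; the constants `K`, `sigma`, `beta` and the softmax form of the attention are assumptions for illustration only.

```python
import numpy as np

def cucker_smale_weight(r, K=1.0, sigma=1.0, beta=0.5):
    # Classical Cucker-Smale communication weight psi(r):
    # alignment influence decays with inter-agent distance r.
    return K / (sigma**2 + r**2) ** beta

def flocking_step(pos, vel, dt=0.1):
    # One velocity-alignment step of the classical Cucker-Smale model:
    # each agent accelerates toward the psi-weighted mean of the others'
    # velocities. pos, vel: (n, d) arrays.
    n = len(pos)
    new_vel = vel.copy()
    for i in range(n):
        accel = np.zeros_like(vel[i])
        for j in range(n):
            if j != i:
                r = np.linalg.norm(pos[j] - pos[i])
                accel += cucker_smale_weight(r) * (vel[j] - vel[i])
        new_vel[i] = vel[i] + dt * accel / n
    return pos + dt * new_vel, new_vel

def leader_guided_attention(leader_state, follower_states):
    # Hypothetical sketch of the leader-guided idea: weight each follower
    # by its closeness to the leader's state via a softmax over negative
    # state differences. The paper's exact attention form is not given in
    # the abstract.
    diffs = np.linalg.norm(follower_states - leader_state, axis=1)
    logits = -diffs
    w = np.exp(logits - logits.max())  # numerically stable softmax
    return w / w.sum()
```

Under the classical result, with `beta <= 0.5` the velocities align unconditionally, so iterating `flocking_step` shrinks the spread of the agents' velocities, which is the behavior the ABCS reward is designed to encourage.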
Journal introduction:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.