An invulnerable leader–follower collision-free unmanned aerial vehicle flocking system with attention-based Multi-Agent Reinforcement Learning

IF 8; Zone 2 (Computer Science); JCR Q1 (AUTOMATION & CONTROL SYSTEMS)

Engineering Applications of Artificial Intelligence, Pub Date: 2025-08-25, DOI: 10.1016/j.engappai.2025.111797

Yunxiao Guo, Dan Xu, Chang Wang, Jinxi Li, Han Long

Volume 160, Article 111797. Full text: https://www.sciencedirect.com/science/article/pii/S0952197625017993

Citations: 0

Abstract

Deep reinforcement learning has proven useful for flocking control of Unmanned Aerial Vehicle (UAV) swarms with a leader–follower topology. However, it remains unclear how to fully exploit the spatial information among the followers to alleviate the problems of sparse reward and policy convergence. In this article, we propose a novel multi-agent reinforcement learning-based fixed-wing UAV flocking approach, named Attention Based Cucker–Smale (ABCS) Flocking, that learns collision-free leader–follower flocking by exploiting the information among followers through an attention mechanism. Specifically, we design an explainable flocking reward based on the Cucker–Smale criterion, named the ABCS reward, to associate the followers efficiently during flocking. We then propose a leader-guide attention mechanism that uses the difference between leader and follower as weights, so that each follower can selectively use the other followers’ information. We prove that when the ABCS reward is maximized, an optimal state is reached in which each follower keeps an optimal distance from the other followers. In addition, we prove that the ABCS reward is bounded, which can be used to indicate learning convergence. To improve invulnerability, we propose a leader selection method based on ABCS Flocking that can effectively select a new leader when the old leader is destroyed. Finally, we demonstrate the effectiveness of ABCS Flocking over the Multi-Agent Deep Deterministic Policy Gradient approach using various reward functions and various numbers of followers in obstacle and leader-destroyed scenarios. The code is published on GitHub: https://github.com/YunxiaoGuo/ABCS-Flocking.
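For readers unfamiliar with the Cucker–Smale model that the ABCS reward is named after, the following is a minimal sketch of the classical Cucker–Smale velocity-alignment dynamics, not the ABCS reward or the attention mechanism from the paper, whose exact definitions are not given in the abstract. The function names `cucker_smale_step` and `velocity_spread`, the explicit-Euler discretization, and the parameter values (`dt`, `K`, `sigma`, `beta`) are illustrative assumptions.

```python
import numpy as np


def cucker_smale_step(x, v, dt=0.1, K=1.0, sigma=1.0, beta=0.5):
    """One explicit-Euler step of the classical Cucker-Smale model.

    x, v: arrays of shape (N, d) holding agent positions and velocities.
    The communication weight psi(r) = K / (sigma^2 + r^2)^beta is the
    standard textbook choice; all parameter values here are assumptions.
    """
    diff_x = x[None, :, :] - x[:, None, :]   # diff_x[i, j] = x_j - x_i
    diff_v = v[None, :, :] - v[:, None, :]   # diff_v[i, j] = v_j - v_i
    dist2 = np.sum(diff_x ** 2, axis=-1)     # squared pairwise distances, (N, N)
    psi = K / (sigma ** 2 + dist2) ** beta   # symmetric communication weights
    # dv_i = (1/N) * sum_j psi_ij * (v_j - v_i); the j = i term is zero.
    dv = (psi[:, :, None] * diff_v).mean(axis=1)
    return x + dt * v, v + dt * dv


def velocity_spread(v):
    """Sum of squared deviations of the velocities from their mean."""
    return float(np.sum((v - v.mean(axis=0)) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 2))
    v = rng.normal(size=(5, 2))
    print("initial spread:", velocity_spread(v))
    for _ in range(200):
        x, v = cucker_smale_step(x, v)
    print("final spread:", velocity_spread(v))
```

In this classical model the velocity spread shrinks over time while the mean velocity is preserved, which is the alignment behavior that Cucker–Smale-style flocking criteria measure. How the paper turns such a criterion into a bounded reward is described in the full text and the linked repository.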




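Source journal

Engineering Applications of Artificial Intelligence (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 9.60

Self-citation rate: 10.00%

Articles published: 505

Review time: 68 days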

About the journal: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.