Hybrid Transformer Based Multi-Agent Reinforcement Learning for Multiple Unpiloted Aerial Vehicle Coordination in Air Corridors

IF 7.7 · Zone 2, Computer Science · JCR Q1: Computer Science, Information Systems

IEEE Transactions on Mobile Computing Pub Date : 2025-01-21 DOI:10.1109/TMC.2025.3532204

Liangkun Yu;Zhirun Li;Nirwan Ansari;Xiang Sun

Volume 24, Issue 6, pp. 5482-5495. Full text: https://ieeexplore.ieee.org/document/10848344/

Citations: 0

Abstract

Advanced Air Mobility (AAM) seeks to establish a next-generation air transportation system by leveraging autonomous unpiloted aerial vehicles (UAVs) to transport passengers and cargo between locations previously underserved or unserved by traditional aviation. Achieving AAM at scale requires overcoming significant challenges in airspace management, classification, and traffic control to safely accommodate the increasing volume of UAV operations. This paper presents a comprehensive design for air corridors to facilitate efficient aerial transport and formulates a multi-UAV coordination problem within these corridors. The objective is to enable each UAV to autonomously make control decisions based on local observations gathered from onboard sensors. This decentralized control approach is modeled as a multi-agent partially observable Markov decision process (POMDP), with the aim of minimizing UAV travel time while ensuring adherence to corridor boundaries and collision avoidance. To address the complexities posed by varying state dimensions and types, we propose a novel Hybrid Transformer-based Multi-agent Reinforcement Learning (HTransRL) architecture. HTransRL integrates a customized transformer model into an actor-critic network, effectively processing both sequential and non-sequential observed states of varying sizes while capturing their correlations. This enables safe and efficient UAV navigation. Simulation results show that in test environments similar to or simpler than the training scenarios, HTransRL achieves a successful arrival rate exceeding 90% even in the worst-case test scenarios. In test environments more complex than the training scenarios, HTransRL demonstrates superior scalability compared to two baseline methods, achieving higher arrival rates and comparable travel times.

Source journal

IEEE Transactions on Mobile Computing (Engineering & Technology: Telecommunications)

CiteScore

12.90

Self-citation rate

2.50%

Articles published

403

Review time

6.6 months

Journal description: IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing. This includes (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering aspects like mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.