Hybrid Transformer Based Multi-Agent Reinforcement Learning for Multiple Unpiloted Aerial Vehicle Coordination in Air Corridors
Authors: Liangkun Yu; Zhirun Li; Nirwan Ansari; Xiang Sun
Journal: IEEE Transactions on Mobile Computing, vol. 24, no. 6, pp. 5482-5495 (Journal Article)
Published: 2025-01-21
DOI: 10.1109/TMC.2025.3532204
URL: https://ieeexplore.ieee.org/document/10848344/
Impact factor: 7.7 (JCR Q1, Computer Science, Information Systems)
Citation count: 0
Abstract
Advanced Air Mobility (AAM) seeks to establish a next-generation air transportation system by leveraging autonomous unpiloted aerial vehicles (UAVs) to transport passengers and cargo between locations previously underserved or unserved by traditional aviation. Achieving AAM at scale requires overcoming significant challenges in airspace management, classification, and traffic control to safely accommodate the increasing volume of UAV operations. This paper presents a comprehensive design for air corridors to facilitate efficient aerial transport and formulates a multi-UAV coordination problem within these corridors. The objective is to enable each UAV to autonomously make control decisions based on local observations gathered from onboard sensors. This decentralized control approach is modeled as a multi-agent partially observable Markov decision process (POMDP), with the aim of minimizing UAV travel time while keeping each UAV within the corridor boundaries and avoiding collisions. To address the complexities posed by varying state dimensions and types, we propose a novel Hybrid Transformer-based Multi-agent Reinforcement Learning (HTransRL) architecture. HTransRL integrates a customized transformer model into an actor-critic network, effectively processing both sequential and non-sequential observed states of varying sizes while capturing their correlations. This enables safe and efficient UAV navigation. Simulation results show that in test environments similar to or simpler than the training scenarios, HTransRL achieves a successful arrival rate exceeding 90% even in the worst-case test scenarios. In test environments more complex than the training scenarios, HTransRL demonstrates superior scalability compared to two baseline methods, achieving higher arrival rates and comparable travel times.
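The abstract notes that each UAV observes a varying number of states yet must feed a fixed-size input to its actor-critic network. As a rough illustration of why attention suits this, the sketch below pools any number of neighbor observation vectors into one fixed-length context vector using single-head scaled dot-product attention. It is not the HTransRL architecture: the function name `attention_pool`, the vector encodings, and the omission of learned projections are all assumptions made for illustration.

```python
import math


def attention_pool(query, neighbors):
    """Pool a variable-length list of neighbor vectors into one fixed-size vector.

    query:     list[float] of length d (hypothetical encoding of the UAV's own state)
    neighbors: list of list[float], each of length d (hypothetical encodings of
               nearby UAVs seen by onboard sensors); may be empty.
    """
    d = len(query)
    if not neighbors:
        # Nothing observed: return a zero context so the output shape stays fixed.
        return [0.0] * d
    # Scaled dot-product score between the query and every neighbor.
    scores = [sum(q * k for q, k in zip(query, n)) / math.sqrt(d) for n in neighbors]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of neighbor vectors (keys double as values in this sketch).
    return [sum(w * n[i] for w, n in zip(weights, neighbors)) for i in range(d)]


own_state = [1.0, 0.0, 0.5]
ctx_two = attention_pool(own_state, [[0.2, 0.1, 0.0], [0.9, 0.0, 0.4]])
ctx_five = attention_pool(own_state, [[0.1 * k, 0.0, 0.2] for k in range(5)])
ctx_none = attention_pool(own_state, [])
# The context length is the same regardless of how many neighbors were observed.
print(len(ctx_two), len(ctx_five), len(ctx_none))
```

Whether two, five, or zero neighbors are observed, the context vector has the same length, so it could be concatenated with the UAV's own state before a policy or value head. A full transformer would add learned query, key, and value projections, multiple heads, and padding masks for batched inputs.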
About the journal:
IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing. This includes (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering aspects like mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.