Optimizing Multi-AAV Formation Cooperative Control Strategies With MCDDPG Approach
Jinsheng Xiao, Bolun Yan, Honggang Xie, Qiuze Yu, Linkun Li, Yuan-Fang Wang
IEEE Internet of Things Journal, vol. 12, no. 8, pp. 9775-9791. Journal Article. Published 2025-04-08.
DOI: 10.1109/JIOT.2024.3508820
URL: https://ieeexplore.ieee.org/document/10958186/
Citation count: 0
Abstract
In recent years, autonomous aerial vehicles (AAVs) have been deployed increasingly widely in the military, industrial, and civilian sectors, and multi-AAV formation cooperative control strategies have attracted significant research attention. However, current multiagent reinforcement learning algorithms often struggle with unguided exploration, which makes it difficult for agents to learn efficient action strategies for complex collaborative tasks. To address this issue, this article introduces a multicritic deep deterministic policy gradient (MCDDPG) algorithm, which adds a multicritic (MC) structure to the DDPG algorithm. The structure uses physical models to guide AAVs in tracking and obstacle avoidance, and deep learning models to facilitate cooperative coordination among AAVs. To address weight allocation among the Critic modules in the MC structure, a dynamic difficulty priority weight optimization algorithm is implemented, which enhances the algorithm's collaborative capabilities. To validate the collaborative planning capability of the proposed algorithm, a simulation scenario involving multicoupled tasks is designed in the multiagent particle environment (MPE). In this scenario, MCDDPG converges fastest and learns the best collaborative strategy among the compared state-of-the-art multiagent deep reinforcement learning (MADRL) algorithms.
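The abstract does not say how the Critic modules are weighted, so the following is only a minimal sketch of the general idea of combining several critics under difficulty-dependent weights. Everything in it is an assumption made for illustration: the function names `difficulty_weights` and `combined_q`, the softmax weighting rule, the use of the absolute critic value as a difficulty score, and the numbers themselves. It is not the paper's dynamic difficulty priority weight optimization algorithm.

```python
import math


def difficulty_weights(difficulties, temperature=1.0):
    """Softmax over per-critic difficulty scores (assumed rule):
    a critic whose task currently looks harder gets a larger weight."""
    peak = max(difficulties)  # subtract the max for numerical stability
    exps = [math.exp((d - peak) / temperature) for d in difficulties]
    total = sum(exps)
    return [e / total for e in exps]


def combined_q(critic_values, weights):
    """Weighted sum of the per-critic Q estimates for one state-action pair."""
    return sum(w * q for w, q in zip(weights, critic_values))


# Hypothetical Q estimates from three critics (tracking, obstacle
# avoidance, cooperation) for a single state-action pair.
critic_values = {"tracking": -0.2, "avoidance": -1.5, "cooperation": -0.7}

# Assumed difficulty proxy: the magnitude of each critic's (negative) value.
difficulties = [abs(v) for v in critic_values.values()]

weights = difficulty_weights(difficulties)
q_total = combined_q(critic_values.values(), weights)
```

In this sketch the weights always sum to one, so `q_total` is a convex combination of the individual critic values. In a DDPG-style update, a scalar of this kind would be what the actor is trained to maximize.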
About the journal:
The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impacts on sensor technologies, big data management, and future Internet design for applications like smart cities and smart homes. Fields of interest include IoT architecture, such as things-centric, data-centric, and service-oriented IoT architecture; IoT enabling technologies and systematic integration, such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds, such as IoT service middleware, IoT application programming interface (API), IoT application design, and IoT trials/experiments; and IoT standardization activities and technology development in different standard development organizations (SDOs) such as IEEE, IETF, ITU, 3GPP, and ETSI.