Optimizing UAV-UGV coalition operations: A hybrid clustering and multi-agent reinforcement learning approach for path planning in obstructed environment
IF 4.4 3区 计算机科学Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS
{"title":"Optimizing UAV-UGV coalition operations: A hybrid clustering and multi-agent reinforcement learning approach for path planning in obstructed environment","authors":"Shamyo Brotee , Farhan Kabir , Md. Abdur Razzaque , Palash Roy , Md. Mamun-Or-Rashid , Md. Rafiul Hassan , Mohammad Mehedi Hassan","doi":"10.1016/j.adhoc.2024.103519","DOIUrl":null,"url":null,"abstract":"<div><p>One of the most critical applications undertaken by Unmanned Aerial Vehicles (UAVs) and Unmanned Ground Vehicles (UGVs) is reaching predefined targets by following the most time-efficient routes while avoiding collisions. Unfortunately, UAVs are hampered by limited battery life, and UGVs face challenges in reachability due to obstacles and elevation variations, which is why a coalition of UAVs and UGVs can be highly effective. Existing literature primarily focuses on one-to-one coalitions, which constrains the efficiency of reaching targets. In this work, we introduce a novel approach for a UAV-UGV coalition with a variable number of vehicles, employing a modified mean-shift clustering algorithm (MEANCRFT) to segment targets into multiple zones. This approach of assigning targets to various circular zones based on density and range significantly reduces the time required to reach these targets. Moreover, introducing variability in the number of UAVs and UGVs in a coalition enhances task efficiency by enabling simultaneous multi-target engagement. In our approach, each vehicle of the coalitions is trained using two advanced deep reinforcement learning algorithms in two separate experiments, namely Multi-agent Deep Deterministic Policy Gradient (MADDPG) and Multi-agent Proximal Policy Optimization (MAPPO). The results of our experimental evaluation demonstrate that the proposed MEANCRFT-MADDPG method substantially surpasses current state-of-the-art techniques, nearly doubling efficiency in terms of target navigation time and task completion rate.</p></div>","PeriodicalId":55555,"journal":{"name":"Ad Hoc Networks","volume":null,"pages":null},"PeriodicalIF":4.4000,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Ad Hoc Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1570870524001306","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
One of the most critical applications undertaken by Unmanned Aerial Vehicles (UAVs) and Unmanned Ground Vehicles (UGVs) is reaching predefined targets by following the most time-efficient routes while avoiding collisions. Unfortunately, UAVs are hampered by limited battery life, and UGVs face challenges in reachability due to obstacles and elevation variations, which is why a coalition of UAVs and UGVs can be highly effective. Existing literature primarily focuses on one-to-one coalitions, which constrains the efficiency of reaching targets. In this work, we introduce a novel approach for a UAV-UGV coalition with a variable number of vehicles, employing a modified mean-shift clustering algorithm (MEANCRFT) to segment targets into multiple zones. This approach of assigning targets to various circular zones based on density and range significantly reduces the time required to reach these targets. Moreover, introducing variability in the number of UAVs and UGVs in a coalition enhances task efficiency by enabling simultaneous multi-target engagement. In our approach, each vehicle of the coalitions is trained using two advanced deep reinforcement learning algorithms in two separate experiments, namely Multi-agent Deep Deterministic Policy Gradient (MADDPG) and Multi-agent Proximal Policy Optimization (MAPPO). The results of our experimental evaluation demonstrate that the proposed MEANCRFT-MADDPG method substantially surpasses current state-of-the-art techniques, nearly doubling efficiency in terms of target navigation time and task completion rate.
期刊介绍:
The Ad Hoc Networks is an international and archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in ad hoc and sensor networking areas. The Ad Hoc Networks considers original, high quality and unpublished contributions addressing all aspects of ad hoc and sensor networks. Specific areas of interest include, but are not limited to:
Mobile and Wireless Ad Hoc Networks
Sensor Networks
Wireless Local and Personal Area Networks
Home Networks
Ad Hoc Networks of Autonomous Intelligent Systems
Novel Architectures for Ad Hoc and Sensor Networks
Self-organizing Network Architectures and Protocols
Transport Layer Protocols
Routing protocols (unicast, multicast, geocast, etc.)
Media Access Control Techniques
Error Control Schemes
Power-Aware, Low-Power and Energy-Efficient Designs
Synchronization and Scheduling Issues
Mobility Management
Mobility-Tolerant Communication Protocols
Location Tracking and Location-based Services
Resource and Information Management
Security and Fault-Tolerance Issues
Hardware and Software Platforms, Systems, and Testbeds
Experimental and Prototype Results
Quality-of-Service Issues
Cross-Layer Interactions
Scalability Issues
Performance Analysis and Simulation of Protocols.