Optimizing UAV-UGV coalition operations: A hybrid clustering and multi-agent reinforcement learning approach for path planning in obstructed environment

IF 4.4 | Zone 3, Computer Science | JCR Q1: Computer Science, Information Systems

Ad Hoc Networks Pub Date : 2024-04-17 DOI:10.1016/j.adhoc.2024.103519

Shamyo Brotee, Farhan Kabir, Md. Abdur Razzaque, Palash Roy, Md. Mamun-Or-Rashid, Md. Rafiul Hassan, Mohammad Mehedi Hassan

Article page: https://www.sciencedirect.com/science/article/pii/S1570870524001306

Citations: 0

Abstract

One of the most critical applications undertaken by Unmanned Aerial Vehicles (UAVs) and Unmanned Ground Vehicles (UGVs) is reaching predefined targets by following the most time-efficient routes while avoiding collisions. Unfortunately, UAVs are hampered by limited battery life, and UGVs face challenges in reachability due to obstacles and elevation variations, which is why a coalition of UAVs and UGVs can be highly effective. Existing literature primarily focuses on one-to-one coalitions, which constrains the efficiency of reaching targets. In this work, we introduce a novel approach for a UAV-UGV coalition with a variable number of vehicles, employing a modified mean-shift clustering algorithm (MEANCRFT) to segment targets into multiple zones. This approach of assigning targets to various circular zones based on density and range significantly reduces the time required to reach these targets. Moreover, introducing variability in the number of UAVs and UGVs in a coalition enhances task efficiency by enabling simultaneous multi-target engagement. In our approach, each vehicle of the coalitions is trained using two advanced deep reinforcement learning algorithms in two separate experiments, namely Multi-agent Deep Deterministic Policy Gradient (MADDPG) and Multi-agent Proximal Policy Optimization (MAPPO). The results of our experimental evaluation demonstrate that the proposed MEANCRFT-MADDPG method substantially surpasses current state-of-the-art techniques, nearly doubling efficiency in terms of target navigation time and task completion rate.
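The abstract does not say how MEANCRFT modifies mean shift, so the following is only a rough illustration of the underlying idea: standard flat-kernel mean shift that groups 2-D target coordinates into circular zones of a fixed radius. The function name `mean_shift_zones`, the `radius` parameter, the merge threshold of half the radius, and the sample coordinates are all assumptions made for this sketch and are not taken from the paper.

```python
import math


def mean_shift_zones(targets, radius, max_iter=100, tol=1e-6):
    """Plain flat-kernel mean shift over 2-D points (NOT the paper's MEANCRFT).

    Returns (centers, labels): one center per zone, and for each target the
    index of the zone it was assigned to.
    """
    modes = []
    for x, y in targets:
        cx, cy = x, y
        for _ in range(max_iter):
            # Targets inside the circular window around the current estimate.
            near = [(tx, ty) for tx, ty in targets
                    if math.hypot(tx - cx, ty - cy) <= radius]
            if not near:  # defensive guard against floating-point edge cases
                break
            nx = sum(p[0] for p in near) / len(near)
            ny = sum(p[1] for p in near) / len(near)
            shift = math.hypot(nx - cx, ny - cy)
            cx, cy = nx, ny
            if shift < tol:
                break
        modes.append((cx, cy))

    # Merge modes that converged to (almost) the same location.
    # The radius / 2 threshold is an arbitrary choice for this sketch.
    centers, labels = [], []
    for mx, my in modes:
        for i, (cx, cy) in enumerate(centers):
            if math.hypot(mx - cx, my - cy) <= radius / 2:
                labels.append(i)
                break
        else:
            centers.append((mx, my))
            labels.append(len(centers) - 1)
    return centers, labels


# Hypothetical target coordinates forming two well-separated groups.
sample_targets = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
zone_centers, zone_labels = mean_shift_zones(sample_targets, radius=3.0)
```

In a setup like the one the abstract describes, each resulting zone could then be handed to one coalition of vehicles. How the paper sizes zones by density and range, and how it assigns UAVs and UGVs to them, is not stated in the abstract and is not modelled here.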

Source journal

Ad Hoc Networks (Engineering & Technology: Telecommunications)

CiteScore: 10.20
Self-citation rate: 4.20%
Articles published: 131
Review time: 4.8 months

Journal description: Ad Hoc Networks is an international archival journal that provides a publication vehicle for complete coverage of all topics of interest to those working in ad hoc and sensor networking. The journal considers original, high-quality, unpublished contributions addressing all aspects of ad hoc and sensor networks. Specific areas of interest include, but are not limited to: mobile and wireless ad hoc networks; sensor networks; wireless local and personal area networks; home networks; ad hoc networks of autonomous intelligent systems; novel architectures for ad hoc and sensor networks; self-organizing network architectures and protocols; transport layer protocols; routing protocols (unicast, multicast, geocast, etc.); media access control techniques; error control schemes; power-aware, low-power and energy-efficient designs; synchronization and scheduling issues; mobility management; mobility-tolerant communication protocols; location tracking and location-based services; resource and information management; security and fault-tolerance issues; hardware and software platforms, systems, and testbeds; experimental and prototype results; quality-of-service issues; cross-layer interactions; scalability issues; performance analysis and simulation of protocols.
