A collaborative siege method of multiple unmanned vehicles based on reinforcement learning

Muqing Su, Ruimin Pu, Yin Wang, Meng Yu
{"title":"A collaborative siege method of multiple unmanned vehicles based on reinforcement learning","authors":"Muqing Su, Ruimin Pu, Yin Wang, Meng Yu","doi":"10.20517/ir.2024.03","DOIUrl":null,"url":null,"abstract":"A method based on multi-agent reinforcement learning is proposed to tackle the challenges to capture escaping Target by Unmanned Ground Vehicles (UGVs). Initially, this study introduces environment and motion models tailored for cooperative UGV capture, along with clearly defined success criteria for direct capture. An attention mechanism integrated into the Soft Actor-Critic (SAC) is leveraged, directing focus towards pivotal state features pertinent to the task while effectively managing less relevant aspects. This allows capturing agents to concentrate on the whereabouts and activities of the target agent, thereby enhancing coordination and collaboration during pursuit. This focus on the target agent aids in refining the capture process and ensures precise estimation of value functions. The reduction in superfluous activities and unproductive scenarios amplifies efficiency and robustness. Furthermore, the attention weights dynamically adapt to environmental shifts. To address constrained incentives arising in scenarios with multiple vehicles capturing targets, the study introduces a revamped reward system. It divides the reward function into individual and cooperative components, thereby optimizing both global and localized incentives. By facilitating cooperative collaboration among capturing UGVs, this approach curtails the action space of the target UGV, leading to successful capture outcomes. The proposed technique demonstrates enhanced capture success compared to previous SAC algorithms. Simulation trials and comparisons with alternative learning methodologies validate the effectiveness of the algorithm and the design approach of the reward function.","PeriodicalId":426514,"journal":{"name":"Intelligence & Robotics","volume":"15 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-02-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligence & Robotics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.20517/ir.2024.03","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

A method based on multi-agent reinforcement learning is proposed to tackle the challenge of capturing an escaping target with Unmanned Ground Vehicles (UGVs). First, this study introduces environment and motion models tailored to cooperative UGV capture, along with clearly defined success criteria for direct capture. An attention mechanism integrated into the Soft Actor-Critic (SAC) algorithm directs focus toward the state features most relevant to the task while down-weighting less relevant ones. This lets the pursuing agents concentrate on the position and actions of the target agent, enhancing coordination and collaboration during the pursuit. Focusing on the target agent refines the capture process and yields more precise value-function estimates, and the reduction in superfluous actions and unproductive episodes improves efficiency and robustness. Furthermore, the attention weights adapt dynamically to changes in the environment. To address the limited incentives that arise when multiple vehicles pursue a single target, the study introduces a redesigned reward scheme that splits the reward function into individual and cooperative components, optimizing both global and local incentives. By fostering cooperation among the pursuing UGVs, this approach curtails the action space of the target UGV and leads to successful captures. The proposed technique achieves a higher capture success rate than the baseline SAC algorithm. Simulation trials and comparisons with alternative learning methods validate the effectiveness of the algorithm and the design of the reward function.
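The abstract does not include source code. As a rough illustration of the attention-augmented critic it describes, the following PyTorch sketch shows one common way to let each pursuer's Q-function attend over the encoded observation-action pairs of all agents, so that features tied to the target agent can receive higher weight from state to state. All class names, dimensions, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed design, not the paper's code): an attention-based
# critic for SAC in a multi-agent pursuit task. Each pursuer's Q-value attends
# over the encoded observation-action pairs of all agents, so the attention
# weights can shift toward the target's state as the episode evolves.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionCritic(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, embed_dim: int = 64):
        super().__init__()
        # Shared encoder for each agent's (observation, action) pair.
        self.encoder = nn.Linear(obs_dim + act_dim, embed_dim)
        self.query = nn.Linear(embed_dim, embed_dim, bias=False)
        self.key = nn.Linear(embed_dim, embed_dim, bias=False)
        self.value = nn.Linear(embed_dim, embed_dim, bias=False)
        # Maps [own embedding; attended context] to a scalar Q-value.
        self.q_head = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim), nn.ReLU(), nn.Linear(embed_dim, 1)
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor, agent_idx: int):
        # obs: (batch, n_agents, obs_dim); act: (batch, n_agents, act_dim)
        e = F.relu(self.encoder(torch.cat([obs, act], dim=-1)))   # (B, N, E)
        q = self.query(e[:, agent_idx:agent_idx + 1])             # (B, 1, E)
        k, v = self.key(e), self.value(e)                         # (B, N, E)
        scores = q @ k.transpose(1, 2) / k.shape[-1] ** 0.5       # (B, 1, N)
        attn = torch.softmax(scores, dim=-1)    # weights adapt to the state
        context = (attn @ v).squeeze(1)                           # (B, E)
        return self.q_head(torch.cat([e[:, agent_idx], context], dim=-1))
```

In a full SAC pipeline one would instantiate two such critics for the usual clipped double-Q targets and pair them with a squashed-Gaussian actor; those parts are standard and omitted here.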
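The split reward described in the abstract can likewise be sketched in hedged form. Below, the individual component rewards each pursuer for closing distance to the target, the cooperative component rewards the team for encircling it (shrinking its feasible action space), and a terminal bonus fires on capture. All terms, weights, and the capture test are assumptions chosen for illustration; the paper defines its own decomposition.

```python
# Illustrative reward split (assumed form): individual progress term plus a
# cooperative encirclement term shared by the whole team.
import numpy as np

def pursuit_reward(pos: np.ndarray, target: np.ndarray, prev_pos: np.ndarray,
                   i: int, capture_radius: float = 0.5,
                   w_ind: float = 1.0, w_coop: float = 0.5) -> float:
    # Individual component: positive when pursuer i moves closer to the target.
    r_ind = np.linalg.norm(prev_pos[i] - target) - np.linalg.norm(pos[i] - target)

    # Cooperative component: penalize the largest angular gap between pursuers
    # around the target; a smaller gap means tighter encirclement and fewer
    # escape directions for the target.
    rel = pos - target                                  # (N, 2) offsets
    angles = np.sort(np.arctan2(rel[:, 1], rel[:, 0]))
    gaps = np.diff(np.concatenate([angles, [angles[0] + 2 * np.pi]]))
    r_coop = -gaps.max() / (2 * np.pi)

    # Terminal bonus under one possible capture criterion: all pursuers inside
    # the capture radius at once.
    captured = np.all(np.linalg.norm(rel, axis=1) < capture_radius)
    return w_ind * r_ind + w_coop * r_coop + (10.0 if captured else 0.0)

# Example: three pursuers roughly surrounding a target at the origin.
pos = np.array([[1.0, 0.0], [-0.5, 0.9], [-0.5, -0.9]])
print(pursuit_reward(pos, np.zeros(2), prev_pos=pos * 1.2, i=0))
```

Splitting the reward this way keeps a dense per-agent learning signal while still crediting team-level progress, which is one standard remedy for the limited incentives the abstract mentions in multi-vehicle capture scenarios.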