AdverSAR: Adversarial Search and Rescue via Multi-Agent Reinforcement Learning

2022 IEEE International Symposium on Technologies for Homeland Security (HST). Pub Date: 2022-11-14. DOI: 10.1109/HST56032.2022.10025434

A. Rahman, Arnab Bhattacharya, Thiagarajan Ramachandran, Sayak Mukherjee, Himanshu Sharma, Ted Fujimoto, Samrat Chatterjee


Citations: 3

Abstract

Search and Rescue (SAR) missions in remote environments often employ autonomous multi-robot systems that learn, plan, and execute a combination of local single-robot control actions, group primitives, and global mission-oriented coordination and collaboration. Often, SAR coordination strategies are manually designed by human experts who can remotely control the multi-robot system and enable semi-autonomous operations. However, in remote environments where connectivity is limited and human intervention is often not possible, decentralized collaboration strategies are needed for fully-autonomous operations. Nevertheless, decentralized coordination may be ineffective in adversarial environments due to sensor noise, actuation faults, or manipulation of inter-agent communication data. In this paper, we propose an algorithmic approach based on adversarial multi-agent reinforcement learning (MARL) that allows robots to efficiently coordinate their strategies in the presence of adversarial inter-agent communications. In our setup, the objective of the multi-robot team is to discover targets strategically in an obstacle-strewn geographical area by minimizing the average time needed to find the targets. It is assumed that the robots have no prior knowledge of the target locations, and they can interact with only a subset of neighboring robots at any time. Based on the centralized training with decentralized execution (CTDE) paradigm in MARL, we utilize a hierarchical meta-learning framework to learn dynamic team-coordination modalities and discover emergent team behavior under complex cooperative-competitive scenarios. The effectiveness of our approach is demonstrated on a collection of prototype grid-world environments with different specifications of benign and adversarial agents, target locations, and agent rewards.
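The setup described in the abstract (a grid world with obstacles, hidden targets, neighbor-limited communication, and some agents that manipulate messages) can be pictured with a small toy environment. The sketch below is purely illustrative and is not the authors' environment or training method: the class name `GridSAR`, the Manhattan-distance communication radius, the "mirrored position" lie told by adversaries, and the shared reward of -1 per step until all targets are found are all assumptions made here for the example.

```python
from dataclasses import dataclass, field

# Hypothetical action set: four grid moves plus "stay".
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1), "stay": (0, 0)}


@dataclass
class GridSAR:
    """Toy search-and-rescue grid world (illustrative assumption, not the paper's)."""
    size: int            # grid is size x size
    obstacles: set       # blocked cells as (row, col)
    targets: set         # target cells, unknown to the agents' policies
    agents: dict         # agent id -> (row, col)
    adversaries: set     # ids of agents that falsify outgoing messages
    comm_radius: int = 2  # assumed Manhattan communication range
    found: set = field(default_factory=set)

    def neighbors(self, agent_id):
        """Agents within the communication radius (a subset of the team)."""
        r, c = self.agents[agent_id]
        return sorted(
            other for other, (orow, ocol) in self.agents.items()
            if other != agent_id and abs(orow - r) + abs(ocol - c) <= self.comm_radius
        )

    def message(self, sender):
        """Position report; adversaries report a mirrored (false) position."""
        r, c = self.agents[sender]
        if sender in self.adversaries:
            return (self.size - 1 - r, self.size - 1 - c)
        return (r, c)

    def inbox(self, agent_id):
        """Messages an agent receives from its current neighbors only."""
        return {n: self.message(n) for n in self.neighbors(agent_id)}

    def step(self, actions):
        """Apply one move per listed agent; return (shared reward, done)."""
        for agent_id, action in actions.items():
            dr, dc = MOVES[action]
            r, c = self.agents[agent_id]
            nxt = (r + dr, c + dc)
            in_bounds = 0 <= nxt[0] < self.size and 0 <= nxt[1] < self.size
            if in_bounds and nxt not in self.obstacles:
                self.agents[agent_id] = nxt
            # Assumption: only benign agents count as discovering a target.
            if agent_id not in self.adversaries and self.agents[agent_id] in self.targets:
                self.found.add(self.agents[agent_id])
        done = self.found == self.targets
        # -1 per step until every target is found, so shorter searches score higher.
        return (0 if done else -1), done
```

A learned policy would choose each agent's action from its own observation and its `inbox`. Because an adversary's report cannot be told apart from an honest one by inspection, a benign agent has to learn how much weight to give each neighbor, which is the difficulty the abstract points to.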


