Aerial-DeepSearch: Distributed Multi-Agent Deep Reinforcement Learning for Search Missions

Vidyasagar Sadhu, Chuanneng Sun, Arman Karimian, Roberto Tron, D. Pompili
DOI: 10.1109/MASS50613.2020.00030
Published in: 2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS), December 2020
Citations: 7

Abstract

Search and Rescue (SAR) is an important part of several applications of national and social interest. Existing solutions for search missions in both terrestrial and aerial domains are mostly limited to a single agent and to specific environments; however, search missions can benefit significantly from multiple agents that can quickly adapt to new environments. In this paper, we propose a framework based on Multi-Agent Deep Reinforcement Learning (MADRL) that realizes the actor-critic framework in a distributed manner for coordinating multiple Unmanned Aerial Vehicles (UAVs) in the exploration of unknown regions. One of the original aspects of our work is that the actors represent simulated or actual UAVs exploring the environment in parallel, instead of traditional computer threads. We also propose adding Long Short-Term Memory (LSTM) neural network layers to the actor and critic architectures to handle imperfect communication and partial-observability scenarios. The proposed approach has been evaluated in a grid world and compared against competing algorithms such as Multi-Agent Q-Learning and Multi-Agent Deep Q-Learning to show its advantages. More generally, our approach could be extended to image-based/continuous action-space environments as well.
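To make the architectural idea in the abstract concrete, below is a minimal sketch of how LSTM layers could sit inside a per-UAV actor-critic network so that the recurrent state accumulates a history of local observations under partial observability. The abstract does not specify the actual architecture, so all layer sizes, the observation encoding, the weight initialization, and the class/function names (`LSTMActorCritic`, `act`, `obs_dim`, `hidden`, `n_actions`) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class LSTMActorCritic:
    """Per-agent network sketch: a single LSTM cell feeding a policy
    (actor) head and a value (critic) head. The recurrent state (h, c)
    carries the agent's observation history, which is what lets the
    policy act sensibly when any one observation is only partial."""

    def __init__(self, obs_dim, hidden, n_actions):
        # Small random weights; initialization scheme is an assumption.
        s = lambda *shape: rng.normal(0.0, 0.1, shape)
        # One weight set per LSTM gate: input, forget, output, candidate.
        self.W = {g: s(hidden, obs_dim) for g in "ifog"}
        self.U = {g: s(hidden, hidden) for g in "ifog"}
        self.b = {g: np.zeros(hidden) for g in "ifog"}
        self.Wpi = s(n_actions, hidden)   # actor (policy) head
        self.Wv = s(hidden)               # critic (value) head
        self.h = np.zeros(hidden)         # hidden state across timesteps
        self.c = np.zeros(hidden)         # cell state across timesteps

    def act(self, obs):
        """One timestep: update recurrent state from a local observation,
        return an action distribution and a state-value estimate."""
        def gate(k, fn):
            return fn(self.W[k] @ obs + self.U[k] @ self.h + self.b[k])
        i = gate("i", sigmoid)
        f = gate("f", sigmoid)
        o = gate("o", sigmoid)
        self.c = f * self.c + i * gate("g", np.tanh)
        self.h = o * np.tanh(self.c)
        logits = self.Wpi @ self.h
        p = np.exp(logits - logits.max())
        return p / p.sum(), float(self.Wv @ self.h)  # softmax probs, value

# Example: one UAV processing a 3-step observation history; in the
# paper's framework each actor would run on a (simulated or real) UAV
# in parallel rather than as a computer thread.
agent = LSTMActorCritic(obs_dim=6, hidden=16, n_actions=4)
outs = [agent.act(rng.normal(size=6)) for _ in range(3)]
```

Because `h` and `c` persist between calls, the distribution returned at each step depends on all observations seen so far, not just the current one; training (e.g., distributed actor-critic updates) is omitted here.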