Balancing Efficiency and Unpredictability in Multi-robot Patrolling: A MARL-Based Approach

2023 IEEE International Conference on Robotics and Automation (ICRA). Pub Date: 2023-05-29. DOI: 10.1109/ICRA48891.2023.10160923

Lingxiao Guo, Haoxuan Pan, Xiaoming Duan, Jianping He

Citations: 1

Abstract

Patrolling with multiple robots is a challenging task. While the robots collaboratively and repeatedly cover the regions of interest in the environment, their routes should satisfy two often conflicting properties: i) (efficiency) the time intervals between two consecutive visits to the regions are small; ii) (unpredictability) the patrolling trajectories are random and unpredictable. We manage to strike a balance between the two goals by i) recasting the original patrolling problem as a Graph Deep Learning problem; ii) directly solving this problem on the graph in the framework of cooperative multi-agent reinforcement learning. Treating the decisions of a team of agents as a sequence input, our model outputs the agents' actions in order by an autoregressive mechanism. Extensive simulation studies show that our approach has comparable performance with existing algorithms in terms of efficiency and outperforms them in terms of unpredictability. To our knowledge, this is the first work that successfully solves the patrolling problem with reinforcement learning on a graph.
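The two competing properties named in the abstract can be made concrete with a small sketch. The code below is not the authors' MARL model: it is an illustrative baseline under stated assumptions, in which robots follow a uniform random walk on a hypothetical 4-node graph, efficiency is scored by average node idleness (time since the last visit), and unpredictability is scored by the empirical conditional entropy of the next node given the current one. The graph, the function names, and both metric definitions are assumptions introduced here for illustration.

```python
import math
import random
from collections import Counter

# Hypothetical 4-node patrol graph (adjacency list); not taken from the paper.
GRAPH = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}


def simulate_patrol(graph, starts, steps, rng):
    """Move each robot to a uniformly random neighbor at every step.

    Returns one trajectory (list of visited nodes) per robot.
    """
    trajectories = [[s] for s in starts]
    for _ in range(steps):
        for traj in trajectories:
            traj.append(rng.choice(graph[traj[-1]]))
    return trajectories


def average_idleness(graph, trajectories):
    """Mean, over time steps and nodes, of the time since a node's last visit."""
    steps = len(trajectories[0])
    idle = {v: 0 for v in graph}
    total = 0
    for t in range(steps):
        visited = {traj[t] for traj in trajectories}
        for v in graph:
            idle[v] = 0 if v in visited else idle[v] + 1
            total += idle[v]
    return total / (steps * len(graph))


def transition_entropy(trajectories):
    """Empirical conditional entropy H(next | current) in bits, pooled over robots."""
    pair_counts = Counter()
    node_counts = Counter()
    for traj in trajectories:
        for cur, nxt in zip(traj, traj[1:]):
            pair_counts[(cur, nxt)] += 1
            node_counts[cur] += 1
    n = sum(pair_counts.values())
    return -sum(
        (c / n) * math.log2(c / node_counts[cur])
        for (cur, _), c in pair_counts.items()
    )


# Compare a random-walk patrol (two robots) with a fixed cyclic route (one robot).
rng = random.Random(0)
random_trajs = simulate_patrol(GRAPH, starts=[0, 3], steps=200, rng=rng)
cycle_trajs = [[0, 1, 3, 2] * 50]  # deterministic tour 0 -> 1 -> 3 -> 2 -> 0 ...

for name, trajs in [("random walk", random_trajs), ("fixed cycle", cycle_trajs)]:
    print(name, average_idleness(GRAPH, trajs), transition_entropy(trajs))
```

The fixed cycle has zero transition entropy because every move is determined by the current node, while the random walk has positive entropy but offers no guarantee on how long a node waits between visits. A learned policy of the kind the abstract describes would aim to sit between these two extremes.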
