PIPO: Policy Optimization with Permutation-Invariant Constraint for Distributed Multi-Robot Navigation

2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), Pub Date: 2022-09-20, DOI: 10.1109/MFI55806.2022.9913862

Ruiqi Zhang, Guang Chen, Jing Hou, Zhijun Li, Alois Knoll

Citations: 0

Abstract

For large-scale multi-agent systems (MAS), ensuring safe and effective navigation in complicated scenarios is a challenging task. As the number of agents increases, most existing centralized methods break down for lack of scalability, and the popular decentralized approaches are hampered by high latency and computing requirements. In this work, we propose PIPO, a novel policy optimization algorithm for decentralized MAS navigation with permutation-invariant constraints. To guide navigation and avoid unnecessary exploration in the early episodes, we first define a guide policy. Then, we introduce the permutation-invariant property in decentralized multi-agent systems and leverage a graph convolution network to produce the same output under shuffled observations. Because training and execution are both decentralized, our approach scales easily to an arbitrary number of agents and can be used in large-scale systems. We also provide extensive experiments demonstrating that PIPO significantly outperforms multi-agent reinforcement learning baselines and other leading methods in various scenarios.
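To make the permutation-invariance property concrete, the following is a minimal sketch and not the PIPO architecture itself. It assumes a single graph-convolution-style layer that mean-pools neighbor observations before mixing them with the ego agent's own observation. The names `encode`, `mean_pool`, `matvec`, `w_self`, and `w_neigh`, as well as the dimensions, are hypothetical and chosen only for illustration.

```python
import math
import random


def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def mean_pool(vectors):
    """Element-wise mean over equal-length vectors; the result ignores their order."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def encode(ego, neighbors, w_self, w_neigh):
    """Hypothetical layer: tanh(W_self @ ego + W_neigh @ mean(neighbors))."""
    pooled = mean_pool(neighbors)
    h_self = matvec(w_self, ego)
    h_neigh = matvec(w_neigh, pooled)
    return [math.tanh(a + b) for a, b in zip(h_self, h_neigh)]


# Illustrative sizes and random weights (assumptions, not values from the paper).
rng = random.Random(0)
dim_in, dim_out, n_neighbors = 4, 3, 5
w_self = [[rng.uniform(-1, 1) for _ in range(dim_in)] for _ in range(dim_out)]
w_neigh = [[rng.uniform(-1, 1) for _ in range(dim_in)] for _ in range(dim_out)]
ego = [rng.uniform(-1, 1) for _ in range(dim_in)]
neighbors = [[rng.uniform(-1, 1) for _ in range(dim_in)] for _ in range(n_neighbors)]

out = encode(ego, neighbors, w_self, w_neigh)

# Shuffle the neighbor observations and encode again.
shuffled = neighbors[:]
rng.shuffle(shuffled)
out_shuffled = encode(ego, shuffled, w_self, w_neigh)

# The two encodings should agree up to floating-point rounding.
max_gap = max(abs(a - b) for a, b in zip(out, out_shuffled))
print("largest difference after shuffling:", max_gap)
```

Because the pooling step is a mean, the same function also accepts any number of neighbors without a change of weights, which is one plausible reading of why such an encoder suits a varying number of agents.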

