{"title":"Physics-Informed Multiagent Reinforcement Learning for Distributed Multirobot Problems","authors":"Eduardo Sebastián;Thai Duong;Nikolay Atanasov;Eduardo Montijano;Carlos Sagüés","doi":"10.1109/TRO.2025.3582836","DOIUrl":null,"url":null,"abstract":"The networked nature of multirobot systems presents challenges in the context of multiagent reinforcement learning. Centralized control policies do not scale with increasing numbers of robots, whereas independent control policies do not exploit the information provided by other robots, exhibiting poor performance in cooperative-competitive tasks. In this work, we propose a physics-informed reinforcement learning approach able to learn distributed multirobot control policies that are both scalable and make use of all the available information to each robot. Our approach has three key characteristics. First, it imposes a port-Hamiltonian structure on the policy representation, respecting energy conservation properties of physical robot systems and the networked nature of robot team interactions. Second, it uses self-attention to ensure a sparse policy representation able to handle time-varying information at each robot from the interaction graph. Third, we present a soft actor–critic reinforcement learning algorithm parameterized by our self-attention port-Hamiltonian control policy, which accounts for the correlation among robots during training while overcoming the need of value function factorization. Extensive simulations in different multirobot scenarios demonstrate the success of the proposed approach, surpassing previous multirobot reinforcement learning solutions in scalability, while achieving similar or superior performance (with averaged cumulative reward up to <inline-formula><tex-math>$\\times {\\text{2}}$</tex-math></inline-formula> greater than the state-of-the-art with robot teams <inline-formula><tex-math>$\\times {\\text{6}}$</tex-math></inline-formula> larger than the number of robots at training time). We also validate our approach on multiple real robots in the Georgia Tech Robotarium under imperfect communication, demonstrating zero-shot sim-to-real transfer and scalability across number of robots.","PeriodicalId":50388,"journal":{"name":"IEEE Transactions on Robotics","volume":"41 ","pages":"4499-4517"},"PeriodicalIF":10.5000,"publicationDate":"2025-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11049031","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Robotics","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11049031/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ROBOTICS","Score":null,"Total":0}
Citations: 0
Abstract
The networked nature of multirobot systems presents challenges in the context of multiagent reinforcement learning. Centralized control policies do not scale with increasing numbers of robots, whereas independent control policies do not exploit the information provided by other robots, exhibiting poor performance in cooperative-competitive tasks. In this work, we propose a physics-informed reinforcement learning approach that learns distributed multirobot control policies which are scalable and make use of all the information available to each robot. Our approach has three key characteristics. First, it imposes a port-Hamiltonian structure on the policy representation, respecting the energy conservation properties of physical robot systems and the networked nature of robot team interactions. Second, it uses self-attention to ensure a sparse policy representation capable of handling the time-varying information each robot receives from the interaction graph. Third, we present a soft actor–critic reinforcement learning algorithm parameterized by our self-attention port-Hamiltonian control policy, which accounts for the correlation among robots during training while overcoming the need for value function factorization. Extensive simulations in different multirobot scenarios demonstrate the success of the proposed approach, surpassing previous multirobot reinforcement learning solutions in scalability while achieving similar or superior performance (with average cumulative reward up to $2\times$ greater than the state of the art, for robot teams $6\times$ larger than at training time). We also validate our approach on multiple real robots in the Georgia Tech Robotarium under imperfect communication, demonstrating zero-shot sim-to-real transfer and scalability across team sizes.
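For context, the port-Hamiltonian structure referenced above is a standard control-theoretic form; a minimal single-system statement (not the paper's exact multirobot parameterization) is

$$\dot{x} = \big(J(x) - R(x)\big)\,\nabla_{x} H(x) + g(x)\,u, \qquad y = g(x)^{\top}\,\nabla_{x} H(x),$$

where $H(x)$ is the Hamiltonian (total stored energy), $J(x) = -J(x)^{\top}$ is a skew-symmetric interconnection matrix, $R(x) \succeq 0$ models dissipation, and $u, y$ are the input-output port variables. The skew symmetry of $J$ is what encodes energy conservation, and in a networked robot team the sparsity pattern of the interconnection can mirror the communication graph.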
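Likewise, self-attention over a time-varying neighbor set can be sketched as below. This is a hypothetical PyTorch illustration of the general mechanism, not the authors' implementation; NeighborAttentionPolicy and all names and dimensions are assumptions.

# Minimal sketch (assumed, not the paper's code): a policy head for robot i
# that attends over whichever neighbors are currently in the interaction graph.
import torch
import torch.nn as nn

class NeighborAttentionPolicy(nn.Module):
    def __init__(self, obs_dim: int, embed_dim: int = 64, act_dim: int = 2):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(embed_dim, act_dim)

    def forward(self, own_obs, neighbor_obs, neighbor_mask):
        # own_obs: (B, obs_dim); neighbor_obs: (B, N, obs_dim)
        # neighbor_mask: (B, N) bool, True where a slot holds no neighbor (padding)
        q = self.encoder(own_obs).unsqueeze(1)   # query from the robot's own state, (B, 1, E)
        kv = self.encoder(neighbor_obs)          # keys/values from neighbors, (B, N, E)
        ctx, _ = self.attn(q, kv, kv, key_padding_mask=neighbor_mask)
        return self.head(ctx.squeeze(1))         # action, (B, act_dim)

# Robots joining or leaving the interaction graph only change the mask,
# so the same set of weights applies across team sizes.
B, N, obs_dim = 8, 5, 6
policy = NeighborAttentionPolicy(obs_dim)
action = policy(torch.randn(B, obs_dim), torch.randn(B, N, obs_dim),
                torch.zeros(B, N, dtype=torch.bool))

Because the attention weights depend only on pairwise encodings, the policy is permutation-invariant over neighbors, which is one plausible reading of how a sparse, scalable representation of the kind the abstract describes can be achieved.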
Journal Description:
The IEEE Transactions on Robotics (T-RO) is dedicated to publishing fundamental papers covering all facets of robotics, drawing on interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, and beyond. From industrial applications to service and personal assistants, and from surgical operations to space, underwater, and remote exploration, robots and intelligent machines play pivotal roles across domains including entertainment, safety, search and rescue, military applications, agriculture, and intelligent vehicles.
Special emphasis is placed on intelligent machines and systems designed for unstructured environments, where a significant portion of the environment remains unknown and beyond direct sensing or control.