Authors: Steve Paul, Souma Chowdhury
Journal: Robotics and Autonomous Systems, volume 193, Article 105085
Publication date: 2025-06-23
DOI: 10.1016/j.robot.2025.105085
URL: https://www.sciencedirect.com/science/article/pii/S092188902500171X
Learning multi-robot task allocation using capsule networks and attention mechanism
This paper presents a new graph reinforcement learning (RL) architecture for solving multi-robot task allocation (MRTA) problems without requiring tedious heuristics. Multi-feature tasks are abstracted as nodes of an undirected graph. The primary goal is not only to generalize across unseen problems of similar size but also to scale to problems with much larger task spaces without retraining, which could otherwise be particularly expensive when simulating multi-robot operations. Drawing inspiration from the emerging paradigm of learning to solve combinatorial optimization (CO) problems, a new encoder–decoder architecture called the Capsule Attention-based Mechanism (CAPAM) is presented to achieve this goal. More specifically, a novel choice of encoder is made in the form of graph capsule convolutional networks, which enable permutation-invariant embeddings that capture the local and global structure of the task graph by using higher-order statistical moments of the node feature vectors. This encoded information is combined with a context component that encodes the mission and robot states, and is processed by the decoder, which computes the probability of a robot selecting each available task. The CAPAM model is trained with a policy-gradient method based on Proximal Policy Optimization. When evaluated on unseen scenarios, CAPAM achieves task completion performance comparable to that of standard non-learning-based online MRTA methods while making decisions faster. It also shows substantial gains in generalizability and (task) scalability over a popular approach for learning to solve CO problems (the pure attention mechanism) and preserves this performance advantage even under partial observation scenarios.
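The encoder and decoder roles described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it omits all learned weights, and the function names, the self-loop neighbourhood, the plain averaging of feature powers, and the dot-product scoring against a context vector are assumptions made only to show the idea of moment-based node embeddings followed by a masked softmax over available tasks.

```python
import math


def moment_embeddings(features, adjacency, max_order=3):
    """Hypothetical encoder step: for each task node, concatenate the
    neighbourhood averages of x, x^2, ..., x^max_order (feature-wise)."""
    n = len(features)
    embeddings = []
    for i in range(n):
        # Assumption: a node's neighbourhood includes the node itself.
        neighbours = [j for j in range(n) if j == i or adjacency[i][j]]
        row = []
        for order in range(1, max_order + 1):
            for k in range(len(features[i])):
                total = sum(features[j][k] ** order for j in neighbours)
                row.append(total / len(neighbours))
        embeddings.append(row)
    return embeddings


def task_probabilities(embeddings, context, available):
    """Hypothetical decoder step: scaled dot-product score between a context
    vector and each task embedding, with unavailable tasks masked out
    before the softmax."""
    if not any(available):
        raise ValueError("at least one task must be available")
    scale = math.sqrt(len(context))
    scores = [
        sum(c * h for c, h in zip(context, emb)) / scale if ok else -math.inf
        for emb, ok in zip(embeddings, available)
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Usage with made-up data: three tasks with 2-D features on a path graph.
features = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
embeddings = moment_embeddings(features, adjacency, max_order=2)
context = [1.0, 0.0, 0.0, 0.0]  # stand-in for the mission/robot state encoding
probabilities = task_probabilities(embeddings, context, [True, False, True])
print(probabilities)
```

Because each embedding depends only on the set of neighbouring nodes and not on their order, relabelling the tasks simply reorders the rows of the output, which is the property that lets a trained policy be applied to task graphs of different sizes. In the paper's setting the probabilities would come from learned layers trained with Proximal Policy Optimization rather than from a fixed context vector.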
About the journal:
Robotics and Autonomous Systems will carry articles describing fundamental developments in the field of robotics, with special emphasis on autonomous systems. An important goal of this journal is to extend the state of the art in both symbolic and sensory-based robot control and learning in the context of autonomous systems.
Robotics and Autonomous Systems will carry articles on the theoretical, computational and experimental aspects of autonomous systems, or modules of such systems.