Ifrah Saeed, Andrew C. Cullen, Zainab Zaidi, Sarah Erfani, Tansu Alpcan
Journal: Neurocomputing, Volume 658, Article 131654
DOI: 10.1016/j.neucom.2025.131654
Published: 2025-09-26 (Journal Article)
Journal Impact Factor: 6.5; JCR Q1, Computer Science, Artificial Intelligence
URL: https://www.sciencedirect.com/science/article/pii/S0925231225023264
Algorithmically-designed reward shaping for multiagent reinforcement learning in navigation
The practical applicability of multiagent reinforcement learning is hindered by its low sample efficiency and slow learning speed. While reward shaping and expert guidance can partially mitigate these challenges, their efficiency is offset by the need for substantial manual effort. To address these constraints, we introduce Multiagent Environment-aware semi-Automated Guide (MEAG), a novel framework that leverages widely known, highly efficient, and low-resolution single-agent pathfinding algorithms for shaping rewards to guide multiagent reinforcement learning agents. MEAG uses these single-agent solvers over a coarse-grid surrogate that requires minimal manual intervention, and guides agents away from random exploration in a manner that significantly reduces computational costs. When tested across a range of densely and sparsely connected multiagent navigation environments, MEAG consistently outperforms state-of-the-art algorithms, achieving up to 50% faster convergence and 20% higher rewards. These improvements enable the consideration of MARL for more complex real-world pathfinding applications ranging from warehouse automation to search and rescue operations, and swarm robotics.
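The idea of shaping rewards from a cheap single-agent solver on a coarse grid can be illustrated with a minimal sketch. This is not the paper's implementation: the solver (a plain BFS standing in for any low-resolution pathfinding algorithm) and the potential-based shaping form `F(s, s') = gamma * Phi(s') - Phi(s)` with `Phi(s) = -distance_to_goal(s)` are assumptions chosen for illustration; MEAG's actual solver and shaping function may differ.

```python
from collections import deque

def coarse_distances(grid, goal):
    """BFS over a coarse occupancy grid (0 = free, 1 = blocked).

    Returns a dict mapping each reachable cell (row, col) to its
    shortest-path distance, in coarse steps, to the goal cell. This is
    a generic stand-in for the single-agent pathfinding component the
    abstract describes, not the specific algorithm used in the paper.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in dist):
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist

def shaping_reward(dist, s, s_next, gamma=0.99):
    """Potential-based shaping term F(s, s') = gamma * Phi(s') - Phi(s),
    with Phi(s) = -distance_to_goal(s). Unreachable cells fall back to
    a large penalty so the shaped reward stays finite."""
    big = 10_000
    phi = lambda cell: -dist.get(cell, big)
    return gamma * phi(s_next) - phi(s)

# Hypothetical usage: a 2x2 open grid with the goal at (0, 0). Moving
# one step closer to the goal yields a positive shaping bonus, which is
# what steers agents away from random exploration.
grid = [[0, 0], [0, 0]]
dist = coarse_distances(grid, goal=(0, 0))
bonus = shaping_reward(dist, s=(1, 1), s_next=(0, 1))
```

Potential-based shaping of this form preserves the optimal policy of the underlying task, which is one reason it is a natural fit for distance-to-goal guidance; whether MEAG relies on that exact guarantee is not stated in the abstract.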
Journal Introduction:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice, and applications are the essential topics covered.