Reinforcement Learning for Multi-Agent Path Finding in Large-Scale Warehouses via Distributed Policy Evolution
Authors: Qinru Shi; Meiqin Liu; Senlin Zhang; Xuguang Lan
Journal: IEEE Robotics and Automation Letters, vol. 10, no. 8, pp. 7843-7850
Published: 2025-06-12
DOI: 10.1109/LRA.2025.3579647
URL: https://ieeexplore.ieee.org/document/11034721/
Efficient multi-agent path finding (MAPF) is essential for large-scale warehousing and logistics systems. Despite the potential of reinforcement learning (RL) methods, current approaches struggle with inefficient exploration, poor generalization, and inadequate deadlock resolution. To address these issues, we propose a novel evolutionary reinforcement learning (ERL) framework for the MAPF problem in large-scale warehouse environments. Specifically, the framework leverages distributed policy evolution methods to provide diverse experiences, thereby improving both training efficiency and policy performance. We further integrate curriculum learning into this framework to improve the generality of the policy and make it scalable to larger environments. Additionally, we introduce a deadlock-breaking mechanism based on expert experience, which helps mitigate deadlocks in large-scale, high-density scenarios. Experiments show that our method outperforms existing methods across various environments, particularly in complex scenarios with over 1,000 agents.
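
The abstract does not give algorithmic details, so the following is only a generic illustration of how a population-based policy search can be combined with an easy-to-hard curriculum. It is not the authors' method. Everything in it is an assumption made for illustration: the toy fitness function `evaluate` (a stand-in for scoring a policy on MAPF rollouts), the function name `evolve`, the flat-list policy representation, and all hyperparameters.

```python
import random


def evaluate(policy, difficulty):
    """Toy fitness (higher is better); a hypothetical stand-in for an MAPF rollout score."""
    # Assumed objective: the best parameter value equals the current difficulty level.
    return -sum((w - difficulty) ** 2 for w in policy)


def evolve(pop_size=20, dim=4, generations=30, stages=(1.0, 2.0, 3.0), seed=0):
    """Evolve a population of parameter vectors through successively harder stages."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    history = []
    best = population[0]
    for difficulty in stages:  # curriculum: move from easy to hard stages
        for _ in range(generations):
            ranked = sorted(
                population, key=lambda p: evaluate(p, difficulty), reverse=True
            )
            elites = ranked[: pop_size // 4]  # keep the top quarter unchanged
            # Refill the population with Gaussian-mutated copies of random elites.
            children = [
                [w + rng.gauss(0, 0.1) for w in rng.choice(elites)]
                for _ in range(pop_size - len(elites))
            ]
            population = elites + children
        best = max(population, key=lambda p: evaluate(p, difficulty))
        history.append((difficulty, evaluate(best, difficulty)))
    return best, history


if __name__ == "__main__":
    best_policy, stage_scores = evolve()
    for difficulty, score in stage_scores:
        print(f"difficulty={difficulty}: best fitness={score:.4f}")
```

In a real setting the population members would be neural network policies evaluated in parallel on simulated warehouse maps, and the experiences they collect could also feed a gradient-based RL learner. The sketch keeps only the select-and-mutate loop and the staged difficulty schedule.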
About the journal:
The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.