Anton Andreychuk, Konstantin Yakovlev, Aleksandr Panov, Alexey Skrynnik
arXiv - CS - Multiagent Systems, 2024-08-29, doi: arxiv-2409.00134 (https://doi.org/arxiv-2409.00134)
MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale
Multi-agent pathfinding (MAPF) is a challenging computational problem that
typically requires finding collision-free paths for multiple agents in a shared
environment. Solving MAPF optimally is NP-hard, yet efficient solutions are
critical for numerous applications, including automated warehouses and
transportation systems. Recently, learning-based approaches to MAPF have gained
attention, particularly those leveraging deep reinforcement learning. Following
current trends in machine learning, we have created a foundation model for
MAPF problems, called MAPF-GPT. Using imitation learning, we have trained a
policy on a set of pre-collected sub-optimal expert trajectories. This policy
can generate actions under partial observability without additional
heuristics, reward functions, or communication with other agents. The resulting
MAPF-GPT model demonstrates zero-shot learning abilities when solving MAPF
problem instances that were not present in the training dataset. We show that
MAPF-GPT notably outperforms the current best-performing learnable-MAPF solvers
on a diverse range of problem instances and is computationally efficient at
inference time.
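
To make the notion of "collision-free paths" concrete, the following is a minimal sketch of a validity check for a MAPF solution. It is not taken from the paper. It assumes the common grid formulation in which time is discrete, an agent waits at its final cell once its path ends, and two kinds of conflicts are forbidden: two agents occupying the same cell at the same step (a vertex conflict) and two agents exchanging cells in a single step (a swap conflict). The names `position_at`, `find_conflicts`, and `is_collision_free` are hypothetical helpers introduced for illustration.

```python
from itertools import combinations
from typing import List, Tuple

Cell = Tuple[int, int]   # (row, col) on a grid; an assumption of this sketch
Path = List[Cell]        # one cell per time step, must be non-empty


def position_at(path: Path, t: int) -> Cell:
    """Return the agent's cell at step t; the agent waits at its last cell."""
    return path[min(t, len(path) - 1)]


def find_conflicts(paths: List[Path]) -> List[Tuple[str, int, int, int]]:
    """Return (kind, step, agent_a, agent_b) for each vertex or swap conflict."""
    if not paths:
        return []
    horizon = max(len(p) for p in paths)
    conflicts = []
    for t in range(horizon):
        for a, b in combinations(range(len(paths)), 2):
            a_now, b_now = position_at(paths[a], t), position_at(paths[b], t)
            # Vertex conflict: both agents stand on the same cell at step t.
            if a_now == b_now:
                conflicts.append(("vertex", t, a, b))
                continue
            # Swap conflict: the agents exchange cells between t and t + 1.
            if t + 1 < horizon:
                a_next = position_at(paths[a], t + 1)
                b_next = position_at(paths[b], t + 1)
                if a_now == b_next and b_now == a_next:
                    conflicts.append(("swap", t, a, b))
    return conflicts


def is_collision_free(paths: List[Path]) -> bool:
    """A set of paths is a valid MAPF solution here if it has no conflicts."""
    return not find_conflicts(paths)


if __name__ == "__main__":
    parallel = [[(0, 0), (0, 1), (0, 2)], [(1, 0), (1, 1), (1, 2)]]
    swapping = [[(0, 0), (0, 1)], [(0, 1), (0, 0)]]
    print(is_collision_free(parallel))
    print(find_conflicts(swapping))
```

The check compares every pair of agents at every step, so its cost grows quadratically with the number of agents. This is acceptable for validating a finished solution, but it says nothing about how hard it is to find one.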