MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale
Anton Andreychuk, Konstantin Yakovlev, Aleksandr Panov, Alexey Skrynnik
arXiv - CS - Multiagent Systems, 2024-08-29, arXiv:2409.00134
Abstract
Multi-agent pathfinding (MAPF) is a challenging computational problem that
typically requires finding collision-free paths for multiple agents in a shared
environment. Solving MAPF optimally is NP-hard, yet efficient solutions are
critical for numerous applications, including automated warehouses and
transportation systems. Recently, learning-based approaches to MAPF have gained
attention, particularly those leveraging deep reinforcement learning. Following
current trends in machine learning, we have created a foundation model for
MAPF problems, called MAPF-GPT. Using imitation learning, we have trained a
policy on a set of pre-collected sub-optimal expert trajectories. The policy
generates actions under partial observability without additional
heuristics, reward functions, or communication with other agents. The resulting
MAPF-GPT model demonstrates zero-shot learning abilities when solving MAPF
problem instances that were not present in the training dataset. We show that
MAPF-GPT notably outperforms the current best-performing learnable MAPF solvers
on a diverse range of problem instances and is computationally efficient at
inference time.
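
To make the notion of "collision-free paths" concrete, the following is a minimal sketch of a conflict checker under the usual MAPF conventions: two agents may not occupy the same cell at the same time step (a vertex conflict) and may not exchange cells in a single step (a swap conflict). The grid-cell representation, the wait-at-goal rule, and the names `Cell`, `position_at`, and `find_conflicts` are illustrative assumptions, not taken from MAPF-GPT itself.

```python
from typing import Dict, List, Tuple

# Assumption: agents move on a grid and a path is a list of (row, col) cells,
# one cell per time step. Every path is assumed to be non-empty.
Cell = Tuple[int, int]


def position_at(path: List[Cell], t: int) -> Cell:
    """Return the cell occupied at time step t; the agent waits at its last cell afterwards."""
    return path[min(t, len(path) - 1)]


def find_conflicts(paths: Dict[str, List[Cell]]) -> List[Tuple[str, str, int, str]]:
    """Return (agent_a, agent_b, time_step, kind) for each vertex or swap conflict."""
    agents = sorted(paths)
    horizon = max((len(p) for p in paths.values()), default=0)
    conflicts: List[Tuple[str, str, int, str]] = []
    for t in range(horizon):
        for i, a in enumerate(agents):
            for b in agents[i + 1:]:
                a_now, b_now = position_at(paths[a], t), position_at(paths[b], t)
                if a_now == b_now:
                    # Both agents occupy the same cell at time t.
                    conflicts.append((a, b, t, "vertex"))
                    continue
                if t + 1 < horizon:
                    a_next = position_at(paths[a], t + 1)
                    b_next = position_at(paths[b], t + 1)
                    if a_next == b_now and b_next == a_now:
                        # The agents exchange cells between t and t + 1.
                        conflicts.append((a, b, t, "swap"))
    return conflicts


if __name__ == "__main__":
    # Hypothetical two-agent instance: the agents try to swap adjacent cells.
    example = {"a1": [(0, 0), (0, 1)], "a2": [(0, 1), (0, 0)]}
    print(find_conflicts(example))
```

A set of paths counts as a valid MAPF solution in this sketch when `find_conflicts` returns an empty list and every path ends at its agent's goal; the goal check is omitted here for brevity.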