Multiagent allocation of Markov decision process tasks

2013 American Control Conference Pub Date : 2013-06-17 DOI:10.1109/ACC.2013.6580186

Trevor Campbell, Luke B. Johnson, J. How

Citations: 13

Abstract

Producing task assignments for multiagent teams often leads to an exponential growth in the decision space as the number of agents and objectives increases. One approach to finding a task assignment is to model the agents and the environment as a single Markov decision process, and solve the planning problem using standard MDP techniques. However, both exact and approximate MDP solvers in this environment struggle to produce assignments even for problems involving few agents and objectives. Conversely, problem formulations based upon mathematical programming typically scale well with the problem size at the expense of requiring comparatively simple agent and task models. This paper combines these two formulations by modeling task and agent dynamics using MDPs, and then using optimization techniques to solve the combinatorial problem of assigning tasks to agents. The computational complexity of the resulting algorithm is polynomial in the number of tasks and is constant in the number of agents. Simulation results are provided which highlight the performance of the algorithm in a grid world mobile target surveillance scenario, while demonstrating that these techniques can be extended to even larger tasking domains.
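The two-stage idea in the abstract (score tasks with MDPs, then solve a separate assignment problem) can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the one-dimensional corridor world, the reward of 1 per step spent on the target cell, the hypothetical helper names (`corridor_mdp`, `value_iteration`, `greedy_assign`, `allocate`), and the greedy selection rule, which stands in for whatever optimization technique the authors actually use.

```python
import numpy as np


def corridor_mdp(n_cells, target):
    """Hypothetical task model: a 1-D corridor with a stationary target cell.

    Actions: 0 = left, 1 = stay, 2 = right (deterministic, clipped at the walls).
    Reward: 1 for every step that starts on the target cell.
    """
    moves = (-1, 0, 1)
    P = np.zeros((3, n_cells, n_cells))  # P[a, s, s'] transition probabilities
    for a, step in enumerate(moves):
        for s in range(n_cells):
            s_next = min(max(s + step, 0), n_cells - 1)
            P[a, s, s_next] = 1.0
    R = np.zeros((n_cells, 3))  # R[s, a] immediate rewards
    R[target, :] = 1.0
    return P, R


def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Standard value iteration; returns the optimal value of every state."""
    V = np.zeros(R.shape[0])
    while True:
        # Q[s, a] = R[s, a] + gamma * sum over s' of P[a, s, s'] * V[s']
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new


def greedy_assign(scores):
    """Assumed stand-in for the assignment step: repeatedly take the best
    remaining (agent, task) pair, at most one task per agent."""
    scores = scores.astype(float).copy()
    assignment = {}
    for _ in range(min(scores.shape)):
        i, j = np.unravel_index(np.argmax(scores), scores.shape)
        assignment[int(i)] = int(j)
        scores[i, :] = -np.inf  # agent i is now busy
        scores[:, j] = -np.inf  # task j is now taken
    return assignment


def allocate(n_cells, agent_starts, targets, gamma=0.9):
    """Solve one small MDP per task, then assign tasks using the values."""
    scores = np.zeros((len(agent_starts), len(targets)))
    for j, target in enumerate(targets):
        P, R = corridor_mdp(n_cells, target)
        V = value_iteration(P, R, gamma)
        for i, start in enumerate(agent_starts):
            scores[i, j] = V[start]  # value of agent i taking on task j
    return greedy_assign(scores), scores


if __name__ == "__main__":
    assignment, scores = allocate(6, agent_starts=[0, 5], targets=[4, 1])
    print(assignment)
    print(np.round(scores, 3))
```

In this sketch each MDP covers only one task, so the joint state space over all agents and tasks is never built; the combinatorial part is confined to the small score matrix passed to `greedy_assign`.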
