Heuristic-Based Multi-Agent Monte Carlo Tree Search

IISA 2014, The 5th International Conference on Information, Intelligence, Systems and Applications. Pub Date: 2014-07-07. DOI: 10.1109/IISA.2014.6878747

E. López, Ruohua Li, C. Patsakis, S. Clarke, V. Cahill


Citations: 13

Abstract

Monte Carlo Tree Search (MCTS) is a relatively new sampling-based, best-first method for searching for optimal decisions. Its popularity stems from its extraordinary results in the challenging two-player game Go, a game considered much harder than Chess and, until very recently, considered infeasible for Artificial Intelligence methods. Several MCTS variants have been proposed, mainly to enhance its capabilities. Perhaps one of the main limitations of the approach is its limited applicability in scenarios that require multiple (more than two) agents. Some works have attempted to overcome this limitation by using a vector of reward values for each agent and allowing the algorithm to find an optimal equilibrium strategy. Inspired by these approaches, in this work we explore a new proposal for handling multiple agents in MCTS that uses a vector of values describing what the agents need to do (defined tasks) instead of a vector of rewards for each agent. To achieve this, we use a rather simple but powerful heuristic that estimates the desired values of this vector, that is, a set of values that could lead to the optimal completion of the task. We tested this idea in a real-world scenario rather than in games, as is traditionally done. The results achieved by our proposed approach, named Heuristic-Based Multi-Agent Monte Carlo Tree Search, indicate the feasibility of using heuristics in the MCTS algorithm in situations where more than two agents are required.
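The reward-vector idea attributed above to earlier work can be illustrated with a minimal sketch. It is not the method proposed in this paper, whose task-value heuristic the abstract does not specify. The sketch assumes one cumulative reward per agent at each tree node and the standard UCB1 selection rule. The names `Node`, `ucb1`, `select_child` and `backpropagate` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Search-tree node holding one cumulative reward per agent (assumed layout)."""
    num_agents: int
    agent_to_move: int  # index of the agent that chooses an action at this node
    visits: int = 0
    total_reward: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def __post_init__(self):
        # Start with a zero reward for every agent unless a vector was supplied.
        if not self.total_reward:
            self.total_reward = [0.0] * self.num_agents


def ucb1(child, parent_visits, agent, c=math.sqrt(2)):
    """UCB1 score of `child`, using only the reward component of `agent`."""
    if child.visits == 0:
        return math.inf  # unvisited children are tried first
    mean = child.total_reward[agent] / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def select_child(node):
    """Pick the child that maximises UCB1 for the agent moving at `node`."""
    return max(
        node.children,
        key=lambda ch: ucb1(ch, node.visits, node.agent_to_move),
    )


def backpropagate(path, rewards):
    """Add the reward vector of one simulation to every node on `path`."""
    for node in path:
        node.visits += 1
        for i, r in enumerate(rewards):
            node.total_reward[i] += r
```

In this sketch each agent steers the search at its own nodes by reading only its own entry of the vector, which is what lets a single tree serve more than two agents.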



