Evolutionary learning of policies for MCTS simulations

James Pettit, D. Helmbold
Venue: FDG: Proceedings of the International Conference on Foundations of Digital Games, volume 12 1, pages 212-219
Published: 2012-05-29
DOI: 10.1145/2282338.2282379
URL: https://doi.org/10.1145/2282338.2282379
Citations: 3

Abstract

Monte-Carlo Tree Search (MCTS) grows a partial game tree and uses a large number of random simulations to approximate the values of the nodes. It has proven effective in games such as Go and Hex, where the large search space and the difficulty of evaluating positions defeat standard methods. The best MCTS players use carefully hand-crafted rules to bias the random simulations. Obtaining good hand-crafted rules is a very difficult process, as even rules that promote better simulation play can result in a weaker MCTS system [12]. Our Hivemind system uses evolution strategies to automatically learn effective rules for biasing the random simulations. We have built an MCTS player using Hivemind for the game Hex. The Hivemind-learned rules result in a 90% win rate against a baseline MCTS system, and a significant improvement against the computer Hex world champion, MoHex.
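
The general idea of using an evolution strategy to tune a biased simulation policy can be sketched as follows. This is a toy illustration only, not the Hivemind system: the two-feature move encoding, the softmax policy, the `toy_fitness` stand-in for a win rate, and every name and parameter below are assumptions made for the example. A real setup would measure fitness by playing games with an MCTS player.

```python
import math
import random


def softmax_choice(weights, features_per_move, rng):
    """Pick a move index with probability proportional to exp(w . f)."""
    scores = [sum(w * f for w, f in zip(weights, feats))
              for feats in features_per_move]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    r = rng.random() * sum(exps)
    acc = 0.0
    for i, e in enumerate(exps):
        acc += e
        if r <= acc:
            return i
    return len(exps) - 1


def toy_fitness(weights, rng, trials=200):
    """Hypothetical stand-in for a win rate: how often the biased policy
    picks the move with the largest value of feature 0."""
    hits = 0
    for _ in range(trials):
        moves = [[rng.random(), rng.random()] for _ in range(5)]
        good = max(range(5), key=lambda i: moves[i][0])
        if softmax_choice(weights, moves, rng) == good:
            hits += 1
    return hits / trials


def evolve(generations=30, offspring=8, sigma=0.5, seed=0):
    """(1 + lambda) evolution strategy over the policy weights."""
    rng = random.Random(seed)
    parent = [0.0, 0.0]  # zero weights give a uniformly random policy
    parent_fit = toy_fitness(parent, rng)
    for _ in range(generations):
        # Mutate the parent with Gaussian noise to create the offspring.
        children = [[w + rng.gauss(0.0, sigma) for w in parent]
                    for _ in range(offspring)]
        scored = [(toy_fitness(c, rng), c) for c in children]
        best_fit, best_child = max(scored, key=lambda fc: fc[0])
        # Keep the best child only if it is at least as fit as the parent.
        if best_fit >= parent_fit:
            parent, parent_fit = best_child, best_fit
    return parent, parent_fit


if __name__ == "__main__":
    weights, fitness = evolve()
    print("learned weights:", weights)
    print("estimated fitness:", fitness)
```

Because the fitness estimate is noisy, the reported value is optimistic. The abstract's caveat also applies: a policy that scores better in isolation does not necessarily make the full MCTS player stronger, so fitness should ultimately be measured on the complete system.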