Triple Line-Based Playout for Go - An Accelerator for Monte Carlo Go

Kenichi Koizumi, M. Inaba, K. Hiraki, Y. Ishii, T. Miyoshi, Kazuki Yoshizoe
Published in: 2009 International Conference on Reconfigurable Computing and FPGAs, 2009-12-09. DOI: 10.1109/ReConFig.2009.75 (https://doi.org/10.1109/ReConFig.2009.75)
Citations: 2

Abstract

After the computer Deep Blue defeated world chess champion Garry Kasparov in 1997, researchers studying computer board games turned their attention to Go. Go is known to be harder for computers to play than chess or shogi because (1) its search space is much larger, (2) it is difficult to define an appropriate evaluation function for a position, and (3) a single move can sometimes change a position globally. Recently, a new method called Monte Carlo Go has been developed, which evaluates a position by performing Monte Carlo simulations. Monte Carlo Go has increased the strength of computer Go programs, and that strength depends on the number of simulations performed. Several attempts have been made to accelerate the simulations, for example with cluster systems and FPGAs. Cluster systems yield good results but are very expensive. FPGA acceleration, on the other hand, has been difficult because FPGA resource usage tends to be high: previously it was feasible for smaller boards, such as a 9 × 9 grid, but not for the standard 19 × 19 board. In this paper, we propose triple line-based playout for Go (TLPG), a hardware algorithm for generating simulations on an FPGA. By replicating global information redundantly, TLPG generates simulations using only local operations; this allows a compact implementation of the hardware logic, so TLPG can handle both 9 × 9 and 19 × 19 boards. We implement TLPG on a Xilinx Virtex-5 (XC5VFX70T-1FF1136) and evaluate it. TLPG performs 40,649 playouts per second on a 9 × 9 board and 4,668 playouts per second on a 19 × 19 board.
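To make the idea of evaluating a position by Monte Carlo simulation concrete, here is a minimal software sketch in Python. It is not the TLPG hardware algorithm, whose details the abstract does not give. It only illustrates what a playout is: random legal moves are played until both sides pass, and the position's value is estimated as the fraction of playouts Black wins. All names (`try_move`, `playout`, `evaluate`), the komi of 6.5, the simplified own-eye rule, and the omission of the ko rule are assumptions made for illustration.

```python
import random

EMPTY, BLACK, WHITE = 0, 1, 2

def neighbors(p, n):
    """Yield the orthogonal neighbors of point p on an n x n board (row-major index)."""
    r, c = divmod(p, n)
    if r > 0:
        yield p - n
    if r < n - 1:
        yield p + n
    if c > 0:
        yield p - 1
    if c < n - 1:
        yield p + 1

def group_and_liberties(board, p, n):
    """Flood-fill the group containing p; return (stones, liberties)."""
    color = board[p]
    group, libs, stack = {p}, set(), [p]
    while stack:
        q = stack.pop()
        for nb in neighbors(q, n):
            if board[nb] == EMPTY:
                libs.add(nb)
            elif board[nb] == color and nb not in group:
                group.add(nb)
                stack.append(nb)
    return group, libs

def try_move(board, p, color, n):
    """Play color at empty point p and remove captured stones.
    Returns False and leaves the board unchanged if the move is suicide.
    Assumption: the ko rule is ignored in this sketch."""
    board[p] = color
    captured = False
    for nb in neighbors(p, n):
        if board[nb] == 3 - color:
            group, libs = group_and_liberties(board, nb, n)
            if not libs:
                captured = True
                for q in group:
                    board[q] = EMPTY
    if not captured and not group_and_liberties(board, p, n)[1]:
        board[p] = EMPTY  # undo the suicide move
        return False
    return True

def is_own_eye(board, p, color, n):
    """Simplified eye test (assumption): every neighbor of p has the given color."""
    return all(board[nb] == color for nb in neighbors(p, n))

def playout(board, to_move, n, rng, komi=6.5):
    """Play uniformly random moves until two consecutive passes; True if Black wins."""
    board = list(board)  # work on a copy
    passes, color = 0, to_move
    for _ in range(3 * n * n):  # move cap, since ko cycles are not detected
        if passes == 2:
            break
        empties = [p for p in range(n * n) if board[p] == EMPTY]
        rng.shuffle(empties)
        for p in empties:
            if not is_own_eye(board, p, color, n) and try_move(board, p, color, n):
                passes = 0
                break
        else:
            passes += 1  # no playable point: pass
        color = 3 - color
    # Area-style score: stones plus single-point eyes.
    score = {BLACK: 0, WHITE: 0}
    for p in range(n * n):
        for c in (BLACK, WHITE):
            if board[p] == c or (board[p] == EMPTY and is_own_eye(board, p, c, n)):
                score[c] += 1
    return score[BLACK] > score[WHITE] + komi

def evaluate(board, to_move, n, playouts, seed=0):
    """Estimate Black's winning probability as the fraction of playouts Black wins."""
    rng = random.Random(seed)
    wins = sum(playout(board, to_move, n, rng) for _ in range(playouts))
    return wins / playouts
```

For example, `evaluate([EMPTY] * 81, BLACK, 9, 1000)` estimates Black's chances on an empty 9 × 9 board from 1,000 playouts. Each playout repeatedly flood-fills groups to count liberties, which is the kind of board-wide work that the abstract says TLPG replaces with local operations on redundantly replicated information.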