Only-One-Victor Pattern Learning in Computer Go

Q2 Computer Science

IEEE Transactions on Computational Intelligence and AI in Games. Pub Date: 2017-03-01. DOI: 10.1109/TCIAIG.2015.2504108

Jiao Wang, Chenjun Xiao, Tan Zhu, Chu-Hsuan Hsueh, Wen-Jie Tseng, I-Chen Wu

Volume 9, No. 1, pp. 88-102

Citations: 1

Abstract

Automatically acquiring domain knowledge from professional game records, a form of pattern learning, is an attractive and challenging problem in computer Go. This paper proposes a supervised learning method that introduces a new generalized Bradley-Terry model, named Only-One-Victor, to learn patterns from game records. Our algorithm follows the same idea as the Elo rating algorithm: each move in a game record is treated as a group of move patterns, and the selected move is the winner of a competition among all groups on the current board. However, unlike the generalized Bradley-Terry model for group competition used in the Elo rating algorithm, the Only-One-Victor model simulates the selection of one move from a set of possible candidates by treating that selection as a group of independent pairwise comparisons. We use a graph-theoretic model to prove the correctness of the Only-One-Victor model, and we apply Minorization-Maximization (MM) to solve the optimization task. As a result, our algorithm retains many computational advantages of the Elo rating algorithm, such as scalability to high-dimensional feature spaces. With a training set of 115,832 moves and the same feature setting, our experiments show that Only-One-Victor outperforms Elo rating, a well-known and leading supervised pattern learning method.
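To make the contrast in the abstract concrete, here is a minimal Python sketch. The first function computes move-selection probabilities under the generalized Bradley-Terry model for group competition, where a move's strength is the product of its pattern strengths. The second function is only one possible reading of "a group of independent pairwise comparisons"; the abstract does not give the paper's actual Only-One-Victor likelihood, so it should not be taken as the authors' formula. The pattern names and strength values are hypothetical.

```python
from math import prod


def team_strength(team, gamma):
    """Strength of a candidate move: product of its pattern strengths."""
    return prod(gamma[p] for p in team)


def bt_selection_probs(candidates, gamma):
    """Generalized Bradley-Terry for group competition:
    P(move i is chosen) = strength_i / sum of all candidate strengths."""
    strengths = [team_strength(t, gamma) for t in candidates]
    total = sum(strengths)
    return [s / total for s in strengths]


def pairwise_selection_probs(candidates, gamma):
    """ASSUMPTION, for illustration only (not the paper's formula):
    score each move by the product of its independent pairwise
    Bradley-Terry win probabilities against every other candidate,
    then normalize the scores so they sum to 1."""
    strengths = [team_strength(t, gamma) for t in candidates]
    scores = [
        prod(s_i / (s_i + s_k) for k, s_k in enumerate(strengths) if k != i)
        for i, s_i in enumerate(strengths)
    ]
    total = sum(scores)
    return [x / total for x in scores]


# Hypothetical pattern strengths and three candidate moves on one board.
gamma = {"atari": 4.0, "edge": 0.5, "hane": 2.0, "far": 1.0}
candidates = [["atari", "edge"], ["hane"], ["far"]]

bt_probs = bt_selection_probs(candidates, gamma)
pw_probs = pairwise_selection_probs(candidates, gamma)
print(bt_probs)
print(pw_probs)
```

With these made-up values the candidate strengths are 2, 2 and 1, so the group-competition model gives probabilities of 0.4, 0.4 and 0.2, while the pairwise sketch gives 3/7, 3/7 and 1/7. The weakest move is penalized more heavily under the pairwise reading because it has to lose two separate comparisons.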


Source journal

IEEE Transactions on Computational Intelligence and AI in Games (Computer Science, Artificial Intelligence; Computer Science, Software Engineering)

CiteScore

4.60

Self-citation rate

0.00%

Articles published

Review time

>12 weeks

Journal description: Cessation. The IEEE Transactions on Computational Intelligence and AI in Games (T-CIAIG) publishes archival-quality original papers in computational intelligence and related areas of artificial intelligence applied to games, including but not limited to video games, mathematical games, human–computer interactions in games, and games involving physical objects. Emphasis is placed on using these methods to improve performance in games and understanding of their dynamics, as well as on gaining insight into the properties of the methods as applied to games. The scope also includes using games as a platform for building intelligent embedded agents for the real world. Papers connecting games to all areas of computational intelligence and traditional AI are considered.