{"title":"基于时间差异的神经围棋棋手学习的棋盘表示","authors":"H. A. Mayer","doi":"10.1109/CIG.2007.368096","DOIUrl":null,"url":null,"abstract":"The majority of work on artificial neural networks (ANNs) playing the game of Go focus on network architectures and training regimes to improve the quality of the neural player. A less investigated problem is the board representation conveying the information on the current state of the game to the network. Common approaches suggest a straight-forward encoding by assigning each point on the board to a single (or more) input neurons. However, these basic representations do not capture elementary structural relationships between stones (and points) being essential to the game. We compare three different board representations for self-learning ANNs on a 5 times 5 board employing temporal difference learning (TDL) with two types of move selection (during training). The strength of the trained networks is evaluated in games against three computer players of different quality. A tournament of the best neural players, addition of alpha-beta search, and a commented game of a neural player against the best computer player further explore the potential of the neural players and its respective board representations","PeriodicalId":365269,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Games","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":"{\"title\":\"Board Representations for Neural Go Players Learning by Temporal Difference\",\"authors\":\"H. A. Mayer\",\"doi\":\"10.1109/CIG.2007.368096\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The majority of work on artificial neural networks (ANNs) playing the game of Go focus on network architectures and training regimes to improve the quality of the neural player. A less investigated problem is the board representation conveying the information on the current state of the game to the network. Common approaches suggest a straight-forward encoding by assigning each point on the board to a single (or more) input neurons. However, these basic representations do not capture elementary structural relationships between stones (and points) being essential to the game. We compare three different board representations for self-learning ANNs on a 5 times 5 board employing temporal difference learning (TDL) with two types of move selection (during training). The strength of the trained networks is evaluated in games against three computer players of different quality. 
A tournament of the best neural players, addition of alpha-beta search, and a commented game of a neural player against the best computer player further explore the potential of the neural players and its respective board representations\",\"PeriodicalId\":365269,\"journal\":{\"name\":\"2007 IEEE Symposium on Computational Intelligence and Games\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-04-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2007 IEEE Symposium on Computational Intelligence and Games\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CIG.2007.368096\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 IEEE Symposium on Computational Intelligence and Games","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CIG.2007.368096","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Board Representations for Neural Go Players Learning by Temporal Difference
The majority of work on artificial neural networks (ANNs) playing the game of Go focuses on network architectures and training regimes to improve the quality of the neural player. A less investigated problem is the board representation that conveys information on the current state of the game to the network. Common approaches suggest a straightforward encoding that assigns each point on the board to one (or more) input neurons. However, these basic representations do not capture elementary structural relationships between stones (and points) that are essential to the game. We compare three different board representations for self-learning ANNs on a 5x5 board employing temporal difference learning (TDL) with two types of move selection (during training). The strength of the trained networks is evaluated in games against three computer players of different quality. A tournament of the best neural players, the addition of alpha-beta search, and a commented game of a neural player against the best computer player further explore the potential of the neural players and their respective board representations.
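To make the setup concrete, the following is a minimal Python sketch of the kind of "straightforward" per-point board encoding the abstract refers to, combined with a TD(0) value update. It is not the paper's actual representations or network architectures; the three-plane encoding, the tiny value network, and all hyperparameters are illustrative assumptions.

# Minimal sketch (assumptions, not the paper's method): each of the 25 points
# on a 5x5 board is mapped to input features, and a small value network is
# updated toward a bootstrapped target as in TD(0).
import numpy as np

EMPTY, BLACK, WHITE = 0, 1, -1
BOARD_SIZE = 5

def encode_board(board):
    """Per-point encoding: three binary planes (black, white, empty),
    flattened to a 3 * 25 = 75-element input vector."""
    board = np.asarray(board)
    planes = np.stack([(board == BLACK), (board == WHITE), (board == EMPTY)])
    return planes.astype(np.float32).reshape(-1)

class TinyValueNet:
    """One-hidden-layer value network with a tanh output in [-1, 1]."""
    def __init__(self, n_in=3 * BOARD_SIZE**2, n_hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.1, n_hidden)
        self.b2 = 0.0

    def value(self, x):
        # Cache activations for the subsequent TD update.
        self._x = x
        self._h = np.tanh(self.W1 @ x + self.b1)
        self._v = np.tanh(self.w2 @ self._h + self.b2)
        return self._v

    def td_update(self, target, lr=0.01):
        """One TD(0) step: move the last evaluated value toward the target."""
        delta = target - self._v               # TD error
        dv = (1.0 - self._v**2) * delta        # backprop through output tanh
        dh = (1.0 - self._h**2) * (self.w2 * dv)
        self.w2 += lr * dv * self._h
        self.b2 += lr * dv
        self.W1 += lr * np.outer(dh, self._x)
        self.b1 += lr * dh

# Usage: after a self-play move from state s to s_next, bootstrap the target
# from the successor's value (or from the game result at a terminal position).
net = TinyValueNet()
s = np.zeros((BOARD_SIZE, BOARD_SIZE), dtype=int)
s_next = s.copy()
s_next[2, 2] = BLACK
v_next = net.value(encode_board(s_next))   # value of the successor state
net.value(encode_board(s))                 # re-evaluate s to cache its activations
net.td_update(target=v_next)               # discount factor of 1 assumed for brevity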