{"title":"Hybrid of Evolution and Reinforcement Learning for Othello Players","authors":"Kyung-Joong Kim, He-Seong Choi, Sung-Bae Cho","doi":"10.1109/CIG.2007.368099","DOIUrl":"https://doi.org/10.1109/CIG.2007.368099","url":null,"abstract":"Although the reinforcement learning and evolutionary algorithm show good results in board evaluation optimization, the hybrid of both approaches is rarely addressed in the literature. In this paper, the evolutionary algorithm is boosted using resources from the reinforcement learning. 1) The initialization of initial population using solution optimized by temporal difference learning 2) Exploitation of domain knowledge extracted from reinforcement learning. Experiments on Othello game strategies show that the proposed methods can effectively search the solution space and improve the performance","PeriodicalId":365269,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Games","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130078672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Board Representations for Neural Go Players Learning by Temporal Difference","authors":"H. A. Mayer","doi":"10.1109/CIG.2007.368096","DOIUrl":"https://doi.org/10.1109/CIG.2007.368096","url":null,"abstract":"The majority of work on artificial neural networks (ANNs) playing the game of Go focus on network architectures and training regimes to improve the quality of the neural player. A less investigated problem is the board representation conveying the information on the current state of the game to the network. Common approaches suggest a straight-forward encoding by assigning each point on the board to a single (or more) input neurons. However, these basic representations do not capture elementary structural relationships between stones (and points) being essential to the game. We compare three different board representations for self-learning ANNs on a 5 times 5 board employing temporal difference learning (TDL) with two types of move selection (during training). The strength of the trained networks is evaluated in games against three computer players of different quality. A tournament of the best neural players, addition of alpha-beta search, and a commented game of a neural player against the best computer player further explore the potential of the neural players and its respective board representations","PeriodicalId":365269,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Games","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122655043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Temporal Difference Learning of an Othello Evaluation Function for a Small Neural Network with Shared Weights","authors":"E. Manning","doi":"10.1109/CIG.2007.368101","DOIUrl":"https://doi.org/10.1109/CIG.2007.368101","url":null,"abstract":"This paper presents an artificial neural network with shared weights, trained to play the game of Othello by self-play with temporal difference learning (TDL). The network performs as well as the champion of the CEC 2006 Othello Evaluation Function Competition. The TDL-trained network contains only 67 unique weights compared to 2113 for the champion","PeriodicalId":365269,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Games","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121290473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Stochastic AI Techniques to Achieve Unbounded Resolution in Finite Player Goore Games and its Applications","authors":"B. Oommen, Ole-Christoffer Granmo, A. Pedersen","doi":"10.1109/CIG.2007.368093","DOIUrl":"https://doi.org/10.1109/CIG.2007.368093","url":null,"abstract":"The Goore Game (GG) introduced by M. L. Tsetlin in 1973 has the fascinating property that it can be resolved in a completely distributed manner with no intercommunication between the players. The game has recently found applications in many domains, including the field of sensor networks and quality-of-service (QoS) routing. In actual implementations of the solution, the players are typically replaced by learning automata (LA). The problem with the existing reported approaches is that the accuracy of the solution achieved is intricately related to the number of players participating in the game -which, in turn, determines the resolution. In other words, an arbitrary accuracy can be obtained only if the game has an infinite number of players. In this paper, we show how we can attain an unbounded accuracy for the GG by utilizing no more than three stochastic learning machines, and by recursively pruning the solution space to guarantee that the retained domain contains the solution to the game with a probability as close to unity as desired. The paper also conjectures on how the solution can be applied to some of the application domains","PeriodicalId":365269,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Games","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116350491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Concept Accessibility as Basis for Evolutionary Reinforcement Learning of Dots and Boxes","authors":"Anthony Knittel, T. Bossomaier, A. Snyder","doi":"10.1109/CIG.2007.368090","DOIUrl":"https://doi.org/10.1109/CIG.2007.368090","url":null,"abstract":"The challenge of creating teams of agents, which evolve or learn, to solve complex problems is addressed in the combinatorially complex game of dots and boxes (strings and coins). Previous evolutionary reinforcement learning (ERL) systems approaching this task based on dynamic agent populations have shown some degree of success in game play, however are sensitive to conditions and suffer from unstable agent populations under difficult play and poor development against an easier opponent. A novel technique for preserving stability and allowing balance of specialised and generalised rules in an ERL system is presented, motivated by accessibility of concepts in human cognition, as opposed to natural selection through population survivability common to ERL systems. Reinforcement learning in dynamic teams of mutable agents enables play comparable to hand-crafted artificial players. Performance and stability of development is enhanced when a measure of the frequency of reinforcement is separated from the quality measure of rules","PeriodicalId":365269,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Games","volume":"173 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133444072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reward Allotment Considered Roles for Learning Classifier System For Soccer Video Games","authors":"Yosuke Akatsuka, Yuji Sato","doi":"10.1109/CIG.2007.368111","DOIUrl":"https://doi.org/10.1109/CIG.2007.368111","url":null,"abstract":"In recent years, the video-game environment has begun to change due to the explosive growth of the Internet. As a result, it makes the time for maintenance longer and the development cost increased. In addition, the life cycle of the game program shortens. To solve the above-mentioned problem, we have already proposed the event-driven hybrid learning classifier system and showed that the system is effective to improving the game winning rate and making the learning time shorten. This paper describes the investigation result of the effect in case we apply the reward allotment considered each role for classifier learning system. Concretely, we investigate the influence to each player's actions by changing the algorithm of the opponent and to team strategy by changing reward setting, and analyze them. As a result, we show that the influence of learning effects to each player's actions does not depend on the algorithm of opponent. And we also show that the reward allotment considered each role has possible to evolve the game strategy to improving the game winning rate","PeriodicalId":365269,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Games","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132050473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evolutionary Computations for Designing Game Rules of the COMMONS GAME","authors":"H. Handa, N. Baba","doi":"10.1109/CIG.2007.368117","DOIUrl":"https://doi.org/10.1109/CIG.2007.368117","url":null,"abstract":"In this paper, we focus on game rule design by using two evolutionary computations. The first EC is a multi-objective evolutionary algorithm in order to generate various skilled players. By using acquired skilled players, i.e., Pareto individuals in MOEA, another EC (evolutionary programming) adjusts game rule parameters i.e, an appropriate point of each card in the COMMONS GAME","PeriodicalId":365269,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Games","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130310639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Game AI in Delta3D","authors":"C. Darken, Bradley G. Anderegg, Perry McDowell","doi":"10.1109/CIG.2007.368114","DOIUrl":"https://doi.org/10.1109/CIG.2007.368114","url":null,"abstract":"Delta3D is a GNU-licensed open source game engine with an orientation towards supporting \"serious games\" such as those with defense and homeland security applications. AI is an important issue for serious games, since there is more pressure to \"get the AI right\", as opposed to providing an entertaining user experience. We describe several of our near-and longer-term AI projects oriented towards making it easier to build AI-enhanced applications in Delta3D.","PeriodicalId":365269,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Games","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122394037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evolving Players for an Ancient Game: Hnefatafl","authors":"P. Hingston","doi":"10.1109/CIG.2007.368094","DOIUrl":"https://doi.org/10.1109/CIG.2007.368094","url":null,"abstract":"Hnefatafl is an ancient Norse game - an ancestor of chess. In this paper, we report on the development of computer players for this game. In the spirit of Blondie24, we evolve neural networks as board evaluation functions for different versions of the game. An unusual aspect of this game is that there is no general agreement on the rules: it is no longer much played, and game historians attempt to infer the rules from scraps of historical texts, with ambiguities often resolved on gut feeling as to what the rules must have been in order to achieve a balanced game. We offer the evolutionary method as a means by which to judge the merits of alternative rule sets","PeriodicalId":365269,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Games","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128611792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evolving Pac-Man Players: Can We Learn from Raw Input?","authors":"M. Gallagher, M. Ledwich","doi":"10.1109/CIG.2007.368110","DOIUrl":"https://doi.org/10.1109/CIG.2007.368110","url":null,"abstract":"Pac-Man (and variant) computer games have received some recent attention in artificial intelligence research. One reason is that the game provides a platform that is both simple enough to conduct experimental research and complex enough to require non-trivial strategies for successful game-play. This paper describes an approach to developing Pac-Man playing agents that learn game-play based on minimal onscreen information. The agents are based on evolving neural network controllers using a simple evolutionary algorithm. The results show that neuroevolution is able to produce agents that display novice playing ability, with a minimal amount of onscreen information, no knowledge of the rules of the game and a minimally informative fitness function. The limitations of the approach are also discussed, together with possible directions for extending the work towards producing better Pac-Man playing agents","PeriodicalId":365269,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Games","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114681532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}