Learning Strategies for Imperfect Information Board Games Using Depth-Limited Counterfactual Regret Minimization and Belief State
Authors: Chen Chen, Tomoyuki Kaneko
Venue: 2022 IEEE Conference on Games (CoG)
Publication date: 2022-08-21
DOI: https://doi.org/10.1109/CoG51982.2022.9893713
Counterfactual Regret Minimization (CFR) variants have mastered many poker games by effectively handling a large number of possibilities in private information within relatively short playing histories. However, for imperfect information board games with infrequent chance events but long histories or even loops, the effectiveness of CFR is often limited in practice because the computational complexity grows exponentially with the game length. In this paper, we propose Belief States with Approximation by Dirichlet Distributions and Depth-limited External Sampling for Board Games, which together enable an effective abstraction even in the presence of loops. Experiments show that the proposed methods can learn reasonable strategies.
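The abstract names two building blocks without giving details: regret minimization and a belief state approximated by a Dirichlet distribution. The following is a minimal generic sketch of each, not the authors' implementation. The function names (`regret_matching`, `dirichlet_mean`, `dirichlet_update`) and the soft-evidence update rule are assumptions introduced purely for illustration.

```python
from typing import List


def regret_matching(regrets: List[float]) -> List[float]:
    """Turn cumulative regrets into a strategy, as in standard CFR.

    Each action's probability is proportional to its positive regret;
    if no action has positive regret, fall back to the uniform strategy.
    """
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)


def dirichlet_mean(alpha: List[float]) -> List[float]:
    """Expected categorical distribution under Dirichlet(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]


def dirichlet_update(alpha: List[float], evidence: List[float]) -> List[float]:
    """Add evidence weights to the concentration parameters.

    With integer counts this is the usual conjugate update; using
    fractional weights is a heuristic assumed here, not taken from the paper.
    """
    return [a + e for a, e in zip(alpha, evidence)]


# Hypothetical example: three actions with cumulative regrets.
strategy = regret_matching([2.0, -1.0, 2.0])

# Hypothetical example: belief over two hidden piece types,
# starting from a uniform prior and observing evidence for type 1.
belief = dirichlet_mean(dirichlet_update([1.0, 1.0], [0.0, 2.0]))
```

Here `strategy` puts equal weight on the two actions with positive regret and none on the third, while `belief` shifts toward the hidden type that the evidence supports. A depth-limited sampler would call a routine like `regret_matching` at each visited information set and stop expanding below a fixed depth, substituting an estimated value there.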