AAAI Workshop: Computer Poker and Imperfect Information: Latest Papers

Decision-Theoretic Clustering of Strategies

AAAI Workshop: Computer Poker and Imperfect Information Pub Date : 2015-05-04 DOI: 10.5555/2772879.2772886

Nolan Bard, D. Nicholas, Csaba Szepesvari, Michael Bowling

Abstract: Clustering agents by their behaviour can be crucial for building effective agent models. Traditional clustering typically aims to group entities together based on a distance metric, where a desirable clustering is one where the entities in a cluster are spatially close together. Instead, one may desire to cluster based on actionability, or the capacity for the clusters to suggest how an agent should respond to maximize their utility with respect to the entities. Segmentation problems examine this decision-theoretic clustering task. Although finding optimal solutions to these problems is computationally hard, greedy-based approximation algorithms exist. However, in settings where the agent has a combinatorially large number of candidate responses whose utilities must be considered, these algorithms are often intractable. In this work, we show that in many cases the utility function can be factored to allow for an efficient greedy algorithm even when there are exponentially large response spaces. We evaluate our technique theoretically, proving approximation bounds, and empirically using extensive-form games by clustering opponent strategies in toy poker games. Our results demonstrate that these techniques yield dramatically improved clusterings compared to a traditional distance-based clustering approach in terms of both subjective quality and utility obtained by responding to the clusters.
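The greedy idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' factored algorithm; it is the plain greedy baseline for a segmentation problem, assuming the response set is small enough to enumerate. The names `greedy_responses`, `clustering_value`, and `assign_clusters`, and the utility table in the usage note, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence

Utility = Sequence[Sequence[float]]  # utility[r][e]: response r against entity e


def clustering_value(utility: Utility, chosen: List[int]) -> float:
    """Total utility when each entity is answered by its best chosen response."""
    if not chosen:
        return 0.0
    n_entities = len(utility[0])
    return sum(max(utility[r][e] for r in chosen) for e in range(n_entities))


def greedy_responses(utility: Utility, k: int) -> List[int]:
    """Pick up to k responses, each time adding the one that raises the value most."""
    chosen: List[int] = []
    for _ in range(min(k, len(utility))):
        candidates = [r for r in range(len(utility)) if r not in chosen]
        best = max(candidates, key=lambda r: clustering_value(utility, chosen + [r]))
        chosen.append(best)
    return chosen


def assign_clusters(utility: Utility, chosen: List[int]) -> Dict[int, List[int]]:
    """Group entities by the chosen response that serves them best."""
    clusters: Dict[int, List[int]] = {r: [] for r in chosen}
    for e in range(len(utility[0])):
        best = max(chosen, key=lambda r: utility[r][e])
        clusters[best].append(e)
    return clusters
```

With the made-up table `[[1.0, 0.9, 0.0, 0.0], [0.0, 0.0, 1.0, 0.8], [0.6, 0.6, 0.6, 0.6]]` and `k = 2`, the greedy pass first takes the "safe" response 2 and then response 0, for a total value of about 3.1, while the pair of responses 0 and 1 would reach 3.7. This shows why greedy selection is an approximation and not an exact method.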

Citations: 10

Hierarchical Abstraction, Distributed Equilibrium Computation, and Post-Processing, with Application to a Champion No-Limit Texas Hold'em Agent

AAAI Workshop: Computer Poker and Imperfect Information Pub Date : 2015-05-04 DOI: 10.5555/2772879.2772885

Noam Brown, Sam Ganzfried, T. Sandholm

Abstract: The leading approach for solving large imperfect-information games is automated abstraction followed by running an equilibrium-finding algorithm. We introduce a distributed version of the most commonly used equilibrium-finding algorithm, counterfactual regret minimization (CFR), which enables CFR to scale to dramatically larger abstractions and numbers of cores. The new algorithm begets constraints on the abstraction so as to make the pieces running on different computers disjoint. We introduce an algorithm for generating such abstractions while capitalizing on state-of-the-art abstraction ideas such as imperfect recall and earth-mover's distance. Our techniques enabled an equilibrium computation of unprecedented size on a supercomputer with a high inter-blade memory latency. Prior approaches run slowly on this architecture. Our approach also leads to a significant improvement over using the prior best approach on a large shared-memory server with low memory latency. Finally, we introduce a family of post-processing techniques that outperform prior ones. We applied these techniques to generate an agent for two-player no-limit Texas Hold'em, called Tartanian7, that won the 2014 Annual Computer Poker Competition, beating each opponent with statistical significance.
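For readers unfamiliar with CFR, the sketch below shows regret matching, the update rule CFR applies at each information set, on a single rock-paper-scissors decision against a fixed opponent. It does not attempt the distributed algorithm, the abstraction scheme, or the post-processing described in the abstract. The function names and the opponent mix in the usage note are assumptions made for illustration.

```python
from typing import List

# Payoff to the learner in rock-paper-scissors: PAYOFF[mine][theirs].
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def strategy_from_regrets(cum_regret: List[float]) -> List[float]:
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to the uniform strategy.
    return [1.0 / len(cum_regret)] * len(cum_regret)


def regret_matching(opponent: List[float], iterations: int) -> List[float]:
    """Run regret matching against a fixed opponent mix; return the final strategy."""
    cum_regret = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        strategy = strategy_from_regrets(cum_regret)
        # Expected payoff of each pure action against the opponent mix.
        action_values = [
            sum(PAYOFF[a][o] * opponent[o] for o in range(3)) for a in range(3)
        ]
        expected = sum(s * v for s, v in zip(strategy, action_values))
        # Regret: how much better each action would have done than the mix played.
        for a in range(3):
            cum_regret[a] += action_values[a] - expected
    return strategy_from_regrets(cum_regret)
```

Against an opponent who plays rock half the time, for example `regret_matching([0.5, 0.3, 0.2], 100)`, only paper accumulates positive regret, so the learner settles on always playing paper, which is the best response to that mix.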

Citations: 65

Solving Games with Functional Regret Estimation

AAAI Workshop: Computer Poker and Imperfect Information Pub Date : 2014-11-28 DOI: 10.1609/aaai.v29i1.9445

K. Waugh, Dustin Morrill, J. Bagnell, Michael Bowling

Citations: 56