Decision-Theoretic Clustering of Strategies

AAAI Workshop: Computer Poker and Imperfect Information. Pub Date: 2015-05-04. DOI: 10.5555/2772879.2772886

Nolan Bard, D. Nicholas, Csaba Szepesvari, Michael Bowling


Citations: 10

Abstract

Clustering agents by their behaviour can be crucial for building effective agent models. Traditional clustering typically aims to group entities together based on a distance metric, where a desirable clustering is one where the entities in a cluster are spatially close together. Instead, one may desire to cluster based on actionability, or the capacity for the clusters to suggest how an agent should respond to maximize their utility with respect to the entities. Segmentation problems examine this decision-theoretic clustering task. Although finding optimal solutions to these problems is computationally hard, greedy-based approximation algorithms exist. However, in settings where the agent has a combinatorially large number of candidate responses whose utilities must be considered, these algorithms are often intractable. In this work, we show that in many cases the utility function can be factored to allow for an efficient greedy algorithm even when there are exponentially large response spaces. We evaluate our technique theoretically, proving approximation bounds, and empirically using extensive-form games by clustering opponent strategies in toy poker games. Our results demonstrate that these techniques yield dramatically improved clusterings compared to a traditional distance-based clustering approach in terms of both subjective quality and utility obtained by responding to the clusters.
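
The greedy idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a small, explicitly enumerated utility table, which is exactly the setting the paper says becomes intractable when the response space is combinatorially large; it is a generic greedy selection for a segmentation-style objective, not the authors' factored algorithm. The function name `greedy_segmentation`, the table `utility[r][e]` (payoff of response `r` against entity `e`), and the example numbers are all hypothetical.

```python
from typing import List, Sequence, Tuple


def greedy_segmentation(
    utility: Sequence[Sequence[float]], k: int
) -> Tuple[List[int], List[int], float]:
    """Greedily pick up to k responses, then cluster entities by best response.

    utility[r][e] is the (hypothetical) payoff of response r against entity e.
    Returns (chosen responses, label of each entity, total utility).
    """
    n_responses = len(utility)
    n_entities = len(utility[0])
    chosen: List[int] = []
    # best[e]: highest payoff against entity e among responses chosen so far.
    best = [float("-inf")] * n_entities
    value = 0.0

    for _ in range(min(k, n_responses)):
        best_total, best_r = float("-inf"), -1
        for r in range(n_responses):
            if r in chosen:
                continue
            # Objective value if r were added: each entity keeps its best payoff.
            total = sum(max(best[e], utility[r][e]) for e in range(n_entities))
            if total > best_total:  # strict '>' keeps the first response on ties
                best_total, best_r = total, r
        chosen.append(best_r)
        best = [max(best[e], utility[best_r][e]) for e in range(n_entities)]
        value = best_total

    # Each entity joins the cluster of the chosen response that serves it best.
    labels = [max(chosen, key=lambda r: utility[r][e]) for e in range(n_entities)]
    return chosen, labels, value


# Hypothetical toy table: 3 candidate responses, 4 entities.
U = [
    [1.0, 1.0, 0.0, 0.0],      # response 0 exploits entities 0 and 1
    [0.0, 0.0, 1.0, 1.0],      # response 1 exploits entities 2 and 3
    [0.75, 0.75, 0.75, 0.75],  # response 2 is a safe all-rounder
]
chosen, labels, value = greedy_segmentation(U, k=2)
print(chosen, labels, value)
```

On this toy table the greedy choice first commits to the all-rounder (response 2) and then adds response 0, for a total utility of 3.5, whereas the pair of responses 0 and 1 would reach 4.0. This is why such methods are stated as approximation algorithms rather than exact ones. Clusters here are defined by which response serves an entity best, not by any distance between entities.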

