Decision-Theoretic Clustering of Strategies

Nolan Bard, D. Nicholas, Csaba Szepesvari, Michael Bowling
{"title":"Decision-Theoretic Clustering of Strategies","authors":"Nolan Bard, D. Nicholas, Csaba Szepesvari, Michael Bowling","doi":"10.5555/2772879.2772886","DOIUrl":null,"url":null,"abstract":"Clustering agents by their behaviour can be crucial for building effective agent models. Traditional clustering typically aims to group entities together based on a distance metric, where a desirable clustering is one where the entities in a cluster are spatially close together. Instead, one may desire to cluster based on actionability, or the capacity for the clusters to suggest how an agent should respond to maximize their utility with respect to the entities. Segmentation problems examine this decision-theoretic clustering task. Although finding optimal solutions to these problems is computationally hard, greedy-based approximation algorithms exist. However, in settings where the agent has a combinatorially large number of candidate responses whose utilities must be considered, these algorithms are often intractable. In this work, we show that in many cases the utility function can be factored to allow for an efficient greedy algorithm even when there are exponentially large response spaces. We evaluate our technique theoretically, proving approximation bounds, and empirically using extensive-form games by clustering opponent strategies in toy poker games. Our results demonstrate that these techniques yield dramatically improved clusterings compared to a traditional distance-based clustering approach in terms of both subjective quality and utility obtained by responding to the clusters.","PeriodicalId":106568,"journal":{"name":"AAAI Workshop: Computer Poker and Imperfect Information","volume":"50 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"AAAI Workshop: Computer Poker and Imperfect Information","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5555/2772879.2772886","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

Clustering agents by their behaviour can be crucial for building effective agent models. Traditional clustering typically aims to group entities together based on a distance metric, where a desirable clustering is one where the entities in a cluster are spatially close together. Instead, one may desire to cluster based on actionability, or the capacity for the clusters to suggest how an agent should respond to maximize their utility with respect to the entities. Segmentation problems examine this decision-theoretic clustering task. Although finding optimal solutions to these problems is computationally hard, greedy-based approximation algorithms exist. However, in settings where the agent has a combinatorially large number of candidate responses whose utilities must be considered, these algorithms are often intractable. In this work, we show that in many cases the utility function can be factored to allow for an efficient greedy algorithm even when there are exponentially large response spaces. We evaluate our technique theoretically, proving approximation bounds, and empirically using extensive-form games by clustering opponent strategies in toy poker games. Our results demonstrate that these techniques yield dramatically improved clusterings compared to a traditional distance-based clustering approach in terms of both subjective quality and utility obtained by responding to the clusters.
决策理论中的策略聚类
根据行为对代理进行聚类对于构建有效的代理模型至关重要。传统的聚类通常旨在基于距离度量将实体分组在一起,其中理想的聚类是集群中的实体在空间上靠近在一起。相反,人们可能希望基于可操作性或集群建议代理如何响应以最大化其相对于实体的效用的能力来进行集群。分割问题检验了这个决策理论聚类任务。虽然找到这些问题的最优解在计算上是困难的,但存在基于贪婪的近似算法。然而,在智能体有大量候选响应的情况下,这些算法通常是难以处理的。在这项工作中,我们表明,在许多情况下,即使存在指数级大的响应空间,效用函数也可以因式分解以允许有效的贪婪算法。我们从理论上评估我们的技术,证明近似界限,并通过在玩具扑克游戏中聚类对手策略来经验地使用广泛形式的游戏。我们的研究结果表明,与传统的基于距离的聚类方法相比,这些技术在主观质量和通过响应聚类获得的效用方面产生了显着改进的聚类。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信