Faster No-Regret Learning Dynamics for Extensive-Form Correlated and Coarse Correlated Equilibria

Proceedings of the 23rd ACM Conference on Economics and Computation. Pub Date: 2022-02-11. DOI: 10.1145/3490486.3538288

Ioannis Anagnostides, Gabriele Farina, Christian Kroer, A. Celli, T. Sandholm


Citations: 6

Abstract



A recent emerging trend in the literature on learning in games has been concerned with providing faster learning dynamics for correlated and coarse correlated equilibria in normal-form games. Much less is known about the significantly more challenging setting of extensive-form games, which can capture both sequential and simultaneous moves, as well as imperfect information. In this paper we establish faster no-regret learning dynamics for extensive-form correlated equilibria (EFCE) in multiplayer general-sum imperfect-information extensive-form games. When all players follow our accelerated dynamics, the correlated distribution of play is an O(T^{-3/4})-approximate EFCE, where the O(·) notation suppresses parameters polynomial in the description of the game. This significantly improves over the best prior rate of O(T^{-1/2}). To achieve this, we develop a framework for performing accelerated Phi-regret minimization via predictions. One of our key technical contributions, which enables us to employ our generic template, is to characterize the stability of fixed points associated with trigger deviation functions through a refined perturbation analysis of a structured Markov chain. Furthermore, for the simpler solution concept of extensive-form coarse correlated equilibrium (EFCCE) we give a new succinct closed-form characterization of the associated fixed points, bypassing the expensive computation of stationary distributions required for EFCE. Our results place EFCCE closer to normal-form coarse correlated equilibria in terms of the per-iteration complexity, although the former prescribes a much more compelling notion of correlation. Finally, experiments conducted on standard benchmarks corroborate our theoretical findings.
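The abstract refers to fixed points that, for EFCE, require computing stationary distributions of a Markov chain. As a rough illustration of that kind of step only, the sketch below approximates a stationary distribution x satisfying x = xQ for a small row-stochastic matrix Q by power iteration. This is not the paper's algorithm: the function name `stationary_distribution`, the tolerance, and the 2×2 matrix are all assumptions chosen for illustration, and the paper's chain has additional structure that this toy example ignores.

```python
def stationary_distribution(Q, iters=10_000, tol=1e-12):
    """Approximate x with x = x Q for a row-stochastic matrix Q (power iteration).

    Hypothetical helper for illustration; assumes the chain has a unique
    stationary distribution that power iteration converges to.
    """
    n = len(Q)
    x = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        # One step of the chain: nxt[j] = sum_i x[i] * Q[i][j]
        nxt = [sum(x[i] * Q[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt
        x = nxt
    return x


# Toy 2-state chain (made-up numbers); each row sums to 1.
Q = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(Q)
print(pi)
```

For this toy chain the balance equation 0.1·pi[0] = 0.5·pi[1] gives the exact answer (5/6, 1/6), which the iteration should approach. A closed-form fixed point, as the abstract describes for EFCCE, would avoid this iterative computation altogether.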
