On the Variational Interpretation of Mirror Play in Monotone Games

Yunian Pan, Tao Li, Quanyan Zhu
{"title":"论单调博弈中镜像博弈的变式解释","authors":"Yunian Pan, Tao Li, Quanyan Zhu","doi":"arxiv-2403.15636","DOIUrl":null,"url":null,"abstract":"Mirror play (MP) is a well-accepted primal-dual multi-agent learning\nalgorithm where all agents simultaneously implement mirror descent in a\ndistributed fashion. The advantage of MP over vanilla gradient play lies in its\nusage of mirror maps that better exploit the geometry of decision domains.\nDespite extensive literature dedicated to the asymptotic convergence of MP to\nequilibrium, the understanding of the finite-time behavior of MP before\nreaching equilibrium is still rudimentary. To facilitate the study of MP's\nnon-equilibrium performance, this work establishes an equivalence between MP's\nfinite-time primal-dual path (mirror path) in monotone games and the\nclosed-loop Nash equilibrium path of a finite-horizon differential game,\nreferred to as mirror differential game (MDG). Our construction of MDG rests on\nthe Brezis-Ekeland variational principle, and the stage cost functional for MDG\nis Fenchel coupling between MP's iterates and associated gradient updates. The\nvariational interpretation of mirror path in static games as the equilibrium\npath in MDG holds in deterministic and stochastic cases. Such a variational\ninterpretation translates the non-equilibrium studies of learning dynamics into\na more tractable equilibrium analysis of dynamic games, as demonstrated in a\ncase study on the Cournot game, where MP dynamics corresponds to a linear\nquadratic game.","PeriodicalId":501062,"journal":{"name":"arXiv - CS - Systems and Control","volume":"32 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"On the Variational Interpretation of Mirror Play in Monotone Games\",\"authors\":\"Yunian Pan, Tao Li, Quanyan Zhu\",\"doi\":\"arxiv-2403.15636\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Mirror play (MP) is a well-accepted primal-dual multi-agent learning\\nalgorithm where all agents simultaneously implement mirror descent in a\\ndistributed fashion. The advantage of MP over vanilla gradient play lies in its\\nusage of mirror maps that better exploit the geometry of decision domains.\\nDespite extensive literature dedicated to the asymptotic convergence of MP to\\nequilibrium, the understanding of the finite-time behavior of MP before\\nreaching equilibrium is still rudimentary. To facilitate the study of MP's\\nnon-equilibrium performance, this work establishes an equivalence between MP's\\nfinite-time primal-dual path (mirror path) in monotone games and the\\nclosed-loop Nash equilibrium path of a finite-horizon differential game,\\nreferred to as mirror differential game (MDG). Our construction of MDG rests on\\nthe Brezis-Ekeland variational principle, and the stage cost functional for MDG\\nis Fenchel coupling between MP's iterates and associated gradient updates. The\\nvariational interpretation of mirror path in static games as the equilibrium\\npath in MDG holds in deterministic and stochastic cases. 
Such a variational\\ninterpretation translates the non-equilibrium studies of learning dynamics into\\na more tractable equilibrium analysis of dynamic games, as demonstrated in a\\ncase study on the Cournot game, where MP dynamics corresponds to a linear\\nquadratic game.\",\"PeriodicalId\":501062,\"journal\":{\"name\":\"arXiv - CS - Systems and Control\",\"volume\":\"32 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-03-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Systems and Control\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2403.15636\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Systems and Control","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2403.15636","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Mirror play (MP) is a well-accepted primal-dual multi-agent learning algorithm where all agents simultaneously implement mirror descent in a distributed fashion. The advantage of MP over vanilla gradient play lies in its usage of mirror maps that better exploit the geometry of decision domains. Despite extensive literature dedicated to the asymptotic convergence of MP to equilibrium, the understanding of the finite-time behavior of MP before reaching equilibrium is still rudimentary. To facilitate the study of MP's non-equilibrium performance, this work establishes an equivalence between MP's finite-time primal-dual path (mirror path) in monotone games and the closed-loop Nash equilibrium path of a finite-horizon differential game, referred to as mirror differential game (MDG). Our construction of MDG rests on the Brezis-Ekeland variational principle, and the stage cost functional for MDG is Fenchel coupling between MP's iterates and associated gradient updates. The variational interpretation of mirror path in static games as the equilibrium path in MDG holds in deterministic and stochastic cases. Such a variational interpretation translates the non-equilibrium studies of learning dynamics into a more tractable equilibrium analysis of dynamic games, as demonstrated in a case study on the Cournot game, where MP dynamics corresponds to a linear quadratic game.
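For reference, the mirror-descent objects named in the abstract can be written in their standard form; the notation below is ours and its exact normalization should be checked against the paper. The mirror map is generated by a strongly convex potential h, its convex conjugate h* maps dual (gradient) variables back to primal decisions, and the Fenchel coupling F measures the gap between a primal iterate and a dual iterate:

```latex
% Standard mirror-descent objects (notation is illustrative, not necessarily the paper's):
% h   : strongly convex mirror-map potential on the decision set
% h^* : its convex conjugate; \nabla h^* maps dual variables back to primal points
% F   : the Fenchel coupling between a primal point p and a dual point y
\[
  h^{*}(y) = \sup_{x}\,\bigl\{\langle y, x\rangle - h(x)\bigr\},
  \qquad
  F(p, y) = h(p) + h^{*}(y) - \langle y, p\rangle \;\ge\; 0 .
\]
% Mirror play: each agent i accumulates its own payoff gradient in a dual variable y_i
% and plays the primal point x_i = \nabla h^*(y_i) at every stage.
\[
  y_i^{t+1} = y_i^{t} + \eta\, \nabla_{x_i} u_i\bigl(x^{t}\bigr),
  \qquad
  x_i^{t+1} = \nabla h^{*}\!\bigl(y_i^{t+1}\bigr).
\]
```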
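To make the algorithm concrete, the following is a minimal sketch, not taken from the paper, of mirror play on an N-firm Cournot game: every firm simultaneously updates a dual (gradient) variable with its own payoff gradient and maps it back to a primal quantity through the mirror map. The demand parameters, the entropic mirror map, and the names `cournot_payoff_grad` and `mirror_play` are illustrative assumptions; with a Euclidean mirror map the same loop reduces to vanilla gradient play, which is consistent with the linear-quadratic structure the abstract mentions for the Cournot case study.

```python
import numpy as np

# Minimal sketch of mirror play (simultaneous mirror descent/ascent) in an
# N-firm Cournot game with inverse demand p(X) = a - b * sum(x).
# Parameters and the entropic mirror map are illustrative, not from the paper.

def cournot_payoff_grad(x, a=10.0, b=1.0, c=None):
    """Gradient of firm i's profit u_i = x_i*(a - b*sum(x)) - c_i*x_i
    with respect to its own quantity x_i."""
    c = np.zeros_like(x) if c is None else c
    return a - b * x.sum() - b * x - c

def mirror_play(x0, steps=200, eta=0.05):
    """Run simultaneous mirror play with the entropic mirror map
    h(x) = x*log(x) - x, whose dual map is x = exp(y); this keeps
    all quantities positive.  Returns the primal (mirror) path."""
    y = np.log(np.asarray(x0, dtype=float))    # dual (gradient) variables
    path = []
    for _ in range(steps):
        x = np.exp(y)                          # primal iterate x = grad h*(y)
        path.append(x.copy())
        y = y + eta * cournot_payoff_grad(x)   # all agents update simultaneously
    return np.array(path)

if __name__ == "__main__":
    path = mirror_play(x0=[1.0, 2.0, 0.5])
    print("final quantities:", path[-1])       # should approach the Cournot equilibrium
```

With the default parameters (three symmetric firms, zero cost) the Cournot equilibrium is 2.5 units per firm, so the printed quantities should be close to that value; the recorded `path` is the finite-time mirror path whose variational interpretation the paper studies.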