{"title":"论单调博弈中镜像博弈的变式解释","authors":"Yunian Pan, Tao Li, Quanyan Zhu","doi":"arxiv-2403.15636","DOIUrl":null,"url":null,"abstract":"Mirror play (MP) is a well-accepted primal-dual multi-agent learning\nalgorithm where all agents simultaneously implement mirror descent in a\ndistributed fashion. The advantage of MP over vanilla gradient play lies in its\nusage of mirror maps that better exploit the geometry of decision domains.\nDespite extensive literature dedicated to the asymptotic convergence of MP to\nequilibrium, the understanding of the finite-time behavior of MP before\nreaching equilibrium is still rudimentary. To facilitate the study of MP's\nnon-equilibrium performance, this work establishes an equivalence between MP's\nfinite-time primal-dual path (mirror path) in monotone games and the\nclosed-loop Nash equilibrium path of a finite-horizon differential game,\nreferred to as mirror differential game (MDG). Our construction of MDG rests on\nthe Brezis-Ekeland variational principle, and the stage cost functional for MDG\nis Fenchel coupling between MP's iterates and associated gradient updates. The\nvariational interpretation of mirror path in static games as the equilibrium\npath in MDG holds in deterministic and stochastic cases. 
Such a variational\ninterpretation translates the non-equilibrium studies of learning dynamics into\na more tractable equilibrium analysis of dynamic games, as demonstrated in a\ncase study on the Cournot game, where MP dynamics corresponds to a linear\nquadratic game.","PeriodicalId":501062,"journal":{"name":"arXiv - CS - Systems and Control","volume":"32 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"On the Variational Interpretation of Mirror Play in Monotone Games\",\"authors\":\"Yunian Pan, Tao Li, Quanyan Zhu\",\"doi\":\"arxiv-2403.15636\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Mirror play (MP) is a well-accepted primal-dual multi-agent learning\\nalgorithm where all agents simultaneously implement mirror descent in a\\ndistributed fashion. The advantage of MP over vanilla gradient play lies in its\\nusage of mirror maps that better exploit the geometry of decision domains.\\nDespite extensive literature dedicated to the asymptotic convergence of MP to\\nequilibrium, the understanding of the finite-time behavior of MP before\\nreaching equilibrium is still rudimentary. To facilitate the study of MP's\\nnon-equilibrium performance, this work establishes an equivalence between MP's\\nfinite-time primal-dual path (mirror path) in monotone games and the\\nclosed-loop Nash equilibrium path of a finite-horizon differential game,\\nreferred to as mirror differential game (MDG). Our construction of MDG rests on\\nthe Brezis-Ekeland variational principle, and the stage cost functional for MDG\\nis Fenchel coupling between MP's iterates and associated gradient updates. The\\nvariational interpretation of mirror path in static games as the equilibrium\\npath in MDG holds in deterministic and stochastic cases. 
Such a variational\\ninterpretation translates the non-equilibrium studies of learning dynamics into\\na more tractable equilibrium analysis of dynamic games, as demonstrated in a\\ncase study on the Cournot game, where MP dynamics corresponds to a linear\\nquadratic game.\",\"PeriodicalId\":501062,\"journal\":{\"name\":\"arXiv - CS - Systems and Control\",\"volume\":\"32 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-03-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Systems and Control\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2403.15636\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Systems and Control","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2403.15636","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
On the Variational Interpretation of Mirror Play in Monotone Games
Mirror play (MP) is a well-established primal-dual multi-agent learning
algorithm in which all agents simultaneously run mirror descent in a
distributed fashion. The advantage of MP over vanilla gradient play lies in its
use of mirror maps that better exploit the geometry of the decision domains.
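As a concrete illustration (not taken from the paper), the sketch below runs mirror play with the entropic mirror map on a two-player zero-sum matrix game: on the probability simplex, the entropic map turns mirror descent into multiplicative-weights updates, and the time-averaged iterates approach the equilibrium. The game matrix, step size, and iteration count are illustrative choices.

```python
import numpy as np

def mirror_play(A, x0, y0, steps=5000, eta=0.05):
    """Simultaneous mirror descent with the entropic mirror map
    (multiplicative-weights updates) on the bilinear game x^T A y."""
    x, y = x0.copy(), y0.copy()
    x_avg, y_avg = np.zeros_like(x), np.zeros_like(y)
    for _ in range(steps):
        gx = A @ y          # payoff gradient for the minimizing row player
        gy = A.T @ x        # payoff gradient for the maximizing column player
        # Entropic mirror step: multiply by the exponentiated (dual)
        # gradient update, then renormalize onto the probability simplex.
        x = x * np.exp(-eta * gx); x /= x.sum()
        y = y * np.exp(eta * gy);  y /= y.sum()
        x_avg += x; y_avg += y
    return x_avg / steps, y_avg / steps

# Matching pennies: a monotone (zero-sum) game whose unique equilibrium
# is uniform play for both players.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = mirror_play(A, np.array([0.8, 0.2]), np.array([0.3, 0.7]))
```

Note that the last iterates of simultaneous mirror descent can cycle in zero-sum games; it is the averaged strategies returned here that converge toward the equilibrium.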
Despite the extensive literature on the asymptotic convergence of MP to
equilibrium, the finite-time behavior of MP before it reaches equilibrium
remains poorly understood. To facilitate the study of MP's non-equilibrium
performance, this work establishes an equivalence between MP's finite-time
primal-dual path (the mirror path) in monotone games and the closed-loop Nash
equilibrium path of a finite-horizon differential game, referred to as the
mirror differential game (MDG). Our construction of the MDG rests on the
Brezis-Ekeland variational principle, and the stage cost functional of the MDG
is the Fenchel coupling between MP's iterates and the associated gradient
updates. This variational interpretation of the mirror path in static games as
the equilibrium path of the MDG holds in both the deterministic and stochastic
cases. It translates the non-equilibrium study of learning dynamics into a
more tractable equilibrium analysis of dynamic games, as demonstrated in a
case study on the Cournot game, where the MP dynamics correspond to a
linear-quadratic game.
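For intuition on the Cournot case study, the sketch below runs mirror play on a standard Cournot duopoly with linear inverse demand and constant marginal cost. With the Euclidean mirror map, mirror play reduces to simultaneous projected gradient play, and the quadratic payoffs are what make the associated differential game linear-quadratic. The demand and cost parameters are illustrative, not from the paper.

```python
import numpy as np

# Cournot duopoly: inverse demand P(Q) = a - b*Q, marginal cost c.
# Firm i's profit is (a - b*(q_i + q_j))*q_i - c*q_i.
a, b, c = 10.0, 1.0, 1.0
eta = 0.1

def marginal_profit(qi, qj):
    # d/dq_i [ (a - b*(q_i + q_j)) * q_i - c * q_i ]
    return a - c - 2.0 * b * qi - b * qj

q = np.array([0.5, 4.0])        # arbitrary starting quantities
for _ in range(500):
    g = np.array([marginal_profit(q[0], q[1]),
                  marginal_profit(q[1], q[0])])
    q = np.maximum(q + eta * g, 0.0)   # ascent step, projected to q >= 0

# Closed-form symmetric Cournot-Nash quantity: (a - c) / (3b) = 3.0
q_star = (a - c) / (3.0 * b)
```

Because this Cournot game is strongly monotone, the last iterates themselves converge to the Nash equilibrium, with no averaging required.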