单控制器马尔可夫游戏中的虚拟游戏

Proceedings of the 23rd ACM Conference on Economics and Computation Pub Date : 2022-05-23 DOI:10.1145/3490486.3538289

M. O. Sayin, K. Zhang, A. Ozdaglar

{"title":"单控制器马尔可夫游戏中的虚拟游戏","authors":"M. O. Sayin, K. Zhang, A. Ozdaglar","doi":"10.1145/3490486.3538289","DOIUrl":null,"url":null,"abstract":"Certain but important classes of strategic-form games, including zero-sum and identical-interest games, have thefictitious-play-property (FPP), i.e., beliefs formed in fictitious play dynamics always converge to a Nash equilibrium (NE) in the repeated play of these games. Such convergence results are seen as a (behavioral) justification for the game-theoretical equilibrium analysis. Markov games (MGs), also known as stochastic games, generalize the repeated play of strategic-form games to dynamic multi-state settings with Markovian state transitions. In particular, MGs are standard models for multi-agent reinforcement learning -- a reviving research area in learning and games, and their game-theoretical equilibrium analyses have also been conducted extensively. However, whether certain classes of MGs have the FPP or not (i.e., whether there is a behavioral justification for equilibrium analysis or not) remains largely elusive. In this paper, we study a new variant of fictitious play dynamics for MGs and show its convergence to an NE in n-player identical-interest MGs in which a single player controls the state transitions. Such games are of interest in communications, control, and economics applications. Our result together with the recent results in [42] establishes the FPP of two-player zero-sum MGs and n-player identical-interest MGs with a single controller (standing at two different ends of the MG spectrum from fully competitive to fully cooperative).","PeriodicalId":209859,"journal":{"name":"Proceedings of the 23rd ACM Conference on Economics and Computation","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Fictitious Play in Markov Games with Single Controller\",\"authors\":\"M. O. Sayin, K. Zhang, A. Ozdaglar\",\"doi\":\"10.1145/3490486.3538289\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Certain but important classes of strategic-form games, including zero-sum and identical-interest games, have thefictitious-play-property (FPP), i.e., beliefs formed in fictitious play dynamics always converge to a Nash equilibrium (NE) in the repeated play of these games. Such convergence results are seen as a (behavioral) justification for the game-theoretical equilibrium analysis. Markov games (MGs), also known as stochastic games, generalize the repeated play of strategic-form games to dynamic multi-state settings with Markovian state transitions. In particular, MGs are standard models for multi-agent reinforcement learning -- a reviving research area in learning and games, and their game-theoretical equilibrium analyses have also been conducted extensively. However, whether certain classes of MGs have the FPP or not (i.e., whether there is a behavioral justification for equilibrium analysis or not) remains largely elusive. In this paper, we study a new variant of fictitious play dynamics for MGs and show its convergence to an NE in n-player identical-interest MGs in which a single player controls the state transitions. Such games are of interest in communications, control, and economics applications. Our result together with the recent results in [42] establishes the FPP of two-player zero-sum MGs and n-player identical-interest MGs with a single controller (standing at two different ends of the MG spectrum from fully competitive to fully cooperative).\",\"PeriodicalId\":209859,\"journal\":{\"name\":\"Proceedings of the 23rd ACM Conference on Economics and Computation\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 23rd ACM Conference on Economics and Computation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3490486.3538289\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 23rd ACM Conference on Economics and Computation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3490486.3538289","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 10

摘要

某些重要的战略形式游戏，包括零和和相同利益游戏，都具有虚拟游戏属性(FPP)，即在虚拟游戏动态中形成的信念总是在这些游戏的重复游戏中收敛到纳什均衡(NE)。这种收敛结果被视为博弈论均衡分析的(行为)理由。马尔可夫博弈(MGs)，也被称为随机博弈，将策略形式博弈的重复玩法推广到具有马尔可夫状态转换的动态多状态设置。特别是，mg是多智能体强化学习的标准模型，这是学习和游戏中一个复兴的研究领域，它们的博弈论均衡分析也得到了广泛的应用。然而，某些类别的mg是否具有FPP(即，是否存在均衡分析的行为理由)在很大程度上仍然难以捉摸。在本文中，我们研究了虚拟游戏动力学的一种新变体，并证明了它在n-玩家相同兴趣的游戏中收敛到NE，其中单个玩家控制状态转换。这类游戏对通信、控制和经济应用很有兴趣。我们的研究结果与文献[42]中最近的研究结果一起建立了双玩家零和博弈和具有单一控制器的n玩家同利益博弈的FPP(从完全竞争到完全合作，处于博弈谱的两个不同端点)。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Fictitious Play in Markov Games with Single Controller

Certain but important classes of strategic-form games, including zero-sum and identical-interest games, have thefictitious-play-property (FPP), i.e., beliefs formed in fictitious play dynamics always converge to a Nash equilibrium (NE) in the repeated play of these games. Such convergence results are seen as a (behavioral) justification for the game-theoretical equilibrium analysis. Markov games (MGs), also known as stochastic games, generalize the repeated play of strategic-form games to dynamic multi-state settings with Markovian state transitions. In particular, MGs are standard models for multi-agent reinforcement learning -- a reviving research area in learning and games, and their game-theoretical equilibrium analyses have also been conducted extensively. However, whether certain classes of MGs have the FPP or not (i.e., whether there is a behavioral justification for equilibrium analysis or not) remains largely elusive. In this paper, we study a new variant of fictitious play dynamics for MGs and show its convergence to an NE in n-player identical-interest MGs in which a single player controls the state transitions. Such games are of interest in communications, control, and economics applications. Our result together with the recent results in [42] establishes the FPP of two-player zero-sum MGs and n-player identical-interest MGs with a single controller (standing at two different ends of the MG spectrum from fully competitive to fully cooperative).

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 23rd ACM Conference on Economics and Computation

自引率

0.00%

发文量