An introduction to learning in finite games

J. Shamma
{"title":"An introduction to learning in finite games","authors":"J. Shamma","doi":"10.23919/ACC55779.2023.10156273","DOIUrl":null,"url":null,"abstract":"In the setting of learning in games, player strategies evolve in an effort to maximize utility in response to the evolving strategies of other players. In contrast to the single agent case, learning in the presence of other learners induces a non-stationary environment from the perspective of any individual player. Depending on the specifics of the game and the learning dynamics, the evolving strategies may exhibit a variety of behaviors ranging from convergence to Nash equilibrium to oscillations to even chaos. This talk presents a basic introduction to learning in games through the presentation of selected results for finite normal form games, i.e., games with a finite number of players having a finite number of actions. The talk starts with a representative sample of learning dynamics that converge to Nash equilibrium for special classes of games. Specific learning dynamics include better reply dynamics, joint strategy fictitious play, and log-linear learning, with results for potential games and weakly acyclic games. These results apply to specifically pure Nash equilibrium. The talk also presents dynamics that address mixed/randomized strategy Nash equilibria, specifically smooth fictitious play and gradient play. 
The talk concludes with limitations in learning that stem from the notion of uncoupled dynamics, where a player’s learning dynamics cannot depend explicitly on the utility functions of other players.","PeriodicalId":397401,"journal":{"name":"2023 American Control Conference (ACC)","volume":"94 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 American Control Conference (ACC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/ACC55779.2023.10156273","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

In the setting of learning in games, player strategies evolve in an effort to maximize utility in response to the evolving strategies of other players. In contrast to the single-agent case, learning in the presence of other learners induces a non-stationary environment from the perspective of any individual player. Depending on the specifics of the game and the learning dynamics, the evolving strategies may exhibit a variety of behaviors, ranging from convergence to Nash equilibrium to oscillations and even chaos. This talk presents a basic introduction to learning in games through selected results for finite normal-form games, i.e., games with a finite number of players, each having a finite number of actions. The talk starts with a representative sample of learning dynamics that converge to Nash equilibrium for special classes of games. Specific learning dynamics include better-reply dynamics, joint strategy fictitious play, and log-linear learning, with results for potential games and weakly acyclic games. These results apply specifically to pure Nash equilibria. The talk also presents dynamics that address mixed/randomized-strategy Nash equilibria, specifically smooth fictitious play and gradient play. The talk concludes with limitations on learning that stem from the notion of uncoupled dynamics, where a player's learning dynamics cannot depend explicitly on the utility functions of other players.
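Among the dynamics the abstract names, log-linear learning is one of the simplest to demonstrate. The sketch below is an illustrative implementation (not code from the talk): it runs log-linear learning in a hypothetical two-player coordination game, a simple potential game in which the common payoff serves as the potential. At each step a uniformly random player revises its action, choosing each action with probability proportional to exp(utility/τ); the payoff values and the temperature τ here are arbitrary choices for the demonstration.

```python
import math
import random

# Illustrative 2-player coordination game (a potential game): both
# players prefer to match, and (1, 1) is the high-payoff profile.
PAYOFF = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0}

def utility(player, action, other_action):
    """Payoff to `player` for playing `action` against `other_action`."""
    profile = (action, other_action) if player == 0 else (other_action, action)
    return PAYOFF[profile]

def log_linear_step(actions, tau, rng):
    """One asynchronous revision: a uniformly random player resamples its
    action with probability proportional to exp(utility / tau)."""
    i = rng.randrange(2)
    other = actions[1 - i]
    weights = [math.exp(utility(i, a, other) / tau) for a in (0, 1)]
    pick = rng.random() * (weights[0] + weights[1])
    new_action = 0 if pick < weights[0] else 1
    updated = list(actions)
    updated[i] = new_action
    return tuple(updated)

def run(steps=5000, tau=0.5, seed=0):
    """Run log-linear learning and return visit counts per action profile."""
    rng = random.Random(seed)
    actions = (0, 0)  # start at the low-payoff pure equilibrium
    counts = {}
    for _ in range(steps):
        actions = log_linear_step(actions, tau, rng)
        counts[actions] = counts.get(actions, 0) + 1
    return counts

if __name__ == "__main__":
    counts = run()
    # At low temperature the chain concentrates on the potential
    # maximizer (1, 1), the stochastically stable equilibrium.
    print(max(counts, key=counts.get))
```

This matches the qualitative result the abstract cites for potential games: the induced Markov chain has a Gibbs stationary distribution over action profiles, so as τ shrinks the process spends nearly all of its time at the potential-maximizing pure Nash equilibrium, even when started at an inferior one.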