Regularization of the policy updates for stabilizing Mean Field Games

Talal Algumaei, Rubén Solozabal, Réda Alami, Hakim Hacid, M. Debbah, Martin Takác
{"title":"Regularization of the policy updates for stabilizing Mean Field Games","authors":"Talal Algumaei, Rubén Solozabal, Réda Alami, Hakim Hacid, M. Debbah, Martin Takác","doi":"10.48550/arXiv.2304.01547","DOIUrl":null,"url":null,"abstract":"This work studies non-cooperative Multi-Agent Reinforcement Learning (MARL) where multiple agents interact in the same environment and whose goal is to maximize the individual returns. Challenges arise when scaling up the number of agents due to the resultant non-stationarity that the many agents introduce. In order to address this issue, Mean Field Games (MFG) rely on the symmetry and homogeneity assumptions to approximate games with very large populations. Recently, deep Reinforcement Learning has been used to scale MFG to games with larger number of states. Current methods rely on smoothing techniques such as averaging the q-values or the updates on the mean-field distribution. This work presents a different approach to stabilize the learning based on proximal updates on the mean-field policy. We name our algorithm Mean Field Proximal Policy Optimization (MF-PPO), and we empirically show the effectiveness of our method in the OpenSpiel framework.","PeriodicalId":91995,"journal":{"name":"Advances in Knowledge Discovery and Data Mining : 21st Pacific-Asia Conference, PAKDD 2017, Jeju, South Korea, May 23-26, 2017, Proceedings. Part I. Pacific-Asia Conference on Knowledge Discovery and Data Mining (21st : 2017 : Cheju Isl...","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advances in Knowledge Discovery and Data Mining : 21st Pacific-Asia Conference, PAKDD 2017, Jeju, South Korea, May 23-26, 2017, Proceedings. Part I. Pacific-Asia Conference on Knowledge Discovery and Data Mining (21st : 2017 : Cheju Isl...","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2304.01547","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

This work studies non-cooperative Multi-Agent Reinforcement Learning (MARL), where multiple agents interact in the same environment and each aims to maximize its individual return. Challenges arise when scaling up the number of agents due to the non-stationarity that the many agents introduce. To address this issue, Mean Field Games (MFG) rely on symmetry and homogeneity assumptions to approximate games with very large populations. Recently, deep Reinforcement Learning has been used to scale MFG to games with a larger number of states. Current methods rely on smoothing techniques, such as averaging the Q-values or the updates on the mean-field distribution. This work presents a different approach that stabilizes learning through proximal updates on the mean-field policy. We name our algorithm Mean Field Proximal Policy Optimization (MF-PPO), and we empirically show the effectiveness of our method in the OpenSpiel framework.
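The page does not reproduce the paper's equations, but since MF-PPO builds on PPO-style proximal (clipped) policy updates, a minimal sketch of the standard PPO clipped surrogate objective applied to a shared mean-field policy may help. This is an illustrative assumption, not the authors' implementation: the function name, the epsilon value, and the PyTorch framing are all hypothetical.

    # Minimal sketch (assumed, not from the paper): PPO's clipped surrogate
    # objective, which MF-PPO-style proximal updates are built around.
    import torch

    def clipped_surrogate_loss(log_probs_new, log_probs_old, advantages, eps=0.2):
        """Return the negated clipped surrogate (a loss to minimize by SGD).

        log_probs_new: log pi_theta(a|s) under the current mean-field policy.
        log_probs_old: log pi_theta_old(a|s) under the data-collecting policy.
        advantages:    advantage estimates computed against the population
                       (mean-field) distribution; how they are estimated is
                       left abstract here.
        """
        ratio = torch.exp(log_probs_new - log_probs_old)  # importance ratio
        unclipped = ratio * advantages
        clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
        # Taking the elementwise minimum keeps the update proximal: the policy
        # gains nothing from moving the ratio outside [1 - eps, 1 + eps].
        return -torch.min(unclipped, clipped).mean()

    # Toy usage with random tensors standing in for a batch of transitions.
    lp_new = torch.randn(64, requires_grad=True)
    lp_old = lp_new.detach() + 0.1 * torch.randn(64)
    adv = torch.randn(64)
    loss = clipped_surrogate_loss(lp_new, lp_old, adv)
    loss.backward()

The clipping is what makes the update "proximal": large policy changes between iterations are discouraged, which is the stabilizing effect the abstract contrasts with smoothing the Q-values or the mean-field distribution itself.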