A Scalable Game Theoretic Approach for Coordination of Multiple Dynamic Systems

Mostafa M. Shibl, Vijay Gupta
arXiv - EE - Systems and Control. Published 2024-09-17. DOI: arxiv-2409.11358 (https://doi.org/arxiv-2409.11358)

Abstract

Learning in games provides a powerful framework to design control policies for self-interested agents that may be coupled through their dynamics, costs, or constraints. We consider the case where the dynamics of the coupled system can be modeled as a Markov potential game. In this case, distributed learning by the agents ensures that their control policies converge to a Nash equilibrium of this game. However, typical learning algorithms such as natural policy gradient require knowledge of the entire global state and actions of all the other agents, and may not be scalable as the number of agents grows. We show that by limiting the information flow to a local neighborhood of agents in the natural policy gradient algorithm, we can converge to a neighborhood of optimal policies. If the game can be designed through decomposing a global cost function of interest to a designer into local costs for the agents such that their policies at equilibrium optimize the global cost, this approach can be of interest to team coordination problems as well. We illustrate our approach through a sensor coverage problem.
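The idea of restricting natural policy gradient to neighborhood information can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the authors' algorithm or their sensor coverage setup: the game has a single state, agents sit on a ring, each agent picks one of a few cells, and an agent's reward is minus the number of ring neighbors choosing the same cell (an exact potential game on a symmetric graph). For tabular softmax policies, a natural policy gradient step amounts to adding the scaled advantage to the logits, and here each agent computes its advantage from its neighbors' policies only. All names (`local_npg`, `ring_neighbors`, `conflicts`) and parameter values are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Row-wise softmax with max subtraction for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def ring_neighbors(n_agents, radius):
    # Assumed communication graph: each agent sees `radius` agents on each side.
    return [[(i + d) % n_agents for d in range(-radius, radius + 1) if d != 0]
            for i in range(n_agents)]

def local_npg(n_agents=6, n_cells=3, radius=1, eta=0.5, steps=500, seed=0):
    rng = np.random.default_rng(seed)
    logits = rng.normal(scale=0.1, size=(n_agents, n_cells))
    neighbors = ring_neighbors(n_agents, radius)
    for _ in range(steps):
        pi = softmax(logits)
        # Expected local reward of each cell for agent i: minus the probability
        # mass its neighbors place on that cell. Only neighbor policies are used.
        q = np.stack([-pi[nbrs].sum(axis=0) for nbrs in neighbors])
        # Advantage relative to the agent's current expected reward.
        adv = q - (pi * q).sum(axis=1, keepdims=True)
        # Softmax natural policy gradient step: shift logits by eta * advantage.
        logits = logits + eta * adv
    return softmax(logits), neighbors

def conflicts(actions, neighbors):
    # Number of neighboring pairs that chose the same cell (each pair counted once).
    total = sum(int(actions[i] == actions[j])
                for i, nbrs in enumerate(neighbors) for j in nbrs)
    return total // 2

if __name__ == "__main__":
    policies, nbrs = local_npg()
    greedy = policies.argmax(axis=1)
    print("greedy cells:", greedy, "conflicts:", conflicts(greedy, nbrs))
```

In this toy the local reward depends only on neighbors, so no information is lost by truncation. In a game where rewards also depend on distant agents, dropping that information would bias the update, which is consistent with the abstract's claim of convergence to a neighborhood of optimal policies rather than to the optimum itself.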