Bosen Lian; Wenqian Xue; Frank L. Lewis; Ali Davoudi
IEEE Transactions on Control of Network Systems, vol. 12, no. 1, pp. 763-775
Published: 2024-06-27 | DOI: 10.1109/TCNS.2024.3419823
JCR: Q2 (Automation & Control Systems) | Impact Factor: 4.0
https://ieeexplore.ieee.org/document/10574349/
Nash-Minmax Strategy for Multiplayer Multiagent Graphical Games With Reinforcement Learning
In this article, we address the synchronization problem in multiplayer multiagent graphical games, where each agent has multiple control-input players. Herein, an agent represents a system, and each of the agent's control inputs corresponds to a player. We formulate a Nash-minmax strategy in which interactions among players within the same agent are nonzero-sum, while interactions between players of different agents are antagonistic (i.e., zero-sum). That is, the players in each agent minimize their costs, while the players of neighboring agents act adversarially to maximize them. This approach finds the Nash control solutions for players within each agent and the worst-case control solutions for players in neighboring agents. Asymptotic stability under mild conditions and the existence of Nash-minmax solutions are guaranteed. An offline policy-iteration algorithm and an online data-driven off-policy reinforcement learning algorithm are proposed, with proven convergence, to compute the Nash-minmax solutions. A simulation example validates the proposed strategy and algorithms.
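The abstract mentions an offline policy-iteration algorithm that alternates policy evaluation and policy improvement until the game solutions converge. As a rough illustration of that backbone only, here is a minimal single-agent sketch (Hewer-style policy iteration for a discrete-time LQR), not the paper's multiplayer Nash-minmax recursion; the system matrices below are illustrative, not from the paper.

```python
import numpy as np

def dlyap(Ac, Cq):
    """Solve the discrete Lyapunov equation P = Cq + Ac^T P Ac via vectorization."""
    n = Ac.shape[0]
    lhs = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    return np.linalg.solve(lhs, Cq.flatten(order="F")).reshape((n, n), order="F")

def policy_iteration(A, B, Q, R, K0, iters=30):
    """Alternate policy evaluation (Lyapunov solve) and policy improvement."""
    K = K0
    for _ in range(iters):
        Ac = A - B @ K                       # closed loop under the current gain
        P = dlyap(Ac, Q + K.T @ R @ K)       # policy evaluation
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # policy improvement
    return P, K

# Illustrative discrete-time double integrator; K0 must be stabilizing.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.array([[1.0]])
K0 = np.array([[1.0, 2.0]])

P, K = policy_iteration(A, B, Q, R, K0)
```

In the paper's setting, the evaluation and improvement steps are carried out per player, with neighboring agents' players treated as maximizers; the off-policy variant learns the same quantities from measured data without a model.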
Journal Introduction:
The IEEE Transactions on Control of Network Systems is committed to the timely publication of high-impact papers at the intersection of control systems and network science. In particular, the journal addresses research on the analysis, design, and implementation of networked control systems, as well as control over networks. Relevant work spans the full spectrum from basic research on control systems to the design of engineering solutions for automatic control of, and over, networks. The topics covered by this journal include:

- Coordinated control and estimation over networks
- Control and computation over sensor networks
- Control under communication constraints
- Control and performance analysis issues arising in the dynamics of networks used in application areas such as communications, computing, transportation, manufacturing, Web ranking and aggregation, social networks, biology, power systems, and economics
- Synchronization of activities across a controlled network
- Stability analysis of controlled networks
- Analysis of networks as hybrid dynamical systems