{"title":"Cooperative Advantage Actor–Critic Reinforcement Learning for Multiagent Pursuit-Evasion Games on Communication Graphs","authors":"Yizhen Meng;Chun Liu;Qiang Wang;Longyu Tan","doi":"10.1109/TAI.2024.3432511","DOIUrl":null,"url":null,"abstract":"This article investigates the distributed optimal strategy problem in multiagent pursuit-evasion (MPE) games, striving for Nash equilibrium through the optimization of individual benefit matrices based on observations. To this end, a novel collaborative control scheme for MPE games using communication graphs is proposed. This scheme employs cooperative advantage actor–critic (A2C) reinforcement learning to facilitate collaborative capture by pursuers in a distributed manner while maintaining bounded system signals. The strategy orchestrates the actions of pursuers through adaptive neural network learning, ensuring proximity-based collaboration for effective captures. Meanwhile, evaders aim to evade collectively by converging toward each other. Through extensive simulations involving five pursuers and two evaders, the efficacy of the proposed approach is demonstrated, and pursuers seamlessly organize into pursuit units and capture evaders, validating the collaborative capture objective. 
This article represents a promising step toward effective and cooperative control strategies in MPE game scenarios.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 12","pages":"6509-6523"},"PeriodicalIF":0.0000,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10606954/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
This article investigates the distributed optimal strategy problem in multiagent pursuit-evasion (MPE) games, striving for Nash equilibrium through the optimization of individual benefit matrices based on observations. To this end, a novel collaborative control scheme for MPE games using communication graphs is proposed. This scheme employs cooperative advantage actor–critic (A2C) reinforcement learning to enable collaborative capture by pursuers in a distributed manner while keeping all system signals bounded. The strategy coordinates the pursuers' actions through adaptive neural network learning, ensuring proximity-based collaboration for effective captures; meanwhile, evaders attempt to escape collectively by converging toward each other. Extensive simulations with five pursuers and two evaders demonstrate the efficacy of the proposed approach: the pursuers organize themselves into pursuit units and capture the evaders, validating the collaborative capture objective. This article represents a promising step toward effective and cooperative control strategies in MPE game scenarios.
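As a rough illustration of two ingredients the abstract names, the sketch below shows (1) observation sharing over a communication graph, where each pursuer aggregates its neighbors' observations, and (2) the one-step advantage estimate at the core of an advantage actor–critic (A2C) update. The graph structure, agent names, and update form here are generic assumptions for illustration, not the paper's exact cooperative scheme or benefit-matrix formulation.

```python
# Minimal sketch, assuming a fixed undirected communication graph given as
# an adjacency list, and a generic one-step A2C advantage. Names like
# `neighbor_average` are hypothetical, not from the paper.

def neighbor_average(observations, adjacency, agent):
    """Average an agent's own observation vector with those of its
    graph neighbors -- a simple distributed information-sharing rule."""
    group = [agent] + adjacency[agent]
    dim = len(observations[agent])
    return [sum(observations[a][i] for a in group) / len(group)
            for i in range(dim)]

def one_step_advantage(reward, value, next_value, gamma=0.99):
    """A = r + gamma * V(s') - V(s): the critic signal that weights the
    actor's policy-gradient term, -log pi(a|s) * A, in A2C."""
    return reward + gamma * next_value - value
```

In a cooperative setting, each pursuer would feed its neighbor-aggregated observation into its own actor and critic networks, so that the advantage, and hence the policy update, reflects locally shared information rather than a global state.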