Mean deep deterministic policy gradient algorithm for pursuit strategies in three-body confrontation
Ziheng Wang, Xiandong Pu, Yulin Li, Jianlei Zhang, Chunyan Zhang
Expert Systems with Applications, volume 287, Article 128139. Published 2025-05-14.
DOI: 10.1016/j.eswa.2025.128139
https://www.sciencedirect.com/science/article/pii/S0957417425017592
Journal impact factor: 7.5 (JCR Q1, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Three-body confrontation is a challenging pursuit-evasion game with significant applications across various fields. Traditional methods based on differential game theory struggle to manage environmental complexity, imperfect information, and long-term decision-making. Leveraging the model-free approach and robust training capabilities of deep reinforcement learning, we propose an ensemble-based actor-critic algorithm named Augmented Mean Deep Deterministic Policy Gradient (AMDPG) to learn pursuit strategies in three-body confrontation. This method includes an ensemble reinforcement learning architecture and incorporates multiple learning techniques to enhance its performance. Furthermore, we introduce an action-transform method that provides two prior strategies as heuristic guidance to accelerate action space exploration during learning. The proposed algorithm is evaluated in various scenarios, demonstrating superior policy performance and convergence compared to certain state-of-the-art algorithms. The learned strategies succeed in most testing scenarios, achieving higher penetration rates than those of competing methods.
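The abstract does not give AMDPG's update rule, so the following is only a generic sketch of one common way an ensemble-based actor-critic method can use a "mean" over critics: the one-step temporal-difference target is bootstrapped from the average of several target-critic estimates rather than from a single critic. The function name `mean_ensemble_td_target`, the `Critic` callable interface, and the default discount factor are all assumptions made for illustration; none of them comes from the paper.

```python
from statistics import fmean
from typing import Callable, Sequence

State = Sequence[float]
Action = Sequence[float]
# Hypothetical critic interface: maps (state, action) to a scalar Q-estimate.
Critic = Callable[[State, Action], float]


def mean_ensemble_td_target(
    reward: float,
    next_state: State,
    next_action: Action,
    target_critics: Sequence[Critic],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """One-step TD target bootstrapped from the mean of an ensemble of critics.

    Illustrative assumption only; this is not taken from the AMDPG paper.
    """
    if not target_critics:
        raise ValueError("need at least one target critic")
    if done:
        # Terminal transition: nothing to bootstrap from.
        return reward
    # Average the ensemble's Q-estimates for the next state-action pair.
    mean_q = fmean(q(next_state, next_action) for q in target_critics)
    return reward + gamma * mean_q


# Toy usage: three constant critics stand in for trained networks.
toy_critics = [lambda s, a, v=v: v for v in (1.0, 2.0, 3.0)]
toy_target = mean_ensemble_td_target(
    reward=0.5,
    next_state=[0.0, 0.0],
    next_action=[0.1],
    target_critics=toy_critics,
    gamma=0.9,
)
```

Averaging over an ensemble is generally used to reduce the variance of the bootstrapped target; whether AMDPG does this in exactly this form would need to be checked against the full paper.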
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.