Bo Lu, Le Ru, Maolong Lv, Shiguang Hu, Hongguo Zhang, Zilong Zhao
{"title":"Enhanced LSTM-DQN algorithm for a two-player zero-sum game in three-dimensional space","authors":"Bo Lu, Le Ru, Maolong Lv, Shiguang Hu, Hongguo Zhang, Zilong Zhao","doi":"10.1049/cth2.12677","DOIUrl":null,"url":null,"abstract":"<p>To tackle the challenges presented by the two-player zero sum game (TZSG) in three-dimensional space, this study introduces an enhanced deep Q-learning (DQN) algorithm that utilizes long short term memory (LSTM) network. The primary objective of this algorithm is to enhance the temporal correlation of the TZSG in three-dimensional space. Additionally, it incorporates the hindsight experience replay (HER) mechanism to improve the learning efficiency of the network and mitigate the issue of the “sparse reward” that arises from prolonged training of intelligence in solving the TZSG in the three-dimensional. Furthermore, this method enhances the convergence and stability of the overall solution.An intelligent training environment centred around an airborne agent and its mutual pursuit interaction scenario was designed to proposed approach's effectiveness. The algorithm training and comparison results show that the LSTM-DQN-HER algorithm outperforms similar algorithm in solving the TZSG in three-dimensional space. In conclusion, this paper presents an improved DQN algorithm based on LSTM and incorporates the HER mechanism to address the challenges posed by the TZSG in three-dimensional space. The proposed algorithm enhances the solution's temporal correlation, learning efficiency, convergence, and stability. The simulation results confirm its superior performance in solving the TZSG in three-dimensional space.</p>","PeriodicalId":50382,"journal":{"name":"IET Control Theory and Applications","volume":"18 18","pages":"2798-2812"},"PeriodicalIF":2.2000,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cth2.12677","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IET Control Theory and Applications","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1049/cth2.12677","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
To tackle the challenges presented by the two-player zero sum game (TZSG) in three-dimensional space, this study introduces an enhanced deep Q-learning (DQN) algorithm that utilizes long short term memory (LSTM) network. The primary objective of this algorithm is to enhance the temporal correlation of the TZSG in three-dimensional space. Additionally, it incorporates the hindsight experience replay (HER) mechanism to improve the learning efficiency of the network and mitigate the issue of the “sparse reward” that arises from prolonged training of intelligence in solving the TZSG in the three-dimensional. Furthermore, this method enhances the convergence and stability of the overall solution.An intelligent training environment centred around an airborne agent and its mutual pursuit interaction scenario was designed to proposed approach's effectiveness. The algorithm training and comparison results show that the LSTM-DQN-HER algorithm outperforms similar algorithm in solving the TZSG in three-dimensional space. In conclusion, this paper presents an improved DQN algorithm based on LSTM and incorporates the HER mechanism to address the challenges posed by the TZSG in three-dimensional space. The proposed algorithm enhances the solution's temporal correlation, learning efficiency, convergence, and stability. The simulation results confirm its superior performance in solving the TZSG in three-dimensional space.
期刊介绍:
IET Control Theory & Applications is devoted to control systems in the broadest sense, covering new theoretical results and the applications of new and established control methods. Among the topics of interest are system modelling, identification and simulation, the analysis and design of control systems (including computer-aided design), and practical implementation. The scope encompasses technological, economic, physiological (biomedical) and other systems, including man-machine interfaces.
Most of the papers published deal with original work from industrial and government laboratories and universities, but subject reviews and tutorial expositions of current methods are welcomed. Correspondence discussing published papers is also welcomed.
Applications papers need not necessarily involve new theory. Papers which describe new realisations of established methods, or control techniques applied in a novel situation, or practical studies which compare various designs, would be of interest. Of particular value are theoretical papers which discuss the applicability of new work or applications which engender new theoretical applications.