Research on space proximity pursuit-evasion interception decision-making based on deep reinforcement learning
Authors: Cheng Huang, Quanli Zeng, Jiazhong Xu
Journal: Mechatronics, Volume 110, Article 103387 (Journal Article, published 2025-08-05)
DOI: 10.1016/j.mechatronics.2025.103387
URL: https://www.sciencedirect.com/science/article/pii/S0957415825000960
Impact factor: 3.1 (JCR Q2, Automation & Control Systems)
Citations: 0
Abstract
This paper addresses the one-to-one pursuit-evasion problem in space: intercepting a close-range evader that performs arbitrary counter-maneuvers, given the relative motion between pursuer and evader at a given close range. It proposes a decision-making method for close-range pursuit-evasion interception based on Distributed Distributional Deep Deterministic Policy Gradient (D4PG). An improved nearest-neighbor exploration mechanism that includes a random constant and a logarithmic constant is adopted, which reduces the learning burden of the algorithm and improves its convergence stability. A target network containing three value networks is constructed, and the loss function is computed from the value network whose probability distribution has the minimum variance among the three. This gives a more accurate estimate of the Q-function and improves the running speed and efficiency of the algorithm. Four typical escape scenarios with arbitrary counter-maneuvering are simulated for verification, and the results show the effectiveness and superiority of the proposed decision-making method for space proximity pursuit-evasion interception.
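The minimum-variance selection among three value networks can be illustrated with a small NumPy sketch. The abstract does not give the implementation, so everything here is an assumption made for illustration: each critic is taken to output a categorical distribution over a shared set of return atoms (a common choice in D4PG-style critics), the selection is made per sample, and the names `min_variance_target`, `probs`, and `atoms` are hypothetical.

```python
import numpy as np

def min_variance_target(probs, atoms):
    """Per sample, pick the critic whose return distribution has the smallest variance.

    probs: shape (n_critics, batch, n_atoms); each distribution sums to 1.
    atoms: shape (n_atoms,); shared support of return values (an assumption).
    Returns (chosen_probs, chosen_index) with shapes (batch, n_atoms) and (batch,).
    """
    mean = probs @ atoms                   # E[Z], shape (n_critics, batch)
    second_moment = probs @ atoms**2       # E[Z^2]
    variance = second_moment - mean**2     # Var[Z] = E[Z^2] - E[Z]^2
    idx = variance.argmin(axis=0)          # index of the min-variance critic per sample
    chosen = probs[idx, np.arange(probs.shape[1])]
    return chosen, idx

# Toy example: three critics, one sample, three atoms.
atoms = np.array([-1.0, 0.0, 1.0])
probs = np.array([
    [[0.50, 0.00, 0.50]],   # variance 1.0
    [[0.00, 1.00, 0.00]],   # variance 0.0
    [[0.25, 0.50, 0.25]],   # variance 0.5
])
chosen, idx = min_variance_target(probs, atoms)
print(idx)     # [1]
```

In a training loop, the selected distribution would stand in for the target critic's output when building the distributional loss. Whether the paper selects per sample or per batch is not stated in the abstract.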
About the journal:
Mechatronics is the synergistic combination of precision mechanical engineering, electronic control and systems thinking in the design of products and manufacturing processes. It relates to the design of systems, devices and products aimed at achieving an optimal balance between basic mechanical structure and its overall control. The purpose of this journal is to provide rapid publication of topical papers featuring practical developments in mechatronics. It will cover a wide range of application areas including consumer product design, instrumentation, manufacturing methods, computer integration, and process and device control, and will attract a readership from across the industrial and academic research spectrum. Particular importance will be attached to aspects of innovation in mechatronics design philosophy which illustrate the benefits obtainable by an a priori integration of functionality with embedded microprocessor control. A major item will be the design of machines, devices and systems possessing a degree of computer-based intelligence. The journal seeks to publish research progress in this field with an emphasis on the applied rather than the theoretical. It will also serve the dual role of bringing greater recognition to this important area of engineering.