Research on space proximity pursuit-evasion interception decision-making based on deep reinforcement learning

IF 3.1 · Zone 3 (Computer Science) · JCR Q2, AUTOMATION & CONTROL SYSTEMS

Mechatronics Pub Date : 2025-08-05 DOI:10.1016/j.mechatronics.2025.103387

Cheng Huang, Quanli Zeng, Jiazhong Xu

Mechatronics, Volume 110, Article 103387. Publication type: Journal Article. Open access: no. https://www.sciencedirect.com/science/article/pii/S0957415825000960

Citations: 0

Abstract

This paper addresses the one-to-one pursuit-evasion problem in space: intercepting a close-range evader that performs arbitrary counter-maneuvers, given the relative motion between pursuer and evader at a given close range. It proposes a decision-making method for close-range pursuit-evasion interception based on Distributed Distributional Deep Deterministic Policy Gradient (D4PG). An improved nearest-neighbor exploration mechanism that includes a random constant and a logarithmic constant is adopted, which reduces the learning burden of the algorithm and improves its convergence stability. A target network containing three value networks is constructed, and the loss function is computed from the value network whose probability distribution has the minimum variance among the three. This yields a more accurate estimate of the Q-function and improves the running speed and efficiency of the algorithm. Four typical escape scenarios with arbitrary counter-maneuvers are simulated for verification, and the results show the effectiveness and superiority of the proposed method for space proximity pursuit-evasion interception.
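
The abstract does not give implementation details for the three-network target, so the following is only a minimal sketch of the selection step as described: among three value networks, pick the one whose return distribution has the smallest variance. It assumes a categorical (fixed-atom) return distribution, as is common in distributional critics; the function names, the atom support, and the example probabilities are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def distribution_variance(atoms: Sequence[float], probs: Sequence[float]) -> float:
    """Variance of a categorical return distribution over fixed support atoms."""
    mean = sum(z * p for z, p in zip(atoms, probs))
    return sum(p * (z - mean) ** 2 for z, p in zip(atoms, probs))


def select_min_variance_critic(
    atoms: Sequence[float], critic_probs: List[Sequence[float]]
) -> Tuple[int, Sequence[float]]:
    """Return the index and distribution of the critic with the smallest variance."""
    variances = [distribution_variance(atoms, p) for p in critic_probs]
    best = min(range(len(critic_probs)), key=variances.__getitem__)
    return best, critic_probs[best]


# Hypothetical example: three target critics evaluated on the same
# state-action pair, each producing probabilities over a 5-atom support.
atoms = [-2.0, -1.0, 0.0, 1.0, 2.0]
critic_probs = [
    [0.2, 0.2, 0.2, 0.2, 0.2],  # widely spread distribution
    [0.0, 0.1, 0.8, 0.1, 0.0],  # concentrated distribution
    [0.1, 0.2, 0.4, 0.2, 0.1],  # intermediate spread
]
index, chosen = select_min_variance_critic(atoms, critic_probs)
```

In a training loop, the selected distribution would presumably serve as the target when computing the critic loss; how the paper projects or combines it is not stated in the abstract, so that step is left out here.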

Source journal

Mechatronics (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore

5.90

Self-citation rate

9.10%

Articles published

Review time

109 days

About the journal: Mechatronics is the synergistic combination of precision mechanical engineering, electronic control and systems thinking in the design of products and manufacturing processes. It relates to the design of systems, devices and products aimed at achieving an optimal balance between basic mechanical structure and its overall control. The purpose of this journal is to provide rapid publication of topical papers featuring practical developments in mechatronics. It will cover a wide range of application areas including consumer product design, instrumentation, manufacturing methods, computer integration and process and device control, and will attract a readership from across the industrial and academic research spectrum. Particular importance will be attached to aspects of innovation in mechatronics design philosophy which illustrate the benefits obtainable by an a priori integration of functionality with embedded microprocessor control. A major item will be the design of machines, devices and systems possessing a degree of computer-based intelligence. The journal seeks to publish research progress in this field with an emphasis on the applied rather than the theoretical. It will also serve the dual role of bringing greater recognition to this important area of engineering.