Intercept Guidance of Maneuvering Targets with Deep Reinforcement Learning

IF 1.1 · Zone 4 (Engineering & Technology) · Q3 ENGINEERING, AEROSPACE
Zhe Hu, Liang Xiao, Jun Guan, Wenjun Yi, Hongqiao Yin
{"title":"基于深度强化学习的机动目标拦截制导","authors":"Zhe Hu, Liang Xiao, Jun Guan, Wenjun Yi, Hongqiao Yin","doi":"10.1155/2023/7924190","DOIUrl":null,"url":null,"abstract":"In this paper, a novel guidance law based on a reinforcement learning (RL) algorithm is presented to deal with the maneuvering target interception problem using a deep deterministic policy gradient descent neural network. We take the missile’s line-of-sight (LOS) rate as the observation of the RL algorithm and propose a novel reward function, which is constructed with the miss distance and LOS rate to train the neural network off-line. In the guidance process, the trained neural network has the capacity of mapping the missile’s LOS rate to the normal acceleration of the missile directly, so as to generate guidance commands in real time. Under the actor-critic (AC) framework, we adopt the twin-delayed deep deterministic policy gradient (TD3) algorithm by taking the minimum value between a pair of critics to reduce overestimation. Simulation results show that the proposed TD3-based RL guidance law outperforms the current state of the RL guidance law, has better performance to cope with continuous action and state space, and also has a faster convergence speed and higher reward. Furthermore, the proposed RL guidance law has better accuracy and robustness when intercepting a maneuvering target, and the LOS rate is converged.","PeriodicalId":13748,"journal":{"name":"International Journal of Aerospace Engineering","volume":null,"pages":null},"PeriodicalIF":1.1000,"publicationDate":"2023-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Intercept Guidance of Maneuvering Targets with Deep Reinforcement Learning\",\"authors\":\"Zhe Hu, Liang Xiao, Jun Guan, Wenjun Yi, Hongqiao Yin\",\"doi\":\"10.1155/2023/7924190\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, a novel guidance law based on a reinforcement learning (RL) algorithm is presented to deal with the maneuvering target interception problem using a deep deterministic policy gradient descent neural network. We take the missile’s line-of-sight (LOS) rate as the observation of the RL algorithm and propose a novel reward function, which is constructed with the miss distance and LOS rate to train the neural network off-line. In the guidance process, the trained neural network has the capacity of mapping the missile’s LOS rate to the normal acceleration of the missile directly, so as to generate guidance commands in real time. Under the actor-critic (AC) framework, we adopt the twin-delayed deep deterministic policy gradient (TD3) algorithm by taking the minimum value between a pair of critics to reduce overestimation. Simulation results show that the proposed TD3-based RL guidance law outperforms the current state of the RL guidance law, has better performance to cope with continuous action and state space, and also has a faster convergence speed and higher reward. 
Furthermore, the proposed RL guidance law has better accuracy and robustness when intercepting a maneuvering target, and the LOS rate is converged.\",\"PeriodicalId\":13748,\"journal\":{\"name\":\"International Journal of Aerospace Engineering\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":1.1000,\"publicationDate\":\"2023-09-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Aerospace Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1155/2023/7924190\",\"RegionNum\":4,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"ENGINEERING, AEROSPACE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Aerospace Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1155/2023/7924190","RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ENGINEERING, AEROSPACE","Score":null,"Total":0}
Citations: 0

Abstract

In this paper, a novel guidance law based on a reinforcement learning (RL) algorithm is presented to deal with the maneuvering target interception problem using a deep deterministic policy gradient neural network. We take the missile's line-of-sight (LOS) rate as the observation of the RL algorithm and propose a novel reward function, constructed from the miss distance and the LOS rate, to train the neural network offline. In the guidance process, the trained neural network maps the missile's LOS rate directly to the missile's normal acceleration, generating guidance commands in real time. Under the actor-critic (AC) framework, we adopt the twin delayed deep deterministic policy gradient (TD3) algorithm, which takes the minimum value of a pair of critics to reduce overestimation. Simulation results show that the proposed TD3-based RL guidance law outperforms existing RL guidance laws, copes better with continuous action and state spaces, and converges faster with higher reward. Furthermore, the proposed RL guidance law has better accuracy and robustness when intercepting a maneuvering target, and the LOS rate converges.
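The abstract names two concrete mechanisms: a reward shaped from the LOS rate and the miss distance, and the TD3 device of taking the minimum of a pair of critics to curb overestimation. The Python sketch below illustrates both; it is not the authors' implementation, and the weights, layer sizes, and names (shaped_reward, td3_target, Critic) are assumptions made for illustration only.

# A minimal sketch (not the authors' code) of two ideas named in the abstract:
# (1) a reward built from the LOS rate each step and the miss distance at the end,
# (2) the TD3 target that takes the minimum of a pair of critics to curb overestimation.
# All weights, layer sizes, and names here are illustrative assumptions.
import torch
import torch.nn as nn

def shaped_reward(los_rate, miss_distance=None, w_los=1.0, w_miss=0.01):
    # Penalize the LOS rate at every step; add a terminal penalty on miss distance.
    r = -w_los * abs(los_rate)
    if miss_distance is not None:  # only available at the terminal step
        r -= w_miss * miss_distance
    return r

class Critic(nn.Module):
    # Q(s, a): the state is the LOS-rate observation, the action is the
    # commanded normal acceleration of the missile.
    def __init__(self, state_dim=1, action_dim=1, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def td3_target(q1_target, q2_target, actor_target, next_state, reward, done,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0):
    # Clipped double-Q target: smooth the target action with clipped noise,
    # evaluate both target critics, and keep the smaller estimate.
    with torch.no_grad():
        next_action = actor_target(next_state)
        noise = (torch.randn_like(next_action) * noise_std).clamp(-noise_clip, noise_clip)
        next_action = (next_action + noise).clamp(-max_action, max_action)
        q_min = torch.min(q1_target(next_state, next_action),
                          q2_target(next_state, next_action))
        return reward + gamma * (1.0 - done) * q_min

Taking the minimum of two target critics, smoothing the target action with clipped noise, and delaying the actor update are the three standard TD3 ingredients; the sketch shows the first two.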

Source journal: International Journal of Aerospace Engineering
CiteScore: 2.70
Self-citation rate: 7.10%
Articles published: 195
Review time: 22 weeks
Journal description: International Journal of Aerospace Engineering aims to serve the international aerospace engineering community through dissemination of scientific knowledge on practical engineering and design methodologies pertaining to aircraft and space vehicles. Original unpublished manuscripts are solicited on all areas of aerospace engineering, including but not limited to:
- Mechanics of materials and structures
- Aerodynamics and fluid mechanics
- Dynamics and control
- Aeroacoustics
- Aeroelasticity
- Propulsion and combustion
- Avionics and systems
- Flight simulation and mechanics
- Unmanned air vehicles (UAVs)
Review articles on any of the above topics are also welcome.