Intelligent maneuver decision-making for UAVs using the TD3-LSTM reinforcement learning algorithm under uncertain information.

Impact factor: 3.0 · JCR: Q2 (Robotics)

Frontiers in Robotics and AI. Pub Date: 2025-08-01. eCollection Date: 2025-01-01. DOI: 10.3389/frobt.2025.1645927

Tongle Zhou, Ziyi Liu, Wenxiao Jin, Zengliang Han

Frontiers in Robotics and AI, volume 12, article 1645927. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12355034/pdf/


Abstract

This paper develops an intelligent maneuver decision-making method for unmanned aerial vehicle (UAV) aerial confrontation, a setting marked by complexity and uncertainty. The method combines twin delayed deep deterministic policy gradient (TD3) reinforcement learning with a long short-term memory (LSTM) network. A victory/defeat adjudication model is established from an aerial confrontation scenario and a 3-degree-of-freedom (3-DOF) UAV model, taking the operational capability of the UAVs into account. To support maneuver decisions in a continuous action space, a model-driven state transition update mechanism is designed. Uncertainty is represented using the Wasserstein distance and memory nominal distribution methods, which are used to estimate the detection noise of the target. Building on TD3, an LSTM network extracts features from high-dimensional aerial confrontation situations with uncertainty. Four different aerial confrontation simulation experiments verify the effectiveness of the proposed method.
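To make the Wasserstein distance mentioned in the abstract concrete, the following is a minimal sketch of how the distance between two one-dimensional sets of noise samples could be computed. It is not the paper's formulation: the function name `wasserstein_1d`, the Gaussian "nominal" noise, and the 0.5 offset are all assumptions chosen for illustration, and the sketch only covers the equal-sample-size, one-dimensional case.

```python
import random


def wasserstein_1d(samples_a, samples_b):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples.

    For equal-size samples with uniform weights, the optimal transport plan
    pairs the i-th smallest value of one sample with the i-th smallest value
    of the other, so the distance is the mean absolute difference of the
    sorted samples.
    """
    if len(samples_a) != len(samples_b) or not samples_a:
        raise ValueError("samples must be non-empty and of equal size")
    a = sorted(samples_a)
    b = sorted(samples_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical data: a "nominal" detection-noise sample and a biased copy.
rng = random.Random(0)
nominal = [rng.gauss(0.0, 1.0) for _ in range(1000)]
shifted = [x + 0.5 for x in nominal]  # assumed constant sensor bias of 0.5

# Identical samples are at distance zero; a pure shift of 0.5 gives about 0.5.
print(wasserstein_1d(nominal, nominal))
print(wasserstein_1d(nominal, shifted))
```

The sorted-sample form is used here because it needs no external libraries. With unequal sample sizes or weighted samples, a library routine such as `scipy.stats.wasserstein_distance` would be the more usual choice.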


Source journal

Frontiers in Robotics and AI (ROBOTICS)

CiteScore: 6.50

Self-citation rate: 5.90%

Articles published: 355

Review time: 14 weeks

About the journal: Frontiers in Robotics and AI publishes rigorously peer-reviewed research covering all theory and applications of robotics, technology, and artificial intelligence, from biomedical to space robotics.