UAV Maneuvering Decision-Making Algorithm Based on Deep Reinforcement Learning Under the Guidance of Expert Experience

Impact factor 1.9; Zone 3 (Computer Science); JCR Q3 (Automation & Control Systems)
Guang Zhan, Kun Zhang, Ke Li, Haiyin Piao
Journal: Journal of Systems Engineering and Electronics
Volume (as listed): 51 1
Published: 2024-04-23
Publication type: Journal Article
Open access: No
DOI: https://doi.org/10.23919/jsee.2024.000022
Citations: 0

Abstract

Autonomous unmanned aerial vehicle (UAV) manipulation is necessary for the defense department to execute tactical missions given by commanders in the future unmanned battlefield. A large amount of research has been devoted to improving the autonomous decision-making ability of UAVs in interactive environments, where finding the optimal maneuvering decision-making policy has become one of the key issues for enabling UAV intelligence. In this paper, we propose a maneuvering decision-making algorithm for autonomous air-delivery based on deep reinforcement learning under the guidance of expert experience. First, we refine the guidance-towards-area and guidance-towards-specific-point tasks of the air-delivery process based on traditional air-to-surface fire control methods. Next, we construct the UAV maneuvering decision-making model based on Markov decision processes (MDPs). We then present a reward shaping method for the two guidance tasks using a potential-based function and expert-guided advice. The proposed algorithm can accelerate the convergence of the maneuvering decision-making policy and increase the stability of the policy output during the later stage of the training process. The effectiveness of the proposed policy is illustrated by the curves of training parameters and by extensive experimental results from testing the trained policy.
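
The abstract names potential-based reward shaping but does not give the potential function itself. As a minimal sketch of the general technique only, the standard shaping term F(s, s') = γΦ(s') − Φ(s) can be added to the environment reward. The potential used here (negative Euclidean distance to a target point), the discount value, and all function names are illustrative assumptions, not the authors' formulation.

```python
import math

GAMMA = 0.99  # discount factor; an assumed value, not taken from the paper


def potential(position, target):
    """Hypothetical potential: negative Euclidean distance to the target point."""
    return -math.dist(position, target)


def shaped_reward(env_reward, position, next_position, target, gamma=GAMMA):
    """Return env_reward plus the shaping term F = gamma * phi(s') - phi(s)."""
    shaping = gamma * potential(next_position, target) - potential(position, target)
    return env_reward + shaping


if __name__ == "__main__":
    target = (0.0, 0.0)
    start = (10.0, 0.0)
    # A step towards the target earns a larger shaped reward than a step away.
    print(shaped_reward(0.0, start, (9.0, 0.0), target))
    print(shaped_reward(0.0, start, (11.0, 0.0), target))
```

Because the shaping term is a difference of potentials, its sum along a trajectory telescopes (exactly so when γ = 1), which is why this form of shaping is known to leave the optimal policy unchanged while giving denser feedback.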
Source journal
Journal of Systems Engineering and Electronics (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 4.10
Self-citation rate: 14.30%
Articles published: 131
Review time: 7.5 months