Enhancing UAV Aerial Docking: A Hybrid Approach Combining Offline and Online Reinforcement Learning

Drones | Pub Date: 2024-04-24 | DOI: 10.3390/drones8050168
Yuting Feng, Tao Yang, Yushu Yu
Citations: 0

Abstract

In our study, we explore docking maneuvers between two unmanned aerial vehicles (UAVs) using a combination of offline and online reinforcement learning (RL) methods. The task requires a UAV to accomplish external docking while maintaining stable flight control, which represents two distinct types of objectives at the task execution level. Direct online RL training could lead to catastrophic forgetting and, in turn, training failure. To overcome these challenges, we design a rule-based expert controller and use it to accumulate an extensive dataset. Based on this dataset, we design a series of rewards and train a guiding policy through offline RL. We then conduct comparative verification of different RL methods and ultimately select online RL to fine-tune the model trained offline. This strategy combines the efficiency of offline RL with the exploratory capabilities of online RL. Our approach raises the success rate of the UAV aerial docking task from 40% under the expert policy to 95%.
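The pipeline described in the abstract (rule-based expert, dataset collection, offline training, online fine-tuning) can be illustrated with a deliberately tiny sketch. This is not the authors' implementation: the 1D "docking" environment, the tabular Q-learning updates, and every name and hyperparameter below (`step`, `expert_action`, `collect_dataset`, `offline_q`, `online_finetune`, `success_rate`, `GAMMA`, the noise level) are assumptions chosen only to show how the stages connect.

```python
import random

N = 11              # hypothetical positions 0..10 on a line
GOAL = 10           # hypothetical docking position
ACTIONS = (-1, 1)   # move left / move right
GAMMA = 0.95        # discount factor (assumed)

def step(state, action_idx):
    """Toy dynamics: reward 1.0 on reaching GOAL, otherwise 0.0."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N - 1)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done

def expert_action(rng, noise=0.3):
    """Rule-based expert: head for the goal, with some random actions."""
    if rng.random() < noise:
        return rng.randrange(len(ACTIONS))
    return 1  # move right, toward GOAL

def collect_dataset(rng, episodes=200, horizon=30):
    """Roll out the expert and log (s, a, r, s', done) transitions."""
    data = []
    for _ in range(episodes):
        s = rng.randrange(N - 1)  # start anywhere except the goal
        for _ in range(horizon):
            a = expert_action(rng)
            s2, r, done = step(s, a)
            data.append((s, a, r, s2, done))
            s = s2
            if done:
                break
    return data

def greedy(q, s):
    """Index of the highest-valued action in state s."""
    return max(range(len(ACTIONS)), key=lambda a: q[s][a])

def offline_q(data, sweeps=50, lr=0.5):
    """Offline stage: Q-learning updates over the fixed dataset only."""
    q = [[0.0] * len(ACTIONS) for _ in range(N)]
    for _ in range(sweeps):
        for s, a, r, s2, done in data:
            target = r + (0.0 if done else GAMMA * max(q[s2]))
            q[s][a] += lr * (target - q[s][a])
    return q

def online_finetune(q, rng, episodes=200, horizon=30, eps=0.1, lr=0.1):
    """Online stage: start from the offline Q-table, explore epsilon-greedily."""
    q = [row[:] for row in q]  # leave the offline table untouched
    for _ in range(episodes):
        s = rng.randrange(N - 1)
        for _ in range(horizon):
            a = rng.randrange(len(ACTIONS)) if rng.random() < eps else greedy(q, s)
            s2, r, done = step(s, a)
            target = r + (0.0 if done else GAMMA * max(q[s2]))
            q[s][a] += lr * (target - q[s][a])
            s = s2
            if done:
                break
    return q

def success_rate(q, horizon=30):
    """Fraction of start states from which the greedy policy reaches GOAL."""
    wins = 0
    for start in range(N - 1):
        s = start
        for _ in range(horizon):
            s, _, done = step(s, greedy(q, s))
            if done:
                wins += 1
                break
    return wins / (N - 1)

if __name__ == "__main__":
    rng = random.Random(0)
    dataset = collect_dataset(rng)
    q_offline = offline_q(dataset)
    q_online = online_finetune(q_offline, rng)
    print("offline:", success_rate(q_offline), "online:", success_rate(q_online))
```

The toy problem is far too simple to reproduce the catastrophic-forgetting issue or the 40% to 95% improvement reported in the abstract; it only shows the order of the stages and that the online stage is initialized from the offline result rather than from scratch.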