利用深度强化学习进行无人机导航和目标拦截:级联奖励方法

Ali A. Darwish;Arie Nakhmani
{"title":"利用深度强化学习进行无人机导航和目标拦截:级联奖励方法","authors":"Ali A. Darwish;Arie Nakhmani","doi":"10.1109/JISPIN.2023.3334690","DOIUrl":null,"url":null,"abstract":"This article proposes an architecture for drone navigation and target interception, utilizing a self-supervised, model-free deep reinforcement learning approach. Unlike the traditional methods relying on complex controllers, our approach uses deep reinforcement learning with cascade rewards, enabling a single drone to navigate obstacles and intercept targets using only a forward-facing depth–RGB camera. This research has significant implications for robotics, as it demonstrates how complex tasks can be tackled using deep reinforcement learning. Our work encompasses three key contributions. First, we tackle the challenge of partial observability when employing nonlinear function approximators for learning stochastic policies. Second, we optimize the task of maximizing the overall expected reward. Finally, we develop a software library for training drones to track and intercept targets. Through our experiments, we demonstrated that our approach, incorporating cascade reward, outperforms state-of-the-art deep \n<italic>Q</i>\n-network algorithms in terms of learning policies. By leveraging our methodology, drones can successfully navigate complex indoor and outdoor environments and effectively intercept targets based on visual cues.","PeriodicalId":100621,"journal":{"name":"IEEE Journal of Indoor and Seamless Positioning and Navigation","volume":"1 ","pages":"130-140"},"PeriodicalIF":0.0000,"publicationDate":"2023-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10323488","citationCount":"0","resultStr":"{\"title\":\"Drone Navigation and Target Interception Using Deep Reinforcement Learning: A Cascade Reward Approach\",\"authors\":\"Ali A. Darwish;Arie Nakhmani\",\"doi\":\"10.1109/JISPIN.2023.3334690\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This article proposes an architecture for drone navigation and target interception, utilizing a self-supervised, model-free deep reinforcement learning approach. Unlike the traditional methods relying on complex controllers, our approach uses deep reinforcement learning with cascade rewards, enabling a single drone to navigate obstacles and intercept targets using only a forward-facing depth–RGB camera. This research has significant implications for robotics, as it demonstrates how complex tasks can be tackled using deep reinforcement learning. Our work encompasses three key contributions. First, we tackle the challenge of partial observability when employing nonlinear function approximators for learning stochastic policies. Second, we optimize the task of maximizing the overall expected reward. Finally, we develop a software library for training drones to track and intercept targets. Through our experiments, we demonstrated that our approach, incorporating cascade reward, outperforms state-of-the-art deep \\n<italic>Q</i>\\n-network algorithms in terms of learning policies. By leveraging our methodology, drones can successfully navigate complex indoor and outdoor environments and effectively intercept targets based on visual cues.\",\"PeriodicalId\":100621,\"journal\":{\"name\":\"IEEE Journal of Indoor and Seamless Positioning and Navigation\",\"volume\":\"1 \",\"pages\":\"130-140\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-11-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10323488\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Journal of Indoor and Seamless Positioning and Navigation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10323488/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Journal of Indoor and Seamless Positioning and Navigation","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10323488/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

本文提出了一种无人机导航和目标拦截的架构,利用自监督、无模型的深度强化学习方法。与依赖复杂控制器的传统方法不同,我们的方法使用具有级联奖励的深度强化学习,使单个无人机仅使用前向深度rgb相机即可导航障碍物并拦截目标。这项研究对机器人技术具有重要意义,因为它展示了如何使用深度强化学习来解决复杂的任务。我们的工作包括三个关键贡献。首先,我们在使用非线性函数逼近器学习随机策略时解决了部分可观察性的挑战。其次,我们优化了最大化总体预期奖励的任务。最后,我们开发了一个用于训练无人机跟踪和拦截目标的软件库。通过我们的实验,我们证明了我们的方法,结合级联奖励,在学习策略方面优于最先进的深度q -网络算法。利用我们的方法,无人机可以成功导航复杂的室内和室外环境,并根据视觉线索有效拦截目标。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Drone Navigation and Target Interception Using Deep Reinforcement Learning: A Cascade Reward Approach
This article proposes an architecture for drone navigation and target interception, utilizing a self-supervised, model-free deep reinforcement learning approach. Unlike the traditional methods relying on complex controllers, our approach uses deep reinforcement learning with cascade rewards, enabling a single drone to navigate obstacles and intercept targets using only a forward-facing depth–RGB camera. This research has significant implications for robotics, as it demonstrates how complex tasks can be tackled using deep reinforcement learning. Our work encompasses three key contributions. First, we tackle the challenge of partial observability when employing nonlinear function approximators for learning stochastic policies. Second, we optimize the task of maximizing the overall expected reward. Finally, we develop a software library for training drones to track and intercept targets. Through our experiments, we demonstrated that our approach, incorporating cascade reward, outperforms state-of-the-art deep Q -network algorithms in terms of learning policies. By leveraging our methodology, drones can successfully navigate complex indoor and outdoor environments and effectively intercept targets based on visual cues.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信