Drone Navigation and Target Interception Using Deep Reinforcement Learning: A Cascade Reward Approach

IEEE Journal of Indoor and Seamless Positioning and Navigation, vol. 1, pp. 130-140. Pub Date: 2023-11-20. DOI: 10.1109/JISPIN.2023.3334690

Ali A. Darwish; Arie Nakhmani


Citations: 0

Abstract



This article proposes an architecture for drone navigation and target interception, utilizing a self-supervised, model-free deep reinforcement learning approach. Unlike the traditional methods relying on complex controllers, our approach uses deep reinforcement learning with cascade rewards, enabling a single drone to navigate obstacles and intercept targets using only a forward-facing depth–RGB camera. This research has significant implications for robotics, as it demonstrates how complex tasks can be tackled using deep reinforcement learning. Our work encompasses three key contributions. First, we tackle the challenge of partial observability when employing nonlinear function approximators for learning stochastic policies. Second, we optimize the task of maximizing the overall expected reward. Finally, we develop a software library for training drones to track and intercept targets. Through our experiments, we demonstrated that our approach, incorporating cascade reward, outperforms state-of-the-art deep Q-network algorithms in terms of learning policies. By leveraging our methodology, drones can successfully navigate complex indoor and outdoor environments and effectively intercept targets based on visual cues.
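The abstract does not spell out how the cascade reward is constructed. As a rough illustration only, one possible reading of "cascade" is a staged reward in which each later term is granted only once the earlier stage is satisfied: avoid collisions first, then keep the target in view, then close the distance. The stage ordering, the thresholds and weights, and the names `Observation` and `cascade_reward` are all hypothetical and are not taken from the paper or its software library.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical per-step summary derived from the depth-RGB camera."""
    collided: bool            # drone hit an obstacle this step
    target_visible: bool      # target detected in the RGB frame
    target_distance_m: float  # estimated range to the target, from depth


def cascade_reward(obs: Observation,
                   intercept_radius_m: float = 0.5,
                   max_range_m: float = 20.0) -> float:
    """Illustrative staged reward; every constant here is an assumption."""
    # Stage 1: safety gates everything else.
    if obs.collided:
        return -1.0
    reward = 0.1  # small bonus for a collision-free step

    # Stage 2: tracking is rewarded only while flying safely.
    if not obs.target_visible:
        return reward
    reward += 0.2

    # Stage 3: approach is rewarded only while the target is in view.
    if obs.target_distance_m <= intercept_radius_m:
        return reward + 1.0  # interception bonus
    clipped = min(obs.target_distance_m, max_range_m)
    closeness = 1.0 - clipped / max_range_m  # 0 at max range, near 1 when close
    return reward + 0.5 * closeness
```

In this sketch a deep Q-network or any other model-free learner would receive `cascade_reward(obs)` as its scalar reward at each step; the gating means the agent cannot earn approach reward without first satisfying the earlier stages.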

