Reinforcement Learning for UAV Autonomous Navigation, Mapping and Target Detection

Anna Guerra, Francesco Guidi, D. Dardari, P. Djurić

2020 IEEE/ION Position, Location and Navigation Symposium (PLANS), April 2020. DOI: 10.1109/PLANS46316.2020.9110163
In this paper, we study a joint detection, mapping and navigation problem for a single unmanned aerial vehicle (UAV) equipped with a low-complexity radar and flying in an unknown environment. The goal is to optimize its trajectory so as to maximize the mapping accuracy while avoiding areas where measurements might not be sufficiently informative from the perspective of target detection. The problem is formulated as a Markov decision process (MDP) in which the UAV is an agent that runs both a state estimator, for target detection and environment mapping, and a reinforcement learning (RL) algorithm that infers its navigation policy (i.e., the control law). Numerical results show the feasibility of the proposed idea, highlighting the UAV's capability of autonomously exploring areas with a high probability of target detection while reconstructing the surrounding environment.
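To make the MDP formulation concrete, below is a minimal illustrative Python sketch of this kind of agent: the flight area is discretized into a grid whose cell index is the MDP state, a synthetic per-cell detection probability and map-uncertainty term stand in for the radar-based state estimator, and tabular Q-learning infers the navigation policy. The grid size, reward shaping, and all hyperparameters are assumptions made here for illustration and do not come from the paper, where the reward and state would instead be driven by actual radar measurements and the estimated environment map.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative grid-world stand-in for the unknown environment.
GRID = 10                                     # 10x10 discretization of the flight area
N_STATES = GRID * GRID                        # UAV cell index is the MDP state
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

# Hypothetical per-cell detection probability; in the paper this information
# would come from the radar-based state estimator, not from a random draw.
p_detect = rng.uniform(0.0, 1.0, size=(GRID, GRID))

def step(state, action_idx, uncertainty):
    """Move the UAV one cell; reward blends mapping gain and detection informativeness."""
    r, c = divmod(state, GRID)
    dr, dc = ACTIONS[action_idx]
    r = min(max(r + dr, 0), GRID - 1)
    c = min(max(c + dc, 0), GRID - 1)
    # Reward cells that are still uncertain (mapping gain) and likely to yield an
    # informative detection; penalize poorly informative areas (assumed shaping).
    reward = uncertainty[r, c] + p_detect[r, c] - 0.5
    uncertainty[r, c] *= 0.5                  # visiting a cell halves its map uncertainty
    return r * GRID + c, reward

# Tabular Q-learning to infer the navigation policy (the control law).
Q = np.zeros((N_STATES, len(ACTIONS)))
alpha, gamma, eps = 0.1, 0.95, 0.1            # assumed hyperparameters

for episode in range(500):
    uncertainty = np.ones((GRID, GRID))       # environment unexplored at take-off
    s = int(rng.integers(N_STATES))           # random take-off cell
    for _ in range(200):                      # finite-horizon episode
        a = int(rng.integers(len(ACTIONS))) if rng.random() < eps else int(np.argmax(Q[s]))
        s_next, rwd = step(s, a, uncertainty)
        Q[s, a] += alpha * (rwd + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next

greedy_policy = np.argmax(Q, axis=1)          # learned action index per grid cell
print(greedy_policy.reshape(GRID, GRID))
```

The printed array is the greedy action for each grid cell; under the assumed reward it steers the agent toward unexplored cells with high detection probability, which mirrors, in a toy setting, the trade-off between mapping accuracy and detection informativeness described in the abstract.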