Reinforcement Learning Method for Autonomous UAVs Monitoring an Uncertain Target

2021 Eighth International Conference on Social Network Analysis, Management and Security (SNAMS). Pub Date: 2021-12-06. DOI: 10.1109/SNAMS53716.2021.9732147

Mohannad Al-Hefnawi, Ala’eddin Masadeh, H. Salameh, A. Musa


Citations: 3

Abstract

Autonomous unmanned aerial vehicles (UAVs) can sense their surroundings and fly safely with little or no human intervention. They are characterized by their ability to make decisions by predicting possible future situations and learning from previous experience. In this paper, we aim to develop algorithms that enable a UAV to monitor and detect a dynamic uncertain target autonomously. This work considers a real monitoring system that consists of a mission area, an autonomous UAV, a charging station, and a dynamic uncertain target. The mission area consists of two main areas: the area where the charging station is placed and the area where the target moves. The target area is divided into a number of subareas. We also adopt a time-slotted system that has M equal-duration slots. The UAV is equipped with a battery of finite energy that can be recharged at the charging station. It can fly from one subarea to another during one time slot. The target moves from one subarea to another according to an unknown Markov process. In this context, we propose using reinforcement learning algorithms that enable autonomous UAVs to learn the movement of a dynamic uncertain target. Simulation results show that the reinforcement learning algorithms outperform random and circular algorithms.

This work was supported by the ASPIRE Award for Research Excellence Program 2020 (Abu Dhabi, UAE) under grant AARE20-161.
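
The system model in the abstract can be illustrated with a small simulation. The sketch below is not the authors' algorithm: the abstract does not name a specific reinforcement learning method, so tabular Q-learning is swapped in as one common choice. Every number and name here is an assumption made for illustration: four subareas, a battery that holds three flights, a forced return to the station when the battery is empty, a reward of 1 when the UAV shares a subarea with the target, and the transition matrix `P`, which stands in for the unknown Markov process and is never read by the learner.

```python
import random

N_SUBAREAS = 4           # hypothetical number of target subareas
STATION = N_SUBAREAS     # action/location index of the charging station
ACTIONS = range(N_SUBAREAS + 1)
B_MAX = 3                # hypothetical battery capacity, in flights

# Hypothetical target transition matrix; the learner never reads it.
P = [
    [0.2, 0.1, 0.6, 0.1],
    [0.1, 0.2, 0.6, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.6, 0.2],
]


def step(battery, target, action, rng):
    """Simulate one time slot and return (loc, battery, target, reward)."""
    if battery == 0 or action == STATION:
        loc, battery = STATION, B_MAX        # forced or chosen recharge
    else:
        loc, battery = action, battery - 1   # one flight costs one unit
    # The target moves according to the hidden Markov chain.
    target = rng.choices(range(N_SUBAREAS), weights=P[target])[0]
    return loc, battery, target, float(loc == target)


def greedy(q, state):
    """Pick the action with the highest Q-value in `state`."""
    return max(ACTIONS, key=lambda a: q[state][a])


def run(policy, slots, rng, q=None, alpha=0.05, gamma=0.9):
    """Run `policy` for `slots` slots; update `q` in place when it is given.

    Returns the fraction of slots in which the target was detected.
    """
    state, target, hits = (STATION, B_MAX), 0, 0.0
    for _ in range(slots):
        action = policy(state)
        loc, battery, target, reward = step(state[1], target, action, rng)
        nxt = (loc, battery)
        if q is not None:  # tabular Q-learning update
            td_target = reward + gamma * max(q[nxt])
            q[state][action] += alpha * (td_target - q[state][action])
        state, hits = nxt, hits + reward
    return hits / slots


def train_and_compare(seed=0, train_slots=100_000, eval_slots=20_000, eps=0.2):
    """Train with epsilon-greedy exploration, then compare against random."""
    rng = random.Random(seed)
    q = {(loc, b): [0.0] * len(ACTIONS)
         for loc in ACTIONS for b in range(B_MAX + 1)}

    def explore(state):
        if rng.random() < eps:
            return rng.choice(ACTIONS)
        return greedy(q, state)

    run(explore, train_slots, rng, q)
    learned = run(lambda s: greedy(q, s), eval_slots, rng)
    baseline = run(lambda s: rng.choice(ACTIONS), eval_slots, rng)
    return q, learned, baseline


if __name__ == "__main__":
    _, learned, baseline = train_and_compare()
    print(f"Q-learning detection rate: {learned:.3f}")
    print(f"Random detection rate:     {baseline:.3f}")
```

The state is deliberately minimal (UAV location and battery level), so the learner can only discover which subarea the target favours and when to recharge. A richer state, for example one that records where the target was last detected, would be needed to exploit the slot-to-slot structure of the Markov process. Under the assumptions above, at most three of every four slots can be spent in a subarea, which bounds the detection rate of any policy in this toy model.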

