Multirobot Coverage Path Planning Based on Deep Q-Network in Unknown Environment

J. Robotics · Pub Date: 2022-08-03 · DOI: 10.1155/2022/6825902
Wenhao Li, Tao Zhao, S. Dian
{"title":"Multirobot Coverage Path Planning Based on Deep Q-Network in Unknown Environment","authors":"Wenhao Li, Tao Zhao, S. Dian","doi":"10.1155/2022/6825902","DOIUrl":null,"url":null,"abstract":"Aiming at the problems of security, high repetition rate, and many restrictions of multirobot coverage path planning (MCPP) in an unknown environment, Deep Q-Network (DQN) is selected as a part of the method in this paper after considering its powerful approximation ability to the optimal action value function. Then, a deduction method and some environments handling methods are proposed to improve the performance of the decision-making stage. The deduction method assumes the movement direction of each robot and counts the reward value obtained by the robots in this way and then determines the actual movement directions combined with DQN. For these reasons, the whole algorithm is divided into two parts: offline training and online decision-making. Online decision-making relies on the sliding-view method and probability statistics to deal with the nonstandard size and unknown environments and the deduction method to improve the efficiency of coverage. Simulation results show that the performance of the proposed online method is close to that of the offline algorithm which needs long time optimization, and the proposed method is more stable as well. Some performance defects of current MCPP methods in an unknown environment are ameliorated in this study.","PeriodicalId":186435,"journal":{"name":"J. Robotics","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"J. Robotics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1155/2022/6825902","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2

Abstract

To address the safety concerns, high coverage-repetition rate, and numerous constraints of multirobot coverage path planning (MCPP) in an unknown environment, this paper adopts Deep Q-Network (DQN) as part of the proposed method, given its strong ability to approximate the optimal action-value function. A deduction method and several environment-handling methods are then proposed to improve performance in the decision-making stage. The deduction method hypothesizes a movement direction for each robot, accumulates the reward the robots would obtain by moving that way, and then determines the actual movement directions in combination with DQN. The overall algorithm is therefore divided into two parts: offline training and online decision-making. Online decision-making relies on the sliding-view method and probability statistics to handle maps of nonstandard size and unknown environments, and on the deduction method to improve coverage efficiency. Simulation results show that the proposed online method performs close to an offline algorithm that requires lengthy optimization, while also being more stable. Several performance shortcomings of current MCPP methods in unknown environments are thus ameliorated.
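The abstract describes the deduction step only at a high level. The minimal Python sketch below illustrates one way such a step could blend a trained DQN's Q-values with a deduced one-step coverage reward on a local sliding-view patch; the grid encoding, reward magnitudes, blending weight beta, and function names are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): combining a deduction-style
# one-step look-ahead reward with DQN Q-values to pick a robot's next move.
import numpy as np

FREE, COVERED, OBSTACLE = 0, 1, 2
ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # up, down, left, right

def deduction_reward(grid, pos, action):
    """Assume the robot moves one cell in the given direction and score the result."""
    r, c = pos[0] + ACTIONS[action][0], pos[1] + ACTIONS[action][1]
    if not (0 <= r < grid.shape[0] and 0 <= c < grid.shape[1]):
        return -1.0                              # leaving the local view is penalized
    if grid[r, c] == OBSTACLE:
        return -1.0                              # collision is penalized
    return 1.0 if grid[r, c] == FREE else -0.5   # new coverage vs. repeated coverage

def choose_action(grid, pos, q_values, beta=0.5):
    """Blend the DQN's Q-values with the deduced reward (weight beta is an assumption)."""
    scores = [q_values[a] + beta * deduction_reward(grid, pos, a) for a in ACTIONS]
    return int(np.argmax(scores))

# Toy usage: one robot at (1, 1) on a 4x4 sliding-view patch.
grid = np.zeros((4, 4), dtype=int)
grid[0, 1] = OBSTACLE
grid[1, 0] = COVERED
q_values = np.array([0.2, 0.1, 0.3, 0.4])        # stand-in output of the trained DQN
print(choose_action(grid, (1, 1), q_values))     # prefers a move onto an uncovered cell
```

In this sketch the deduced reward simply discourages collisions and repeated coverage while favoring uncovered cells, so it nudges the DQN's choice toward efficient coverage without replacing the learned policy.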