Path Planning of Cleaning Robot with Reinforcement Learning

2022 IEEE International Symposium on Robotic and Sensors Environments (ROSE), Pub Date: 2022-08-17, DOI: 10.1109/ROSE56499.2022.9977430

Woohyeon Moon, Bumgeun Park, Sarvar Hussain Nengroo, Taeyoung Kim, Dongsoo Har


Citations: 6

Abstract


Path Planning of Cleaning Robot with Reinforcement Learning

As demand for cleaning robots has steadily increased, household electricity consumption has increased as well. To address this, efficient path planning for cleaning robots has become important, and many studies have been conducted. However, most of them concern movement along a simple path segment rather than the whole path needed to clean every location. Reinforcement learning (RL), an emerging deep learning technique, has been adopted for cleaning robots. However, RL models operate only in a specific cleaning environment rather than across various environments, so they must be retrained whenever the cleaning environment changes. To solve this problem, the proximal policy optimization (PPO) algorithm is combined with efficient path planning that operates in various cleaning environments, using transfer learning (TL), detection of the nearest cleaned tile, reward shaping, and elite-set construction. The proposed method is validated with an ablation study and a comparison with conventional methods such as random and zigzag. The experimental results demonstrate that the proposed method achieves improved training performance and faster convergence than the original PPO, and that it outperforms the conventional methods (random, zigzag).
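To make the zigzag baseline and the idea of reward shaping concrete, the following is a minimal sketch, not the authors' implementation. It assumes an obstacle-free rectangular grid of tiles. The names `zigzag_path`, `shaped_reward`, and `episode_return`, and the values `clean_bonus=1.0` and `step_cost=0.1`, are hypothetical and do not come from the paper.

```python
from typing import Iterable, List, Set, Tuple

Cell = Tuple[int, int]  # (row, column) of a tile


def zigzag_path(rows: int, cols: int) -> List[Cell]:
    """Visit every tile of an obstacle-free rows x cols grid in zigzag order."""
    path: List[Cell] = []
    for r in range(rows):
        # Even rows sweep left to right, odd rows sweep right to left,
        # so consecutive tiles are always adjacent.
        col_order = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        path.extend((r, c) for c in col_order)
    return path


def shaped_reward(cell: Cell, cleaned: Set[Cell],
                  clean_bonus: float = 1.0, step_cost: float = 0.1) -> float:
    """Hypothetical shaped reward: bonus for a newly cleaned tile, minus a per-step cost."""
    bonus = clean_bonus if cell not in cleaned else 0.0
    return bonus - step_cost


def episode_return(path: Iterable[Cell]) -> float:
    """Sum the shaped reward along a path, marking tiles as cleaned when visited."""
    cleaned: Set[Cell] = set()
    total = 0.0
    for cell in path:
        total += shaped_reward(cell, cleaned)
        cleaned.add(cell)
    return total


if __name__ == "__main__":
    path = zigzag_path(3, 4)
    print(path)
    print(episode_return(path))
```

Under these assumed reward values, a path that revisits tiles pays the step cost without earning the cleaning bonus, which is the kind of signal a learned policy could use to prefer full-coverage paths with few repeats.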

