Path Planning of Cleaning Robot with Reinforcement Learning

Woohyeon Moon, Bumgeun Park, Sarvar Hussain Nengroo, Taeyoung Kim, Dongsoo Har
{"title":"基于强化学习的清扫机器人路径规划","authors":"Woohyeon Moon, Bumgeun Park, Sarvar Hussain Nengroo, Taeyoung Kim, Dongsoo Har","doi":"10.1109/ROSE56499.2022.9977430","DOIUrl":null,"url":null,"abstract":"Recently, as the demand for cleaning robots has steadily increased, therefore household electricity consumption is also increasing. To solve this electricity consumption issue, the problem of efficient path planning for cleaning robot has become important and many studies have been conducted. However, most of them are about moving along a simple path segment, not about the whole path to clean all places. As the emerging deep learning technique, reinforcement learning (RL) has been adopted for cleaning robot. However, the models for RL operate only in a specific cleaning environment, not the various cleaning environment. The problem is that the models have to retrain whenever the cleaning environment changes. To solve this problem, the proximal policy optimization (PPO) algorithm is combined with an efficient path planning that operates in various cleaning environments, using transfer learning (TL), detection nearest cleaned tile, reward shaping, and making elite set methods. The proposed method is validated with an ablation study and comparison with conventional methods such as random and zigzag. The experimental results demonstrate that the proposed method achieves improved training performance and increased convergence speed over the original PPO. And it also demonstrates that this proposed method is better performance than conventional methods (random, zigzag).","PeriodicalId":265529,"journal":{"name":"2022 IEEE International Symposium on Robotic and Sensors Environments (ROSE)","volume":"506 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Path Planning of Cleaning Robot with Reinforcement Learning\",\"authors\":\"Woohyeon Moon, Bumgeun Park, Sarvar Hussain Nengroo, Taeyoung Kim, Dongsoo Har\",\"doi\":\"10.1109/ROSE56499.2022.9977430\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recently, as the demand for cleaning robots has steadily increased, therefore household electricity consumption is also increasing. To solve this electricity consumption issue, the problem of efficient path planning for cleaning robot has become important and many studies have been conducted. However, most of them are about moving along a simple path segment, not about the whole path to clean all places. As the emerging deep learning technique, reinforcement learning (RL) has been adopted for cleaning robot. However, the models for RL operate only in a specific cleaning environment, not the various cleaning environment. The problem is that the models have to retrain whenever the cleaning environment changes. To solve this problem, the proximal policy optimization (PPO) algorithm is combined with an efficient path planning that operates in various cleaning environments, using transfer learning (TL), detection nearest cleaned tile, reward shaping, and making elite set methods. The proposed method is validated with an ablation study and comparison with conventional methods such as random and zigzag. The experimental results demonstrate that the proposed method achieves improved training performance and increased convergence speed over the original PPO. 
And it also demonstrates that this proposed method is better performance than conventional methods (random, zigzag).\",\"PeriodicalId\":265529,\"journal\":{\"name\":\"2022 IEEE International Symposium on Robotic and Sensors Environments (ROSE)\",\"volume\":\"506 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-08-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE International Symposium on Robotic and Sensors Environments (ROSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ROSE56499.2022.9977430\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Symposium on Robotic and Sensors Environments (ROSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ROSE56499.2022.9977430","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

Abstract

Recently, demand for cleaning robots has steadily increased, and household electricity consumption has risen with it. To address this consumption, efficient path planning for cleaning robots has become an important problem and has been widely studied. However, most existing work concerns moving along a single path segment rather than planning a complete path that covers every place to be cleaned. Reinforcement learning (RL), an emerging deep learning technique, has been adopted for cleaning robots, but existing RL models operate only in one specific cleaning environment rather than across varied environments: the model must be retrained whenever the environment changes. To solve this problem, the proximal policy optimization (PPO) algorithm is combined with an efficient path planning method that operates in various cleaning environments, using transfer learning (TL), detection of the nearest cleaned tile, reward shaping, and construction of an elite set. The proposed method is validated with an ablation study and compared with conventional methods such as random and zigzag coverage. The experimental results demonstrate that the proposed method achieves better training performance and faster convergence than the original PPO, and that it outperforms the conventional methods (random, zigzag).
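The abstract names the building blocks (PPO, transfer learning, nearest-tile detection, reward shaping, an elite set) without defining them. As an illustration only, the Python sketch below pairs the standard PPO clipped surrogate loss with one plausible reading of the nearest-tile and reward-shaping ideas on a tile grid; the grid encoding, the BFS search target (the nearest uncleaned tile), and all reward constants are assumptions of this sketch, not details taken from the paper.

import numpy as np
from collections import deque

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate objective (Schulman et al., 2017).
    ratio = pi_new(a|s) / pi_old(a|s); returns the loss to minimize."""
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return -np.minimum(unclipped, clipped).mean()

# Grid encoding assumed by this sketch:
# 0 = uncleaned tile, 1 = cleaned tile, -1 = obstacle.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def nearest_tile_distance(grid, pos, target_state=0):
    """BFS distance from pos to the nearest tile in target_state.
    Stands in for the paper's nearest-tile detection, whose exact
    definition is not given in the abstract."""
    h, w = len(grid), len(grid[0])
    seen = {pos}
    queue = deque([(pos, 0)])
    while queue:
        (r, c), d = queue.popleft()
        if grid[r][c] == target_state:
            return d
        for dr, dc in MOVES:
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w
                    and grid[nr][nc] != -1 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), d + 1))
    return None  # no reachable tile in target_state

def shaped_reward(grid, old_pos, new_pos, cleaned_new_tile):
    """Illustrative reward shaping: a base reward for cleaning plus a
    bonus for moving toward the nearest uncleaned tile. The constants
    and the shaping term are assumptions, not values from the paper."""
    reward = 1.0 if cleaned_new_tile else -0.1  # step cost deters revisits
    d_old = nearest_tile_distance(grid, old_pos, target_state=0)
    d_new = nearest_tile_distance(grid, new_pos, target_state=0)
    if d_old is not None and d_new is not None:
        reward += 0.5 * (d_old - d_new)  # positive when approaching dirt
    return reward

if __name__ == "__main__":
    grid = [[1, 1, 0],   # robot just moved (0, 0) -> (0, 1) and cleaned it
            [1, -1, 0],
            [1, 1, 0]]
    print(shaped_reward(grid, (0, 0), (0, 1), cleaned_new_tile=True))  # 1.5
    print(ppo_clip_loss(np.array([1.3, 0.7]), np.array([1.0, -1.0])))  # -0.2

In a full agent, a shaped reward of this kind would replace the sparse coverage reward during PPO rollouts; the distance-difference bonus is a common potential-style shaping trick that densifies the learning signal without changing which paths achieve full coverage.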