Jiaqi Wang, Huiyan Han, Xie Han, Liqun Kuang, Xiaowen Yang
{"title":"Reinforcement learning path planning method incorporating multi-step Hindsight Experience Replay for lightweight robots","authors":"Jiaqi Wang, Huiyan Han, Xie Han, Liqun Kuang, Xiaowen Yang","doi":"10.1016/j.displa.2024.102796","DOIUrl":null,"url":null,"abstract":"<div><p>Home service robots prioritize cost-effectiveness and convenience over the precision required for industrial tasks like autonomous driving, making their task execution more easily. Meanwhile, path planning tasks using Deep Reinforcement Learning(DRL) are commonly sparse reward problems with limited data utilization, posing challenges in obtaining meaningful rewards during training, consequently resulting in slow or challenging training. In response to these challenges, our paper introduces a lightweight end-to-end path planning algorithm employing with hindsight experience replay(HER). Initially, we optimize the reinforcement learning training process from scratch and map the complex high-dimensional action space and state space to the representative low-dimensional action space. At the same time, we improve the network structure to decouple the model navigation and obstacle avoidance module to meet the requirements of lightweight. Subsequently, we integrate HER and curriculum learning (CL) to tackle issues related to inefficient training. Additionally, we propose a multi-step hindsight experience replay (MS-HER) specifically for the path planning task, markedly enhancing both training efficiency and model generalization across diverse environments. To substantiate the enhanced training efficiency of the refined algorithm, we conducted tests within diverse Gazebo simulation environments. Results of the experiments reveal noteworthy enhancements in critical metrics, including success rate and training efficiency. To further ascertain the enhanced algorithm’s generalization capability, we evaluate its performance in some ”never-before-seen” simulation environment. Ultimately, we deploy the trained model onto a real lightweight robot for validation. The experimental outcomes indicate the model’s competence in successfully executing the path planning task, even on a small robot with constrained computational resources.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"84 ","pages":"Article 102796"},"PeriodicalIF":3.7000,"publicationDate":"2024-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Displays","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0141938224001604","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0
Abstract
Home service robots prioritize cost-effectiveness and convenience over the precision required for industrial tasks like autonomous driving, making their task execution more easily. Meanwhile, path planning tasks using Deep Reinforcement Learning(DRL) are commonly sparse reward problems with limited data utilization, posing challenges in obtaining meaningful rewards during training, consequently resulting in slow or challenging training. In response to these challenges, our paper introduces a lightweight end-to-end path planning algorithm employing with hindsight experience replay(HER). Initially, we optimize the reinforcement learning training process from scratch and map the complex high-dimensional action space and state space to the representative low-dimensional action space. At the same time, we improve the network structure to decouple the model navigation and obstacle avoidance module to meet the requirements of lightweight. Subsequently, we integrate HER and curriculum learning (CL) to tackle issues related to inefficient training. Additionally, we propose a multi-step hindsight experience replay (MS-HER) specifically for the path planning task, markedly enhancing both training efficiency and model generalization across diverse environments. To substantiate the enhanced training efficiency of the refined algorithm, we conducted tests within diverse Gazebo simulation environments. Results of the experiments reveal noteworthy enhancements in critical metrics, including success rate and training efficiency. To further ascertain the enhanced algorithm’s generalization capability, we evaluate its performance in some ”never-before-seen” simulation environment. Ultimately, we deploy the trained model onto a real lightweight robot for validation. The experimental outcomes indicate the model’s competence in successfully executing the path planning task, even on a small robot with constrained computational resources.
期刊介绍:
Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface.
Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals intended for display technologies and human factor engineers new to the field will also occasionally featured.