Yan Gao, Jing Wu, Changyun Wei, Raphael Grech, Ze Ji
Title: Deep Reinforcement Learning for Localisability-Aware Mapless Navigation
DOI: 10.1049/csy2.70018
Journal: IET Cybersystems and Robotics, vol. 7, no. 1 (JCR Q3, Automation & Control Systems, IF 1.5)
Publication date: 2025-07-01 (Journal Article)
URL: https://onlinelibrary.wiley.com/doi/10.1049/csy2.70018
Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70018
Citations: 0
Abstract
Mapless navigation refers to the task of searching for a collision-free path without relying on a pre-defined map. Most existing mapless navigation methods assume that accurate ground-truth localisation is available. However, this assumption rarely holds, especially in indoor environments, where simultaneous localisation and mapping (SLAM) is needed for pose estimation and its accuracy depends heavily on the richness of environmental features. In this work, we propose a novel deep reinforcement learning (DRL) based mapless navigation method that does not rely on the assumption that localisation is available. Our method uses RGB-D based Oriented FAST and Rotated BRIEF (ORB) SLAM2 for robot localisation. The learnt policy guides the robot's movement towards the target while improving pose estimation by accounting for the quality of the features observed along the selected paths. To facilitate policy training, we propose a compact state representation based on the spatial distribution of map points, which enhances the robot's awareness of areas with reliable map points. Furthermore, we incorporate the relative pose error into the reward function, making the policy more responsive to each individual action. In addition, rather than using a pre-set threshold, we adopt a dynamic threshold to improve the policy's adaptability to variations in SLAM performance across environments. Experiments in localisation-challenging environments demonstrate the strong performance of the proposed method: it outperforms related DRL-based methods in terms of success rate.
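The abstract's three ingredients — a compact state built from the spatial distribution of SLAM map points, a reward shaped by the relative pose error, and a dynamically adapted error threshold — can be illustrated with a minimal sketch. Everything below is a hypothetical reading of the abstract, not the authors' implementation: the polar-bin histogram, the weights `w_goal`/`w_loc`, and the mean-plus-k-sigma threshold rule are all assumptions for illustration.

```python
import numpy as np

def map_point_histogram(map_points, robot_pose, n_sectors=8, n_rings=3, max_range=5.0):
    """Hypothetical compact state: count map points in polar bins around
    the robot, so the policy can sense feature-rich (well-localisable) areas."""
    x, y, theta = robot_pose
    hist = np.zeros((n_rings, n_sectors))
    for px, py in map_points:
        dx, dy = px - x, py - y
        r = np.hypot(dx, dy)
        if r >= max_range:
            continue  # point outside the sensing range is ignored
        angle = (np.arctan2(dy, dx) - theta) % (2 * np.pi)  # robot-relative bearing
        ring = int(r / max_range * n_rings)
        sector = int(angle / (2 * np.pi) * n_sectors)
        hist[ring, sector] += 1
    # normalise so the representation is scale-free across environments
    return (hist / (hist.max() + 1e-6)).flatten()

def step_reward(goal_dist_prev, goal_dist, pose_error, error_threshold,
                w_goal=1.0, w_loc=0.5):
    """Hypothetical shaped reward: progress towards the goal, minus a
    penalty when the per-step relative pose error exceeds the threshold,
    so every single action is scored for its localisation cost."""
    reward = w_goal * (goal_dist_prev - goal_dist)
    if pose_error > error_threshold:
        reward -= w_loc * (pose_error - error_threshold)
    return reward

def update_threshold(recent_errors, k=1.5):
    """Hypothetical dynamic threshold: adapt to the SLAM error statistics
    of the current environment instead of using a fixed pre-set value."""
    errors = np.asarray(recent_errors)
    return float(errors.mean() + k * errors.std())
```

For example, a step that closes 0.5 m of goal distance with a pose error safely under the threshold earns `step_reward(2.0, 1.5, 0.1, 0.2) == 0.5`, while the same step with a 0.3 m error is penalised down to 0.45; the threshold itself floats with the running error statistics of the environment.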