Object Detection-Based Reinforcement Learning for Autonomous Point-to-Point Navigation

Authors: Tyrell Lewis, Alexander Ibarra, M. Jamshidi
Published in: 2022 World Automation Congress (WAC), 2022-10-11
DOI: https://doi.org/10.23919/WAC55640.2022.9934448
Autonomous navigation is a fundamental area of research for real-world mobile robotic applications, with widespread utility across many industries, from warehouse package delivery to residential cleaning services. Because of the complex nature of the robot’s environment, several challenges have prevented the effective implementation of reinforcement learning-based algorithms trained in simulation. Difficulties can arise when the virtual environment lacks the sophistication to represent such a large and complex state space based on data-heavy sensor observations. In addition, the variance in Markov decision process (MDP) representations across related studies undermines fair comparison of their performance and repeatability. This study finds that the design of the reward function used for training a vision-based mobile agent to perform collision-free point-goal navigation in simulation plays a significant role in overall performance. A novel approach is introduced in which an additional reward, scaled according to prediction confidence, is granted for successfully detecting a target object. This strategy was found to significantly improve point-goal navigation behavior compared to the simpler reward function designs seen in related studies.
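To make the reward design idea concrete, the following is a minimal Python sketch of a point-goal reward with an added detection term scaled by prediction confidence. It is not the authors' implementation: the abstract gives no formula, so the `StepInfo` fields, the function `shaped_reward`, the progress and step-cost terms, and every weight and bonus value here are hypothetical choices made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepInfo:
    """Hypothetical per-step summary produced by a simulator (assumed fields)."""
    prev_goal_distance: float  # distance to the goal before the action
    goal_distance: float       # distance to the goal after the action
    collided: bool             # True if the agent hit an obstacle
    reached_goal: bool         # True if the agent is within the goal radius
    detection_confidence: Optional[float] = None  # None if no target detected


def shaped_reward(
    info: StepInfo,
    *,
    progress_weight: float = 1.0,     # assumed value
    detection_weight: float = 0.5,    # assumed value
    goal_bonus: float = 10.0,         # assumed value
    collision_penalty: float = -10.0, # assumed value
    step_cost: float = -0.01,         # assumed value
) -> float:
    """Return a scalar reward for one environment step."""
    # Terminal outcomes dominate everything else.
    if info.collided:
        return collision_penalty
    if info.reached_goal:
        return goal_bonus

    # Small time penalty plus reward for moving closer to the goal.
    reward = step_cost
    reward += progress_weight * (info.prev_goal_distance - info.goal_distance)

    # Detection term: granted only when the target object is detected,
    # and scaled by the detector's confidence clamped to [0, 1].
    if info.detection_confidence is not None:
        confidence = min(max(info.detection_confidence, 0.0), 1.0)
        reward += detection_weight * confidence

    return reward


if __name__ == "__main__":
    with_detection = StepInfo(2.0, 1.5, False, False, detection_confidence=0.8)
    without_detection = StepInfo(2.0, 1.5, False, False)
    print(shaped_reward(with_detection))
    print(shaped_reward(without_detection))
```

Under these assumed weights, the same forward step earns a larger reward when the target is detected with high confidence than when it is not, which is the kind of contrast with a simpler distance-only reward that the abstract describes.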