Autonomous mobile robot navigation in uncertain dynamic environments based on deep reinforcement learning
Zhangfan Lu, Ran Huang
2021 IEEE International Conference on Real-time Computing and Robotics (RCAR), July 15, 2021
DOI: 10.1109/RCAR52367.2021.9517635
In this paper, we study autonomous end-to-end navigation for wheeled robots based on deep reinforcement learning (DRL) in an unknown environment without an a priori map. The DRL network builds on the deep deterministic policy gradient (DDPG) algorithm combined with long short-term memory (LSTM). The network's inputs are 2D lidar data and the robot's position relative to the target point; its outputs are the linear and angular velocities that actuate the robot. A novel reward function is proposed to avoid collisions with dynamic obstacles and to generate a smooth trajectory for the robot. The network is trained without supervision in an unknown dynamic environment, and random Gaussian noise is added to the LSTM input data to avoid local optima. In addition, diverse unstructured environments are included in the training to increase the robustness of the developed network. Experiments on a public dataset show that the developed network enables the robot to navigate unstructured environments safely and outperforms several DRL methods.
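The abstract does not give the reward function itself, only its goals: penalize collisions with obstacles, reward reaching the target, and encourage a smooth trajectory. A minimal sketch of how such a shaped reward might look, with all thresholds and weights hypothetical (the paper's actual formulation may differ):

```python
def shaped_reward(dist_to_goal, prev_dist_to_goal, min_lidar_dist,
                  angular_velocity, goal_threshold=0.3, collision_threshold=0.2):
    """Hypothetical shaped reward for lidar-based navigation.

    Combines a large collision penalty, a large goal bonus, a dense
    progress term (positive when the robot moves toward the goal), and
    a smoothness penalty on large angular velocity. All constants are
    illustrative, not taken from the paper.
    """
    if min_lidar_dist < collision_threshold:      # too close to an obstacle: collision
        return -10.0
    if dist_to_goal < goal_threshold:             # within the goal region
        return 10.0
    progress = prev_dist_to_goal - dist_to_goal   # > 0 when approaching the goal
    smoothness_penalty = 0.1 * abs(angular_velocity)
    return 2.0 * progress - smoothness_penalty
```

At each control step the agent would call this with the current and previous goal distances, the minimum range in the lidar scan, and the commanded angular velocity; the dense progress term keeps the reward informative between the sparse collision and goal events.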