Quadrotor Mapless Navigation in Static and Dynamic Environments based on Deep Reinforcement Learning

2021 3rd International Conference on Industrial Artificial Intelligence (IAI) Pub Date: 2021-11-08 DOI: 10.1109/IAI53119.2021.9619200

Tsung-Hsi Tsai, Qing Li


Citations: 0

Abstract

In this paper, we propose a mapless autonomous navigation planner that plans collision-free trajectories for a quadrotor without any manual operation. Deep Reinforcement Learning (DRL) optimizes the policy by trial and error, without prior information about the environment. The designed reward function yields better convergence than the benchmark method. The learned policy generates collision-free trajectories in real time and can cope with dynamic obstacles in different scenarios. The evaluation results show that the trained model can be applied directly to unknown environments without retraining the agent.
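As a rough illustration of the kind of reward shaping the abstract alludes to, the sketch below shows a generic goal-progress reward with collision and arrival terms. It is not the authors' reward function, which the abstract does not specify. The function name `shaped_reward`, all constants, and the range-sensor input are assumptions made purely for illustration.

```python
import math

# Hypothetical constants, chosen only for illustration (not from the paper).
GOAL_RADIUS = 0.5        # metres: closer than this counts as arrival
COLLISION_RADIUS = 0.3   # metres: a range reading below this counts as a crash
R_ARRIVE = 10.0          # terminal reward for reaching the goal
R_COLLIDE = -10.0        # terminal penalty for a collision
PROGRESS_GAIN = 1.0      # weight on per-step progress toward the goal
STEP_PENALTY = -0.01     # small cost per step to discourage loitering


def shaped_reward(prev_pos, pos, goal, ranges):
    """Return (reward, done) for one transition.

    prev_pos, pos, goal: (x, y, z) tuples in metres.
    ranges: non-empty sequence of obstacle distances in metres
            (assumed to come from some range sensor).
    """
    # Collision check comes first so a crash near the goal is still penalized.
    if min(ranges) < COLLISION_RADIUS:
        return R_COLLIDE, True
    dist = math.dist(pos, goal)
    if dist < GOAL_RADIUS:
        return R_ARRIVE, True
    # Dense term: positive when this step reduced the distance to the goal.
    progress = math.dist(prev_pos, goal) - dist
    return PROGRESS_GAIN * progress + STEP_PENALTY, False
```

A dense progress term like this is one common way to speed up convergence compared with a sparse arrival-only reward, and because it depends only on local sensor readings and the relative goal position, it needs no map of the environment.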
