Obstacles Avoidance of Self-driving Vehicle using Deep Reinforcement Learning

2021 31st International Conference on Computer Theory and Applications (ICCTA) Pub Date: 2021-12-11 DOI: 10.1109/ICCTA54562.2021.9916640

Mahmoud Osama Radwan, Ahmed Ahmed Hesham Sedky, K. Mahar


Citations: 1

Abstract

Self-driving vehicles now offer various functions that let the vehicle perform certain actions by itself while the driver only monitors it. However, acquiring real-world training data for self-driving artificial intelligence algorithms is difficult because of the risks involved and the need for labeled data. This paper proposes a method for collecting training data from the Unity game engine’s Machine Learning Toolkit (ML-Agents Toolkit). With this toolkit, Unity lets its users apply Reinforcement Learning (RL) algorithms to train a learning agent. The aim of this paper is to find the best RL algorithm for training a self-driving vehicle to avoid obstacles in a 3D environment. In all study cases, learning was carried out with two RL algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), each using both single-instance and multi-instance training. For data collection from the virtual environment, two types of sensors were compared experimentally: camera sensors and Light Detection and Ranging (LiDAR) sensors. The results show the advantages and limitations of these learning algorithms for learning behaviors, as well as the importance of the demonstrations provided to the learning algorithms. Experimental results from applying the virtual driving data to drive a vehicle show the effectiveness of the proposed methodology.
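For readers unfamiliar with PPO, one of the two algorithms named in the abstract, the following is a minimal sketch of its standard clipped surrogate objective. It is not the authors' code and not the ML-Agents implementation. The function name `ppo_clipped_objective`, its parameters, and the default `epsilon=0.2` are illustrative assumptions.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, epsilon=0.2):
    """Mean PPO clipped surrogate objective over a batch (a quantity to maximize).

    All names and the default epsilon are illustrative assumptions,
    not values taken from the paper.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - epsilon, 1 + epsilon].
        clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

When the new and old policies assign the same log-probabilities, the ratio is 1 and the objective reduces to the mean advantage. Clipping limits how much a single update can gain from moving the policy far from the one that collected the data.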


