A Hierarchical Autonomous Driving Framework Combining Reinforcement Learning and Imitation Learning

Zeyu Li
{"title":"结合强化学习和模仿学习的分层自动驾驶框架","authors":"Zeyu Li","doi":"10.1109/ICCEA53728.2021.00084","DOIUrl":null,"url":null,"abstract":"Autonomous driving technology aims to make driving decisions based on information about the vehicle’s environment. Navigation-based autonomous driving in urban scenarios has more complex scenarios than in relatively simple scenarios such as highways and parking lots, and is a task that still needs to be explored over time. Imitation learning models based on supervised learning methods are limited by the amount of expert data collected. Models based on reinforcement learning methods are able to interact with the environment, but are data inefficient and require a lot of exploration to learn effective policy. We propose a method that combines imitation learning with reinforcement learning enabling agent to achieve a higher success rate in urban autonomous driving navigation scenarios. To solve the problem of inefficient reinforcement learning data, our method decomposes the action space into low-level action space and high-level actin space, where low-level action space is multiple pre-trained imitation learning action space is a combination of several pre-trained imitation learning action spaces based on different control signals (i.e., follow, straight, turn right, turn left). High-level action space includes different control signals, the agent executes a specific imitation learning policy by selecting control signals from the high-level action space through a DQN-based reinforcement learning approach. Moreover, we propose a new reward for high level action selection. Experiments on the CARLA driving benchmark demonstrate that our approach outperforms both imitation learning methods and reinforcement learning methods on a variety of navigation-based driving tasks.","PeriodicalId":325790,"journal":{"name":"2021 International Conference on Computer Engineering and Application (ICCEA)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"A Hierarchical Autonomous Driving Framework Combining Reinforcement Learning and Imitation Learning\",\"authors\":\"Zeyu Li\",\"doi\":\"10.1109/ICCEA53728.2021.00084\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Autonomous driving technology aims to make driving decisions based on information about the vehicle’s environment. Navigation-based autonomous driving in urban scenarios has more complex scenarios than in relatively simple scenarios such as highways and parking lots, and is a task that still needs to be explored over time. Imitation learning models based on supervised learning methods are limited by the amount of expert data collected. Models based on reinforcement learning methods are able to interact with the environment, but are data inefficient and require a lot of exploration to learn effective policy. We propose a method that combines imitation learning with reinforcement learning enabling agent to achieve a higher success rate in urban autonomous driving navigation scenarios. 
To solve the problem of inefficient reinforcement learning data, our method decomposes the action space into low-level action space and high-level actin space, where low-level action space is multiple pre-trained imitation learning action space is a combination of several pre-trained imitation learning action spaces based on different control signals (i.e., follow, straight, turn right, turn left). High-level action space includes different control signals, the agent executes a specific imitation learning policy by selecting control signals from the high-level action space through a DQN-based reinforcement learning approach. Moreover, we propose a new reward for high level action selection. Experiments on the CARLA driving benchmark demonstrate that our approach outperforms both imitation learning methods and reinforcement learning methods on a variety of navigation-based driving tasks.\",\"PeriodicalId\":325790,\"journal\":{\"name\":\"2021 International Conference on Computer Engineering and Application (ICCEA)\",\"volume\":\"49 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 International Conference on Computer Engineering and Application (ICCEA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCEA53728.2021.00084\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Computer Engineering and Application (ICCEA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCEA53728.2021.00084","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4

Abstract

Autonomous driving technology aims to make driving decisions based on information about the vehicle's environment. Navigation-based autonomous driving in urban environments involves far more complex scenarios than relatively simple settings such as highways and parking lots, and remains a task that requires further exploration. Imitation learning models based on supervised learning are limited by the amount of expert data that can be collected. Models based on reinforcement learning can interact with the environment, but are data-inefficient and require extensive exploration to learn an effective policy. We propose a method that combines imitation learning with reinforcement learning, enabling the agent to achieve a higher success rate in urban autonomous driving navigation scenarios. To address the data inefficiency of reinforcement learning, our method decomposes the action space into a low-level action space and a high-level action space. The low-level action space is a combination of several pre-trained imitation learning policies, each conditioned on a different control signal (i.e., follow, straight, turn right, turn left). The high-level action space consists of these control signals; the agent selects a control signal through a DQN-based reinforcement learning approach and executes the corresponding imitation learning policy. Moreover, we propose a new reward function for high-level action selection. Experiments on the CARLA driving benchmark demonstrate that our approach outperforms both imitation learning methods and reinforcement learning methods on a variety of navigation-based driving tasks.
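
To make the hierarchical decomposition concrete, the following is a minimal PyTorch sketch of the agent structure the abstract describes: a DQN-style high-level policy scores the four control signals, and the selected signal dispatches to a pre-trained imitation learning policy that produces the low-level driving action. The paper's implementation is not published here, so all class names, network sizes, and the epsilon-greedy selection are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch of the hierarchical agent described in the abstract.
# A DQN-style high-level policy picks one of four control signals; a
# pre-trained imitation learning (IL) policy conditioned on that signal
# outputs the low-level driving action. Names and sizes are hypothetical.

import random
import torch
import torch.nn as nn

CONTROL_SIGNALS = ["follow", "straight", "turn_right", "turn_left"]


class HighLevelQNetwork(nn.Module):
    """Q-network over the discrete high-level (control-signal) action space."""

    def __init__(self, obs_dim: int, num_signals: int = len(CONTROL_SIGNALS)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, num_signals),  # one Q-value per control signal
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


class HierarchicalAgent:
    """Selects a control signal with epsilon-greedy DQN, then delegates to
    the pre-trained IL policy associated with that signal."""

    def __init__(self, q_net: HighLevelQNetwork, il_policies: dict,
                 epsilon: float = 0.1):
        self.q_net = q_net
        self.il_policies = il_policies  # maps signal name -> pre-trained IL policy
        self.epsilon = epsilon

    def act(self, obs: torch.Tensor):
        # High level: choose a control signal (epsilon-greedy over Q-values).
        if random.random() < self.epsilon:
            signal = random.choice(CONTROL_SIGNALS)
        else:
            with torch.no_grad():
                signal = CONTROL_SIGNALS[self.q_net(obs).argmax().item()]
        # Low level: the IL policy for that signal outputs the driving command
        # (e.g., steering/throttle/brake for the CARLA vehicle interface).
        return self.il_policies[signal](obs), signal
```

Training the high-level selector would then follow the standard DQN recipe (replay buffer, target network, TD loss), with the abstract's proposed reward function supplying the learning signal for control-signal selection; the low-level IL policies stay frozen after pre-training.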