A Hierarchical Autonomous Driving Framework Combining Reinforcement Learning and Imitation Learning

Zeyu Li
{"title":"结合强化学习和模仿学习的分层自动驾驶框架","authors":"Zeyu Li","doi":"10.1109/ICCEA53728.2021.00084","DOIUrl":null,"url":null,"abstract":"Autonomous driving technology aims to make driving decisions based on information about the vehicle’s environment. Navigation-based autonomous driving in urban scenarios has more complex scenarios than in relatively simple scenarios such as highways and parking lots, and is a task that still needs to be explored over time. Imitation learning models based on supervised learning methods are limited by the amount of expert data collected. Models based on reinforcement learning methods are able to interact with the environment, but are data inefficient and require a lot of exploration to learn effective policy. We propose a method that combines imitation learning with reinforcement learning enabling agent to achieve a higher success rate in urban autonomous driving navigation scenarios. To solve the problem of inefficient reinforcement learning data, our method decomposes the action space into low-level action space and high-level actin space, where low-level action space is multiple pre-trained imitation learning action space is a combination of several pre-trained imitation learning action spaces based on different control signals (i.e., follow, straight, turn right, turn left). High-level action space includes different control signals, the agent executes a specific imitation learning policy by selecting control signals from the high-level action space through a DQN-based reinforcement learning approach. Moreover, we propose a new reward for high level action selection. Experiments on the CARLA driving benchmark demonstrate that our approach outperforms both imitation learning methods and reinforcement learning methods on a variety of navigation-based driving tasks.","PeriodicalId":325790,"journal":{"name":"2021 International Conference on Computer Engineering and Application (ICCEA)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"A Hierarchical Autonomous Driving Framework Combining Reinforcement Learning and Imitation Learning\",\"authors\":\"Zeyu Li\",\"doi\":\"10.1109/ICCEA53728.2021.00084\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Autonomous driving technology aims to make driving decisions based on information about the vehicle’s environment. Navigation-based autonomous driving in urban scenarios has more complex scenarios than in relatively simple scenarios such as highways and parking lots, and is a task that still needs to be explored over time. Imitation learning models based on supervised learning methods are limited by the amount of expert data collected. Models based on reinforcement learning methods are able to interact with the environment, but are data inefficient and require a lot of exploration to learn effective policy. We propose a method that combines imitation learning with reinforcement learning enabling agent to achieve a higher success rate in urban autonomous driving navigation scenarios. 
To solve the problem of inefficient reinforcement learning data, our method decomposes the action space into low-level action space and high-level actin space, where low-level action space is multiple pre-trained imitation learning action space is a combination of several pre-trained imitation learning action spaces based on different control signals (i.e., follow, straight, turn right, turn left). High-level action space includes different control signals, the agent executes a specific imitation learning policy by selecting control signals from the high-level action space through a DQN-based reinforcement learning approach. Moreover, we propose a new reward for high level action selection. Experiments on the CARLA driving benchmark demonstrate that our approach outperforms both imitation learning methods and reinforcement learning methods on a variety of navigation-based driving tasks.\",\"PeriodicalId\":325790,\"journal\":{\"name\":\"2021 International Conference on Computer Engineering and Application (ICCEA)\",\"volume\":\"49 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 International Conference on Computer Engineering and Application (ICCEA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCEA53728.2021.00084\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Computer Engineering and Application (ICCEA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCEA53728.2021.00084","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4

Abstract

Autonomous driving technology aims to make driving decisions based on information about the vehicle's environment. Navigation-based autonomous driving in urban environments involves far more complex scenarios than relatively simple settings such as highways and parking lots, and remains a task that requires further exploration. Imitation learning models based on supervised learning are limited by the amount of expert data that can be collected. Models based on reinforcement learning can interact with the environment, but are data-inefficient and require extensive exploration to learn an effective policy. We propose a method that combines imitation learning with reinforcement learning, enabling the agent to achieve a higher success rate in urban autonomous driving navigation scenarios. To address the data inefficiency of reinforcement learning, our method decomposes the action space into a low-level action space and a high-level action space. The low-level action space is a combination of several pre-trained imitation learning policies, each conditioned on a different control signal (i.e., follow, straight, turn right, turn left). The high-level action space consists of these control signals; the agent selects a control signal through a DQN-based reinforcement learning approach and executes the corresponding imitation learning policy. Moreover, we propose a new reward function for high-level action selection. Experiments on the CARLA driving benchmark demonstrate that our approach outperforms both imitation learning methods and reinforcement learning methods on a variety of navigation-based driving tasks.
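
To make the hierarchical decomposition concrete, the following is a minimal PyTorch sketch of the agent structure the abstract describes: a DQN-style high-level policy scores the four control signals, and the selected signal dispatches to a pre-trained imitation learning policy that produces the low-level driving action. The paper's implementation is not published here, so all class names, network sizes, and the epsilon-greedy selection are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch of the hierarchical agent described in the abstract.
# A DQN-style high-level policy picks one of four control signals; a
# pre-trained imitation learning (IL) policy conditioned on that signal
# outputs the low-level driving action. Names and sizes are hypothetical.

import random
import torch
import torch.nn as nn

CONTROL_SIGNALS = ["follow", "straight", "turn_right", "turn_left"]


class HighLevelQNetwork(nn.Module):
    """Q-network over the discrete high-level (control-signal) action space."""

    def __init__(self, obs_dim: int, num_signals: int = len(CONTROL_SIGNALS)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, num_signals),  # one Q-value per control signal
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


class HierarchicalAgent:
    """Selects a control signal with epsilon-greedy DQN, then delegates to
    the pre-trained IL policy associated with that signal."""

    def __init__(self, q_net: HighLevelQNetwork, il_policies: dict,
                 epsilon: float = 0.1):
        self.q_net = q_net
        self.il_policies = il_policies  # maps signal name -> pre-trained IL policy
        self.epsilon = epsilon

    def act(self, obs: torch.Tensor):
        # High level: choose a control signal (epsilon-greedy over Q-values).
        if random.random() < self.epsilon:
            signal = random.choice(CONTROL_SIGNALS)
        else:
            with torch.no_grad():
                signal = CONTROL_SIGNALS[self.q_net(obs).argmax().item()]
        # Low level: the IL policy for that signal outputs the driving command
        # (e.g., steering/throttle/brake for the CARLA vehicle interface).
        return self.il_policies[signal](obs), signal
```

Training the high-level selector would then follow the standard DQN recipe (replay buffer, target network, TD loss), with the abstract's proposed reward function supplying the learning signal for control-signal selection; the low-level IL policies stay frozen after pre-training.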