{"title":"A Hierarchical Autonomous Driving Framework Combining Reinforcement Learning and Imitation Learning","authors":"Zeyu Li","doi":"10.1109/ICCEA53728.2021.00084","DOIUrl":null,"url":null,"abstract":"Autonomous driving technology aims to make driving decisions based on information about the vehicle’s environment. Navigation-based autonomous driving in urban scenarios has more complex scenarios than in relatively simple scenarios such as highways and parking lots, and is a task that still needs to be explored over time. Imitation learning models based on supervised learning methods are limited by the amount of expert data collected. Models based on reinforcement learning methods are able to interact with the environment, but are data inefficient and require a lot of exploration to learn effective policy. We propose a method that combines imitation learning with reinforcement learning enabling agent to achieve a higher success rate in urban autonomous driving navigation scenarios. To solve the problem of inefficient reinforcement learning data, our method decomposes the action space into low-level action space and high-level actin space, where low-level action space is multiple pre-trained imitation learning action space is a combination of several pre-trained imitation learning action spaces based on different control signals (i.e., follow, straight, turn right, turn left). High-level action space includes different control signals, the agent executes a specific imitation learning policy by selecting control signals from the high-level action space through a DQN-based reinforcement learning approach. Moreover, we propose a new reward for high level action selection. Experiments on the CARLA driving benchmark demonstrate that our approach outperforms both imitation learning methods and reinforcement learning methods on a variety of navigation-based driving tasks.","PeriodicalId":325790,"journal":{"name":"2021 International Conference on Computer Engineering and Application (ICCEA)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Computer Engineering and Application (ICCEA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCEA53728.2021.00084","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4
Abstract
Autonomous driving technology aims to make driving decisions based on information about the vehicle's environment. Navigation-based autonomous driving in urban scenarios is more complex than in relatively simple settings such as highways and parking lots, and remains an open research problem. Imitation learning models based on supervised learning are limited by the amount of expert data collected. Models based on reinforcement learning can interact with the environment, but they are data-inefficient and require extensive exploration to learn an effective policy. We propose a method that combines imitation learning with reinforcement learning, enabling the agent to achieve a higher success rate in urban autonomous driving navigation scenarios. To address the data inefficiency of reinforcement learning, our method decomposes the action space into a low-level action space and a high-level action space: the low-level action space is a combination of several pre-trained imitation learning policies, each conditioned on a different control signal (i.e., follow, straight, turn right, turn left), while the high-level action space consists of these control signals. The agent executes a specific imitation learning policy by selecting a control signal from the high-level action space through a DQN-based reinforcement learning approach. Moreover, we propose a new reward for high-level action selection. Experiments on the CARLA driving benchmark demonstrate that our approach outperforms both imitation learning and reinforcement learning methods on a variety of navigation-based driving tasks.
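To make the hierarchical decomposition concrete, below is a minimal sketch of the general pattern the abstract describes: a DQN head scores the four high-level control signals, and the selected signal dispatches to a frozen, pre-trained imitation-learning sub-policy that emits the low-level vehicle command. All class names, network sizes, and the 32-dimensional observation encoding are illustrative assumptions, not the authors' implementation; the paper's actual networks, reward, and training procedure are not reproduced here.

```python
import torch
import torch.nn as nn

# High-level control signals named in the abstract.
HIGH_LEVEL_ACTIONS = ["follow", "straight", "turn_right", "turn_left"]


class QNetwork(nn.Module):
    """Hypothetical DQN head scoring each high-level control signal
    from an encoded observation of the driving scene."""

    def __init__(self, obs_dim: int, n_actions: int = len(HIGH_LEVEL_ACTIONS)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128),
            nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)  # one Q-value per high-level action


class ImitationPolicy(nn.Module):
    """Stand-in for one pre-trained imitation-learning sub-policy.
    In the paper each sub-policy is trained on expert data for a
    specific control signal; here it is an untrained placeholder
    mapping the observation to (steer, throttle, brake)."""

    def __init__(self, obs_dim: int, act_dim: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128),
            nn.ReLU(),
            nn.Linear(128, act_dim),
            nn.Tanh(),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


class HierarchicalDriver:
    """Dispatcher: the DQN picks a control signal, and the matching
    frozen imitation policy produces the low-level vehicle command."""

    def __init__(self, obs_dim: int):
        self.q_net = QNetwork(obs_dim)
        self.sub_policies = {
            name: ImitationPolicy(obs_dim) for name in HIGH_LEVEL_ACTIONS
        }

    @torch.no_grad()
    def act(self, obs: torch.Tensor) -> tuple[str, torch.Tensor]:
        q_values = self.q_net(obs)  # greedy selection; training would use epsilon-greedy
        signal = HIGH_LEVEL_ACTIONS[int(q_values.argmax())]
        control = self.sub_policies[signal](obs)
        return signal, control


if __name__ == "__main__":
    driver = HierarchicalDriver(obs_dim=32)
    observation = torch.randn(32)  # stand-in for an encoded CARLA frame
    signal, control = driver.act(observation)
    print(f"high-level signal: {signal}, low-level command: {control.tolist()}")
```

Note the design implication the abstract relies on: because the DQN chooses only among four discrete signals rather than continuous steering commands, its exploration space is drastically smaller, which is how the method mitigates the data inefficiency of end-to-end reinforcement learning.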