Reinforcement Learning Driving Strategy based on Auxiliary Task for Multi-Scenarios Autonomous Driving

Jingbo Sun, Xing Fang, Qichao Zhang
Published in: 2023 IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS)
Publication date: 2023-05-12
DOI: 10.1109/DDCLS58216.2023.10166271
Citations: 1

Abstract

Reinforcement learning (RL) has made great progress in autonomous driving applications. However, using a single RL-based driving policy across multiple driving scenarios remains challenging: different scenarios involve different observations and reward measurements, and autonomous driving also faces the problem of multi-source heterogeneous observations. To address these problems, we propose a reinforcement learning framework based on an auxiliary task. First, we design a reward function that enables vehicles to learn safe and efficient strategies. Further, an auxiliary task is designed to learn the characteristics of different scenarios, so that the ego agent can adopt different strategies in different scenarios. Finally, to handle driving in multiple scenarios, we propose a representation network based on multi-layer perceptron (MLP), convolutional neural network (CNN), and Transformer networks to learn from the multi-source heterogeneous observation, which consists of the ego vehicle state, the bird's-eye-view (BEV) state, and neighbouring vehicle states. Experiments show that our method achieves a higher success rate than a popular reinforcement learning algorithm.
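The abstract does not specify the representation network's layers or dimensions. As a rough illustrative sketch only (all shapes, weight matrices, and function names here are invented, not taken from the paper), the fusion idea — an MLP branch for the ego state, a convolutional branch for the BEV grid, and an attention branch over neighbour vehicle states, concatenated into one joint representation — might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_encode(ego_state, w1, w2):
    # Two-layer MLP with ReLU for the ego vehicle state vector.
    h = np.maximum(ego_state @ w1, 0.0)
    return h @ w2

def cnn_encode(bev, kernel):
    # Single valid-mode 2D convolution over the BEV grid,
    # followed by global average pooling to one scalar feature.
    kh, kw = kernel.shape
    H, W = bev.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(bev[i:i + kh, j:j + kw] * kernel)
    return np.array([out.mean()])

def attend_neighbors(neighbors, wq, wk, wv):
    # Single-head self-attention over neighbour vehicle states
    # (a Transformer building block), mean-pooled to one vector.
    q, k, v = neighbors @ wq, neighbors @ wk, neighbors @ wv
    scores = q @ k.T / np.sqrt(k.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights @ v).mean(axis=0)

# Toy multi-source heterogeneous observation (dimensions arbitrary).
ego = rng.normal(size=5)                 # ego vehicle state
bev = rng.normal(size=(16, 16))          # bird's-eye-view grid
neighbors = rng.normal(size=(4, 6))      # 4 neighbour vehicles, 6 features each

ego_feat = mlp_encode(ego, rng.normal(size=(5, 16)), rng.normal(size=(16, 8)))
bev_feat = cnn_encode(bev, rng.normal(size=(3, 3)))
nbr_feat = attend_neighbors(neighbors, *(rng.normal(size=(6, 8)) for _ in range(3)))

# Concatenate branch outputs into one joint representation for the policy.
fused = np.concatenate([ego_feat, bev_feat, nbr_feat])
print(fused.shape)  # (17,)
```

In the paper's framework this fused vector would feed both the driving policy and the auxiliary scenario-classification head; here it simply demonstrates how three heterogeneous inputs can be encoded by different network types and merged.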