{"title":"Behavioral-Adaptive Deep Q-Network for Autonomous Driving Decisions in Heavy Traffic","authors":"Zhicheng Liu, Hong Yu","doi":"10.1177/03611981241262314","DOIUrl":null,"url":null,"abstract":"Deep reinforcement learning (DRL) is confronted with the significant problem of sparse rewards for autonomous driving in heavy traffic because of the dynamic and diverse nature of the driving environment as well as the complexity of the driving task. To mitigate the impact of sparse rewards on the convergence process of DRL, this paper proposes a novel behavioral-adaptive deep Q-network (BaDQN) for autonomous driving decisions in heavy traffic. BaDQN applies the idea of task decomposition to the DRL process. To break down the complexity of the driving task and achieve shorter exploration paths, BaDQN divides the driving task into three subtasks: Lane-Changing, Posture-Adjustment, and Wheel-Holding. BaDQN uses the finite state machine (FSM) to model the collaborative relationship between different subtasks, and abstracts each subtask separately using the Markov decision process (MDP). We used the Carla simulator to conduct experiments in a specific heavy traffic scenario. Compared with previous methods, BaDQN achieves a longer safe driving distance and a higher success rate. To discuss the adaptability of BaDQN to changes in traffic density and traffic velocity, we also conducted two extended experiments, which fully demonstrated the performance stability of BaDQN.","PeriodicalId":309251,"journal":{"name":"Transportation Research Record: Journal of the Transportation Research Board","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Transportation Research Record: Journal of the Transportation Research Board","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1177/03611981241262314","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Deep reinforcement learning (DRL) faces a significant sparse-reward problem in autonomous driving in heavy traffic, owing to the dynamic and diverse nature of the driving environment and the complexity of the driving task. To mitigate the impact of sparse rewards on DRL convergence, this paper proposes a novel behavioral-adaptive deep Q-network (BaDQN) for autonomous driving decisions in heavy traffic. BaDQN applies the idea of task decomposition to the DRL process: to reduce the complexity of the driving task and shorten exploration paths, it divides driving into three subtasks: Lane-Changing, Posture-Adjustment, and Wheel-Holding. BaDQN uses a finite state machine (FSM) to model the collaborative relationship between the subtasks and models each subtask as a separate Markov decision process (MDP). We used the Carla simulator to conduct experiments in a specific heavy-traffic scenario. Compared with previous methods, BaDQN achieves a longer safe driving distance and a higher success rate. To assess the adaptability of BaDQN to changes in traffic density and traffic velocity, we also conducted two extended experiments, both of which demonstrated the stability of BaDQN's performance.
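To make the decomposition concrete, below is a minimal sketch of the FSM-coordinated subtask idea the abstract describes: an FSM selects which subtask is active, and each subtask has its own Q-network over its own action set. All class names, state features, and transition rules here are illustrative assumptions, not the paper's actual implementation.

```python
# A minimal sketch of BaDQN-style task decomposition. Everything below
# (network sizes, driving flags, transition rules) is assumed for
# illustration; the paper's actual design may differ.
from enum import Enum, auto

import torch
import torch.nn as nn


class Subtask(Enum):
    LANE_CHANGING = auto()
    POSTURE_ADJUSTMENT = auto()
    WHEEL_HOLDING = auto()


class SubtaskDQN(nn.Module):
    """One Q-network per subtask, each over its own (assumed) action set."""

    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


def fsm_transition(current: Subtask, lane_clear: bool, aligned: bool) -> Subtask:
    """Hypothetical FSM: pick the active subtask from simple driving flags."""
    if current is Subtask.WHEEL_HOLDING and lane_clear:
        return Subtask.LANE_CHANGING          # a gap opened: start the change
    if current is Subtask.LANE_CHANGING and not aligned:
        return Subtask.POSTURE_ADJUSTMENT     # in the new lane: straighten up
    if current is Subtask.POSTURE_ADJUSTMENT and aligned:
        return Subtask.WHEEL_HOLDING          # stable again: hold the wheel
    return current


# One greedy decision step: the FSM selects the subtask, and that subtask's
# DQN selects the low-level action from its own action space.
obs_dim, n_actions = 16, 5
policies = {task: SubtaskDQN(obs_dim, n_actions) for task in Subtask}

obs = torch.randn(1, obs_dim)                 # placeholder observation
task = fsm_transition(Subtask.WHEEL_HOLDING, lane_clear=True, aligned=True)
action = policies[task](obs).argmax(dim=1).item()
print(task.name, action)
```

Because each subtask is abstracted as its own MDP, each Q-network only has to explore a small action space within a narrow behavioral context, which is one way the decomposition can shorten exploration paths under sparse rewards.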