Traffic signal control in mixed traffic environment based on advance decision and reinforcement learning

IF 2.7 · CAS Zone 4 (Engineering & Technology) · JCR Q2 (Transportation Science & Technology)
Yu Du, W. Shangguan, Linguo Chai
{"title":"Traffic signal control in mixed traffic environment based on advance decision and reinforcement learning","authors":"Yu Du, W. Shangguan, Linguo Chai","doi":"10.1093/tse/tdac027","DOIUrl":null,"url":null,"abstract":"\n Reinforcement learning-based traffic signal control systems (RLTSC) can enhance dynamic adaptability, save vehicle travelling time and promote intersection capacity. However, the existing RLTSC methods do not consider the driver's response time requirement, so the systems often face efficiency limitations and implementation difficulties. We propose the advance decision-making reinforcement learning traffic signal control (AD-RLTSC) algorithm to improve traffic efficiency while ensuring safety in mixed traffic environment. First, the relationship between the intersection perception range and the signal control period is established and the trust region state (TRS) is proposed. Then, the scalable state matrix is dynamically adjusted to decide the future signal light status. The decision will be displayed to the human-driven vehicles (HDVs) through the bi-countdown timer mechanism and sent to the nearby connected automated vehicles (CAVs) using the wireless network rather than be executed immediately. HDVs and CAVs optimize the driving speed based on the remaining green (or red) time. Besides, the Double Dueling Deep Q-learning Network algorithm is used for reinforcement learning training; a standardized reward is proposed to enhance the performance of intersection control and prioritized experience replay is adopted to improve sample utilization. The experimental results on vehicle micro-behaviour and traffic macro-efficiency showed that the proposed AD-RLTSC algorithm can simultaneously improve both traffic efficiency and traffic flow stability.","PeriodicalId":52804,"journal":{"name":"Transportation Safety and Environment","volume":" ","pages":""},"PeriodicalIF":2.7000,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Transportation Safety and Environment","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1093/tse/tdac027","RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"TRANSPORTATION SCIENCE & TECHNOLOGY","Score":null,"Total":0}
引用次数: 1

Abstract

Reinforcement learning-based traffic signal control systems (RLTSC) can enhance dynamic adaptability, save vehicle travelling time and increase intersection capacity. However, existing RLTSC methods do not consider the driver's response-time requirement, so such systems often face efficiency limitations and implementation difficulties. We propose the advance decision-making reinforcement learning traffic signal control (AD-RLTSC) algorithm to improve traffic efficiency while ensuring safety in a mixed traffic environment. First, the relationship between the intersection perception range and the signal control period is established and the trust region state (TRS) is proposed. Then, the scalable state matrix is dynamically adjusted to decide the future signal light status. The decision is displayed to human-driven vehicles (HDVs) through a bi-countdown timer mechanism and sent to nearby connected automated vehicles (CAVs) over the wireless network, rather than being executed immediately. HDVs and CAVs optimize their driving speed based on the remaining green (or red) time. In addition, the Double Dueling Deep Q-learning Network algorithm is used for reinforcement learning training; a standardized reward is proposed to enhance intersection control performance, and prioritized experience replay is adopted to improve sample utilization. Experimental results on vehicle micro-behaviour and traffic macro-efficiency show that the proposed AD-RLTSC algorithm improves both traffic efficiency and traffic flow stability.
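The abstract names a Double Dueling Deep Q-learning Network as the training backbone. The sketch below is a minimal illustration of that standard architecture family, not the authors' implementation: the class name, layer sizes and the `double_q_target` helper are assumptions made for the example, and the state/action dimensions are placeholders rather than values from the paper. Only the dueling value/advantage decomposition and the double-Q target rule follow the technique the abstract refers to.

```python
# Minimal sketch of a dueling Q-network and a double-Q target (PyTorch).
# Names and sizes are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class DuelingQNetwork(nn.Module):
    def __init__(self, state_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        # Two streams: state value V(s) and per-action advantage A(s, a)
        self.value = nn.Linear(hidden, 1)
        self.advantage = nn.Linear(hidden, num_actions)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.feature(state)
        v = self.value(h)                      # (batch, 1)
        a = self.advantage(h)                  # (batch, num_actions)
        # Subtracting the mean advantage keeps the Q-values identifiable
        return v + a - a.mean(dim=1, keepdim=True)

def double_q_target(online, target, reward, next_state, done, gamma=0.99):
    """Double Q-learning target: the online network selects the next action,
    the target network evaluates it, reducing overestimation bias."""
    with torch.no_grad():
        next_action = online(next_state).argmax(dim=1, keepdim=True)
        next_q = target(next_state).gather(1, next_action).squeeze(1)
        return reward + gamma * (1.0 - done) * next_q
```

In a training loop of this kind, transitions would be drawn from a prioritized replay buffer (as the abstract states) and the temporal-difference error against `double_q_target` would both drive the gradient update and refresh each sample's priority.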
Source journal: Transportation Safety and Environment (Transportation Science & Technology)
CiteScore: 3.90 · Self-citation rate: 13.60% · Articles published: 32 · Review time: 10 weeks