Bridging phase and timing: A joint Q-value learning framework for synergistic traffic signal control at consecutive arterial road intersections

IF 3.1; Zone 3 (Physics and Astrophysics); Q2 (PHYSICS, MULTIDISCIPLINARY)

Physica A: Statistical Mechanics and its Applications; Pub Date: 2026-04-15; Epub Date: 2026-02-25; DOI: 10.1016/j.physa.2026.131421

Haoran Liu, Xiaohu Li, Luodan Zhang, Rongjun Cheng

Volume 688, Article 131421. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0378437126001573

Citations: 0

Abstract

Reinforcement Learning (RL) is a highly effective traffic control method and shows strong potential. However, existing reinforcement learning-based methods for traffic signal control are insufficient in state modeling and cannot distinguish between the direct impact of queued vehicles on the intersection and the dynamic impact of moving vehicles. Additionally, they lack the ability to capture the deep synergistic relationship between signal phases and dynamic phase durations, which often leads to suboptimal decisions. This paper proposes an RL traffic signal control method based on joint Q-values, employing a detailed and predictive state representation that fully considers the different effects of queued and moving vehicles on intersection congestion. By combining vehicle speed and position features, the dynamic influence of moving vehicles on future traffic pressure is quantified. Meanwhile, the model effectively integrates lane and phase features using a multi-head attention mechanism, automatically capturing the conflict and cooperation relationships among different traffic flows. On this basis, a joint Q-value learning framework is adopted, treating signal phase selection and dynamic duration decisions as a complete decision unit for joint optimization. This directly learns the synergy between the two, thereby avoiding the problem of suboptimal decisions where efficient phases are chosen but with insufficient timing. Comprehensive experimental results in both real and synthetic scenarios show that our method achieves up to a 7.20% reduction in average travel time across various intersection settings, while also having a higher vehicle throughput. Moreover, the model converges to the optimal solution more quickly. This characteristic ensures excellent model performance while maintaining strong generalization capability.
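The abstract's central idea, treating phase selection and green duration as one decision unit, can be illustrated with a small sketch. This is not the authors' implementation: the phase names, the candidate durations, the toy Q-values, and both helper functions are assumptions made up for illustration. The sketch only contrasts a joint argmax over (phase, duration) pairs with a decoupled baseline that picks the phase first and the duration second.

```python
import numpy as np

# Hypothetical action space (not taken from the paper).
PHASES = ["NS_through", "NS_left", "EW_through", "EW_left"]
DURATIONS = [10, 15, 20, 25]  # candidate green times in seconds


def select_joint_action(q_values: np.ndarray) -> tuple[str, int]:
    """Pick the (phase, duration) pair with the highest joint Q-value."""
    assert q_values.shape == (len(PHASES), len(DURATIONS))
    flat_index = int(np.argmax(q_values))  # argmax over all pairs at once
    phase_idx, duration_idx = np.unravel_index(flat_index, q_values.shape)
    return PHASES[phase_idx], DURATIONS[duration_idx]


def select_decoupled_action(q_values: np.ndarray) -> tuple[str, int]:
    """Contrast baseline: choose the phase by its mean value over durations,
    then choose the best duration for that phase only."""
    phase_idx = int(np.argmax(q_values.mean(axis=1)))
    duration_idx = int(np.argmax(q_values[phase_idx]))
    return PHASES[phase_idx], DURATIONS[duration_idx]


# Toy Q-values: rows are phases, columns are durations (assumed numbers).
q = np.array([
    [1.0, 1.2, 1.1, 0.9],  # good on average at every duration
    [0.2, 0.3, 0.1, 0.0],
    [0.0, 0.1, 0.4, 2.0],  # only pays off with a long green
    [0.3, 0.2, 0.1, 0.0],
])

joint_choice = select_joint_action(q)
decoupled_choice = select_decoupled_action(q)
print(joint_choice, decoupled_choice)
```

In this toy table the decoupled baseline settles on the phase that looks best on average, while the joint selection finds the pair whose value depends on matching the phase with a sufficiently long green time, which is the kind of phase-timing synergy the abstract describes.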




Source journal

Physica A: Statistical Mechanics and its Applications (Physics: Multidisciplinary)

CiteScore

7.20

Self-citation rate

9.10%

Articles published

852

Review time

6.6 months

About the journal: Physica A: Statistical Mechanics and its Applications is recognized by the European Physical Society. Physica A publishes research in the field of statistical mechanics and its applications. Statistical mechanics sets out to explain the behaviour of macroscopic systems by studying the statistical properties of their microscopic constituents. Applications of the techniques of statistical mechanics are widespread, and include: applications to physical systems such as solids, liquids and gases; applications to chemical and biological systems (colloids, interfaces, complex fluids, polymers and biopolymers, cell physics); and other interdisciplinary applications, for instance to biological, economic and sociological systems.