Single-Agent Reinforcement Learning Model for Adaptive Traffic Signal Control in Urban Corridors

IF 1.8, Zone 4 (Engineering Technology), JCR Q2 (ENGINEERING, CIVIL)

Journal of Advanced Transportation, Pub Date: 2026-03-24, DOI: 10.1155/atr/5134018

Qiang Li, Xiya Zhuang, Lishan Liu, Bokui Chen, Yi Zhang, Yingping Zhao



Abstract

For multi-intersection control, existing research mainly adopts multiagent frameworks to tackle scalability issues. However, the traffic signal control (TSC) problem necessitates a single-agent framework, because a single control center monitors traffic conditions on all roads in the study area and coordinates the control of all intersections. This work proposes a novel single-agent, reinforcement learning (RL)-based adaptive traffic signal control (ATSC) model for urban corridors: it abandons complex multiagent coordination and uses a single agent to centrally orchestrate signal timings across multiple intersections. Notably, the model is highly applicable to real-world settings. It defines the state and reward functions on a queue length metric that correlates with congestion and can be reliably estimated from probe vehicle data. Because probe vehicle data have become highly prevalent, this feature enables rapid, large-scale deployment. The single-agent framework relies primarily on a distinctive design of state, action, and reward: to facilitate learning and manage congestion, both state and reward are based on queue length, and actions are designed to modulate queue dynamics. The queue length definition used in this study deviates slightly from conventional definitions but is closely correlated with congestion states. The method was evaluated on the SUMO simulation platform under various traffic patterns. Experimental results show that the proximal policy optimization (PPO) algorithm learns significantly faster than the deep Q-network (DQN) algorithm. The model effectively alleviates urban corridor congestion through coordinated multi-intersection control: over the simulation period of the entire scenario, the queue length did not exceed 50 vehicles, and instances where it exceeded 30 vehicles were relatively rare. Compared with the baseline scenario, in which queue lengths exceeded 150 vehicles, the proposed method significantly reduces road congestion. The work demonstrates the feasibility of controlling multiple intersections under a single-agent framework, and the control scope will be further expanded in future work.
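The single-agent formulation described in the abstract can be illustrated with a minimal toy sketch. It is not the authors' model: the class `CorridorEnv`, the two-approach intersection layout, the arrival probabilities, the discharge rate, and the placeholder rule `longest_queue_policy` are all hypothetical, and the queue length here is a plain vehicle count rather than the paper's modified definition. The sketch only shows the structure the abstract names: one agent observes the queue lengths of every intersection, picks one joint action for the whole corridor, and receives a reward that penalizes total queue length.

```python
import itertools
import random


class CorridorEnv:
    """Toy corridor (hypothetical): each intersection has a main and a cross approach."""

    def __init__(self, n_intersections=3, arrival_prob=(0.5, 0.3), discharge=2, seed=0):
        self.n = n_intersections
        self.arrival_prob = arrival_prob  # assumed per-step arrival probability (main, cross)
        self.discharge = discharge        # assumed vehicles released per step on green
        self.rng = random.Random(seed)
        # Joint action space: one phase per intersection (0 = main green, 1 = cross green).
        self.actions = list(itertools.product((0, 1), repeat=self.n))
        self.queues = [[0, 0] for _ in range(self.n)]

    def reset(self):
        self.queues = [[0, 0] for _ in range(self.n)]
        return self.state()

    def state(self):
        # Single-agent state: queue lengths of all approaches at all intersections.
        return tuple(q for pair in self.queues for q in pair)

    def step(self, action_index):
        phases = self.actions[action_index]
        for i, phase in enumerate(phases):
            # Random arrivals on both approaches.
            for approach in (0, 1):
                if self.rng.random() < self.arrival_prob[approach]:
                    self.queues[i][approach] += 1
            # Only the approach with green discharges vehicles.
            self.queues[i][phase] = max(0, self.queues[i][phase] - self.discharge)
        next_state = self.state()
        reward = -sum(next_state)  # shorter total queue means higher reward
        return next_state, reward


def longest_queue_policy(env, state):
    """Placeholder for a learned policy: give green to the longer queue."""
    phases = tuple(0 if state[2 * i] >= state[2 * i + 1] else 1 for i in range(env.n))
    return env.actions.index(phases)


if __name__ == "__main__":
    env = CorridorEnv()
    state = env.reset()
    total = 0
    for _ in range(100):
        state, reward = env.step(longest_queue_policy(env, state))
        total += reward
    print("final queues:", state, "cumulative reward:", total)
```

In this sketch the joint action space grows as 2^n with the number of intersections, which is the scalability concern the abstract attributes to multiagent work; a learned policy such as PPO or DQN would take the place of `longest_queue_policy`.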

Source journal

Journal of Advanced Transportation (Engineering Technology: Civil Engineering)

CiteScore: 5.00

Self-citation rate: 8.70%

Articles published: 466

Review time: 7.3 months

Journal description: The Journal of Advanced Transportation (JAT) is a fully peer-reviewed international journal in transportation research areas related to public transit, road traffic, transport networks, and air transport. It publishes theoretical and innovative papers on the analysis, design, operations, optimization, and planning of multi-modal transport networks, transit and traffic systems, transport technology, and traffic safety. Topics of interest include urban rail and bus systems, pedestrian studies, traffic flow theory and control, Intelligent Transport Systems (ITS), and automated and/or connected vehicles. Highway engineering, railway engineering, and logistics do not fall within the aims and scope of JAT.