MADDPG-GST for coordinated variable speed limit and ramp metering: A hybrid action deep reinforcement learning approach to bottleneck congestion mitigation
Tun Qiu, Pan Liu, Zhibin Li, Chengcheng Xu, Kailai Qiu, Shunchao Wang
Journal: Physica A: Statistical Mechanics and its Applications, Volume 680, Article 131045
Publication date: 2025-10-14
Publication type: Journal Article
DOI: 10.1016/j.physa.2025.131045
URL: https://www.sciencedirect.com/science/article/pii/S0378437125006971
Impact factor: 3.1 (JCR Q2, Physics, Multidisciplinary; category region 3, Physics and Astrophysics)
Open access: no
Citation count: 0
Abstract
Expressway merging bottlenecks are major sources of traffic congestion, where insufficient coordination among multiple traffic streams leads to severe flow disruptions. Although variable speed limits (VSL) and ramp metering (RM) are commonly used to mitigate congestion, their independent operation and mismatched control scopes often result in suboptimal outcomes. To address this problem, this study proposes a coordinated VSL–RM strategy based on multi-agent deep reinforcement learning. The control task is modeled as a Markov Decision Process (MDP), allowing joint policy learning between decentralized VSL and RM agents. A customized Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is employed to dynamically optimize coordination policies. To bridge the gap between the discrete VSL and continuous RM action spaces, the Gumbel-Softmax Trick (GST) is integrated into the learning process for differentiable hybrid action optimization. Additionally, a transfer learning mechanism is incorporated to ensure efficient policy adaptation across diverse traffic scenarios. Simulation results under varying demand levels show that the proposed strategy achieves 7.3%–34.1% improvements in traffic efficiency and stability compared to traditional methods. It also demonstrates strong transferability, reducing retraining time by up to 63.7% and traffic delays by up to 62.7%, while maintaining robust control under overspeed disturbances and control lag conditions.
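The hybrid action idea in the abstract (a discrete VSL choice relaxed with the Gumbel-Softmax Trick, paired with a continuous RM action) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the speed-limit set, the sigmoid squashing of the metering rate, the temperature value, and all function names below are assumptions made for illustration, since the abstract does not specify them. In an actual MADDPG setup the relaxed vector would be produced inside an automatic differentiation framework so that critic gradients can flow back through it; the plain-Python version here only shows the sampling step.

```python
import math
import random

# Hypothetical discrete speed-limit set (km/h); the abstract does not list the real values.
SPEED_LIMITS_KMH = [60, 80, 100, 120]


def gumbel_softmax(logits, temperature, rng):
    """Return a relaxed (soft) one-hot sample over the categories in `logits`."""
    noisy = []
    for logit in logits:
        # Keep u strictly inside (0, 1) so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))  # Gumbel(0, 1) sample
        noisy.append((logit + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_action(vsl_logits, rm_raw, temperature=1.0, rng=None):
    """Combine a relaxed discrete VSL choice with a continuous RM action.

    `vsl_logits` and `rm_raw` stand in for actor-network outputs (assumed names).
    """
    rng = rng or random.Random()
    soft_one_hot = gumbel_softmax(vsl_logits, temperature, rng)
    # Hard choice applied to the road: the category with the largest relaxed weight.
    index = max(range(len(soft_one_hot)), key=soft_one_hot.__getitem__)
    # Assumed parameterization: squash the raw RM output to a metering rate in (0, 1).
    metering_rate = 1.0 / (1.0 + math.exp(-rm_raw))
    return soft_one_hot, SPEED_LIMITS_KMH[index], metering_rate


rng = random.Random(0)
soft, speed_limit, metering_rate = hybrid_action(
    [0.2, 1.5, 0.3, -0.4], rm_raw=0.8, temperature=0.5, rng=rng
)
print(soft, speed_limit, metering_rate)
```

Lower temperatures push the relaxed vector toward a one-hot vector, at the cost of higher-variance gradients; higher temperatures give smoother but less decisive samples. How the paper schedules the temperature is not stated in the abstract.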
Journal description:
Physica A: Statistical Mechanics and its Applications
Recognized by the European Physical Society
Physica A publishes research in the field of statistical mechanics and its applications.
Statistical mechanics sets out to explain the behaviour of macroscopic systems by studying the statistical properties of their microscopic constituents.
Applications of the techniques of statistical mechanics are widespread and include: applications to physical systems such as solids, liquids and gases; applications to chemical and biological systems (colloids, interfaces, complex fluids, polymers and biopolymers, cell physics); and other interdisciplinary applications to, for instance, biological, economic and sociological systems.