Tun Qiu, Pan Liu, Zhibin Li, Chengcheng Xu, Kailai Qiu, Shunchao Wang
Physica A: Statistical Mechanics and its Applications, vol. 680, Article 131045. Published 2025-10-14. DOI: 10.1016/j.physa.2025.131045. Impact factor 3.1 (JCR Q2, Physics, Multidisciplinary). https://www.sciencedirect.com/science/article/pii/S0378437125006971
MADDPG-GST for coordinated variable speed limit and ramp metering: A hybrid action deep reinforcement learning approach to bottleneck congestion mitigation
Expressway merging bottlenecks are major sources of traffic congestion, where insufficient coordination among multiple traffic streams leads to severe flow disruptions. Although variable speed limits (VSL) and ramp metering (RM) are commonly used to mitigate congestion, their independent operation and mismatched control scopes often result in suboptimal outcomes. To address this problem, the study proposes a coordinated VSL–RM strategy based on multi-agent deep reinforcement learning. The control task is modeled as a Markov Decision Process (MDP), allowing joint policy learning between decentralized VSL and RM agents. A customized Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is employed to dynamically optimize coordination policies. To bridge the gap between discrete VSL and continuous RM action spaces, the Gumbel-Softmax Trick (GST) is integrated into the learning process for differentiable hybrid action optimization. Additionally, a transfer learning mechanism is incorporated to ensure efficient policy adaptation across diverse traffic scenarios. Simulation results under varying demand levels show that the proposed strategy achieves 7.3 %–34.1 % improvements in traffic efficiency and stability compared to traditional methods. It also demonstrates strong transferability, reducing retraining time by up to 63.7 % and traffic delays by up to 62.7 %, while maintaining robust control under overspeed disturbances and control lag conditions.
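The hybrid action idea in the abstract, a discrete VSL choice alongside a continuous RM rate, can be illustrated with a small forward-pass sketch of Gumbel-Softmax sampling. This is not the authors' implementation. The speed-limit set, the metering-rate bounds, the temperature and all function names are assumptions made for illustration, and the pure-Python code shows only the sampling step, not the gradient flow that a deep learning framework would provide.

```python
import math
import random

# Hypothetical control settings, chosen only for illustration.
SPEED_LIMITS_KMH = [40, 60, 80, 100]   # discrete VSL options
RM_MIN, RM_MAX = 200.0, 1800.0         # continuous RM range in veh/h


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample (a probability vector) over the logits."""
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))          # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_action(vsl_logits, rm_raw, temperature=0.5, rng=random):
    """Map raw actor outputs to (speed limit, metering rate, relaxed sample)."""
    probs = gumbel_softmax(vsl_logits, temperature, rng)
    # Discrete VSL action: the category with the largest relaxed weight.
    vsl_index = max(range(len(probs)), key=probs.__getitem__)
    # Continuous RM action: squash the raw output into [RM_MIN, RM_MAX]
    # with a sigmoid written via tanh to avoid overflow.
    squashed = 0.5 * (1.0 + math.tanh(0.5 * rm_raw))
    rm_rate = RM_MIN + (RM_MAX - RM_MIN) * squashed
    return SPEED_LIMITS_KMH[vsl_index], rm_rate, probs


if __name__ == "__main__":
    rng = random.Random(0)
    speed, rate, probs = hybrid_action([0.2, 1.5, 0.3, -0.4], 0.0, rng=rng)
    print(speed, round(rate, 1), [round(p, 3) for p in probs])
```

Lower temperatures push the relaxed sample toward a one-hot vector, while higher temperatures make it smoother. In a trainable setting, the relaxed vector is what would let gradients from a critic reach the logits of the discrete VSL action.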
About the journal:
Physica A: Statistical Mechanics and its Applications
Recognized by the European Physical Society
Physica A publishes research in the field of statistical mechanics and its applications.
Statistical mechanics sets out to explain the behaviour of macroscopic systems by studying the statistical properties of their microscopic constituents.
Applications of the techniques of statistical mechanics are widespread, and include: applications to physical systems such as solids, liquids and gases; applications to chemical and biological systems (colloids, interfaces, complex fluids, polymers and biopolymers, cell physics); and other interdisciplinary applications to, for instance, biological, economic and sociological systems.