Multi-Intersection Signal Control Based on Asynchronous Reinforcement Learning

IF 1.8 · CAS Zone 4 (Engineering & Technology) · JCR Q2 · ENGINEERING, CIVIL

Journal of Advanced Transportation · Pub Date: 2025-04-17 · DOI: 10.1155/atr/3890878

Jixiang Wang, Siqi Chen, Jing Wei, Boao Wang, Haiyang Yu


Citations: 0


Multi-Intersection Signal Control Based on Asynchronous Reinforcement Learning


State-of-the-art theoretical models and new traffic signal control technologies are key guarantees for improving the management and safety performance of transportation systems, and multiagent reinforcement learning (MARL) methods have been widely applied in the field of signal control. Researchers in the transportation domain have effectively addressed the issues of poor convergence and suboptimal optimization encountered in RL for multi-intersection signal control scenarios by adopting the centralized training with decentralized execution (CTDE) approach. However, due to the heterogeneity among intersections, simply decomposing the global reward into a sum of intersection-level rewards is unreasonable, posing a challenge in balancing the interests of individual intersections and the entire road network. Additionally, the assumption that all intersections within the system make decisions synchronously is rather strong. Therefore, this paper proposes a distributed traffic model tailored for synchronous decision-making and, based on that, introduces an asynchronous decision-making traffic model according to decoupled intersection control. Simulation experiments show that the asynchronous decision-making method proposed in this paper not only improves the model convergence speed by at least 19% compared to the multiagent deep RL (MADRL) algorithm used for synchronous decision-making, but also improves the model by at least 10.5% in vehicle driving speed, maximum queue length, and average queue length within the decodable range (the traffic density is between 100 vehicles/km and 400 vehicles/km). In the same traffic scenario, the MADRL algorithm used for asynchronous decision-making has improved the average vehicle delay and average queue length by at least 55% compared to traditional arterial green wave control methods and adaptive control methods, and by at least 5% compared to SAC and A2C methods.
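
The distinction the abstract draws between synchronous and asynchronous decision-making can be illustrated with a minimal scheduling sketch. This is not the paper's model: it assumes, purely for illustration, that each intersection's chosen phase lasts a fixed number of seconds, and the function names, intersection ids, and durations are all hypothetical. Under the synchronous assumption every intersection decides on a shared global step; in the asynchronous variant each intersection decides again only when its own phase ends.

```python
import heapq


def async_decision_times(action_durations, horizon):
    """Return (time, intersection) decision events, each intersection on its own clock.

    action_durations maps an intersection id to the seconds its phase lasts
    (hypothetical fixed values; a learned policy would choose these per decision).
    """
    # Every intersection makes its first decision at t = 0.
    queue = [(0, name) for name in sorted(action_durations)]
    heapq.heapify(queue)
    events = []
    while queue:
        t, name = heapq.heappop(queue)
        if t >= horizon:
            continue  # past the simulated horizon: drop this agent's next decision
        events.append((t, name))
        # The agent acts again only when its own phase ends, not on a global tick.
        heapq.heappush(queue, (t + action_durations[name], name))
    return events


def sync_decision_times(names, step, horizon):
    """Baseline: all intersections decide together on a shared global step."""
    return [(t, name) for t in range(0, horizon, step) for name in sorted(names)]


durations = {"A": 10, "B": 15}  # hypothetical phase lengths in seconds
async_events = async_decision_times(durations, horizon=30)
sync_events = sync_decision_times(durations, step=10, horizon=30)
print(async_events)
print(sync_events)
```

In the asynchronous schedule, intersection B decides at t = 15 while A is mid-phase, so a learner would need to handle transitions of different lengths per agent rather than a single joint step.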


Source journal

Journal of Advanced Transportation (Engineering & Technology: Civil Engineering)

CiteScore: 5.00

Self-citation rate: 8.70%

Articles published: 466

Review time: 7.3 months

About the journal: The Journal of Advanced Transportation (JAT) is a fully peer-reviewed international journal covering transportation research related to public transit, road traffic, transport networks, and air transport. It publishes theoretical and innovative papers on the analysis, design, operation, optimization, and planning of multi-modal transport networks, transit and traffic systems, transport technology, and traffic safety. Topics of interest include urban rail and bus systems, pedestrian studies, traffic flow theory and control, Intelligent Transport Systems (ITS), and automated and/or connected vehicles. Highway engineering, railway engineering, and logistics do not fall within the aims and scope of JAT.