Robust dynamic real-time control strategies for high-frequency bus service: a multi-agent reinforcement learning framework

IF 2.8 | Zone 3, Engineering & Technology | JCR Q3, TRANSPORTATION

Journal of Intelligent Transportation Systems Pub Date : 2026-01-02 Epub Date: 2024-11-10 DOI:10.1080/15472450.2024.2425293

Victor Jian Ming Low, Hooi Ling Khoo, Wooi Chen Khoo

Journal of Intelligent Transportation Systems, Volume 30, Issue 1, Pages 157-176. Journal Article. https://www.sciencedirect.com/org/science/article/pii/S1547245024000446

Citations: 0

Abstract

This study addresses the multifaceted challenge of ensuring the regularity of bus services, minimizing bus bunching, and facilitating synchronized bus connections across routes. An enhanced multi-agent reinforcement learning algorithm, based on the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, is proposed to implement real-time control strategies that address these issues simultaneously. The merit of the modified MADDPG algorithm lies in its ability to learn continuously while coping with the non-stationary operating nature of bus system networks. A case study of a bus corridor is used to train and test the algorithm. Four robustness scenarios, each with a different degree of travel time and dwell time variation, are designed to assess the algorithm’s robustness. Results indicate that the MADDPG algorithm can increase the likelihood of synchronized bus transfers across multiple routes two- to threefold while maintaining service reliability on each route. Moreover, the flexibility of the MADDPG algorithm in training bus policies allows it to adapt effectively to variations of up to 90% in bus travel times and to demand changes, even amid disruptive events in real-world scenarios.
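The abstract does not state how service regularity or bunching is measured, nor what reward the agents receive. As a minimal sketch only, assuming regularity is judged by the spread of headways at a stop and that each bus is rewarded for keeping equal gaps to its leader and follower, the quantities could look like this. The function names and the reward form are hypothetical illustrations, not the authors' formulation.

```python
from statistics import mean, pstdev


def headways(arrival_times):
    """Gaps between consecutive bus arrivals at one stop (same units as input)."""
    ordered = sorted(arrival_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def headway_cv(arrival_times):
    """Coefficient of variation of headways; 0 means perfectly even service."""
    gaps = headways(arrival_times)
    return pstdev(gaps) / mean(gaps)


def regularity_reward(forward_headway, backward_headway):
    """Hypothetical per-bus reward (an assumption, not taken from the paper).

    It is 0 when the bus sits midway between its leader and follower and
    becomes more negative as the two gaps diverge, i.e. as bunching develops.
    """
    total = forward_headway + backward_headway
    return -abs(forward_headway - backward_headway) / total


# Arrival times in minutes at a single stop.
even_service = [0, 5, 10, 15]
bunched_service = [0, 9, 10, 20]
```

In a multi-agent setting, each bus would receive its own `regularity_reward` at every control point, while a corridor-level measure such as `headway_cv` could serve for evaluation across scenarios.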

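Source journal

Journal of Intelligent Transportation Systems (Engineering & Technology: Transportation Science & Technology)

CiteScore: 8.80

Self-citation rate: 19.40%

Articles published: (no value given)

Review time: 15 months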

About the journal: The Journal of Intelligent Transportation Systems is devoted to scholarly research on the development, planning, management, operation and evaluation of intelligent transportation systems. Intelligent transportation systems are innovative solutions that address contemporary transportation problems. They are characterized by information, dynamic feedback and automation that allow people and goods to move efficiently. They encompass the full scope of information technologies used in transportation, including control, computation and communication, as well as the algorithms, databases, models and human interfaces. These technologies have only recently emerged as a new pathway for transportation. The Journal of Intelligent Transportation Systems is especially interested in research that leads to improved planning and operation of the transportation system through the application of new technologies. The journal is particularly interested in research that adds to the scientific understanding of the impacts that intelligent transportation systems can have on accessibility, congestion, pollution, safety, security, noise, and energy and resource consumption. The journal is interdisciplinary and accepts work from the fields of engineering, economics, planning, policy, business and management, as well as any other disciplines that contribute to the scientific understanding of intelligent transportation systems. The journal is also multimodal and accepts work on intelligent transportation for all forms of ground, air and water transportation. Example topics include the role of information systems in transportation, traffic flow and control, vehicle control, routing and scheduling, traveler response to dynamic information, planning for ITS innovations, evaluations of ITS field operational tests, ITS deployment experiences, automated highway systems, vehicle control systems, diffusion of ITS, and tools/software for analysis of ITS.