A heterogeneous agent reinforcement learning approach with curriculum learning for variable speed limit control
Zhaoqing Li, Silai Chen, Guosheng Xiao, Yangsheng Jiang, Zhihong Yao, Puxin Yang
Expert Systems with Applications, Volume 299, Article 129945. Published 2025-10-09. DOI: 10.1016/j.eswa.2025.129945
https://www.sciencedirect.com/science/article/pii/S0957417425035602
Abstract
Most existing Variable Speed Limit (VSL) studies employ rule-based control strategies, which often fail to accommodate diverse road characteristics and lack responsiveness to sudden congestion. These limitations are compounded by computational inefficiency, which hinders real-time decision-making in large-scale or complex traffic environments. To address these issues, this study proposes a VSL strategy based on Heterogeneous Agent Reinforcement Learning with Curriculum Learning (HARLCL). Specifically, a top-level agent classifies road segments using the Mini-Batch K-means algorithm, thereby capturing the heterogeneity of each segment. The lower-level agents operate with class-specific observation and action spaces: each agent observes and acts on a feature set tailored to its segment class, while all agents share the same reward design and learning architecture. The lower-level agents then adaptively adjust both the position and length of each controlled segment according to the classification results and real-time traffic conditions. Through a reward-driven mechanism, these agents continually refine their control precision. Moreover, curriculum learning is introduced during multi-agent training, accelerating convergence and reducing the computational burden typically encountered in large-scale reinforcement learning. Experiments were conducted in three task scenarios (typical fixed bottlenecks, random task bottlenecks, and multiple bottlenecks) using realistic road data. The results indicate that training time decreased by 43.18% and 47.35%; in the SUMO microscopic simulation, total travel time was 312.25 veh·h, 43.05% lower than in the NoVSL case, and safe braking distances also decreased, indicating improved safety and more stable traffic flow. Compared with conventional feedback control and deep reinforcement learning approaches, the HARLCL-based strategy demonstrates substantial advantages in training efficiency and control precision, offering a promising avenue for practical VSL implementation.
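The abstract names Mini-Batch K-means as the segment classifier but does not give the feature set or any implementation detail. As a rough illustration only, the following is a minimal sketch of generic mini-batch K-means (per-center running-mean updates) in plain Python. The segment features (lane count, mean speed in km/h), the values, the function names, and all parameters are assumptions made for this example, not the authors' design.

```python
import random


def nearest(point, centers):
    """Return the index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])),
    )


def mini_batch_kmeans(points, k, batch_size=4, iterations=200, seed=0):
    """Generic mini-batch K-means; returns the k cluster centers."""
    rng = random.Random(seed)
    # Initialize centers from k distinct data points.
    centers = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k  # number of samples each center has absorbed so far
    for _ in range(iterations):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch first, then update the centers.
        assignments = [nearest(p, centers) for p in batch]
        for p, c in zip(batch, assignments):
            counts[c] += 1
            eta = 1.0 / counts[c]  # per-center learning rate
            centers[c] = [(1 - eta) * ci + eta * pi for ci, pi in zip(centers[c], p)]
    return centers


# Hypothetical segment features: (number of lanes, mean speed in km/h).
segments = [(2, 60.0), (2, 62.0), (2, 58.0), (4, 100.0), (4, 105.0), (4, 98.0)]

centers = mini_batch_kmeans(segments, k=2)
labels = [nearest(s, centers) for s in segments]
print(labels)  # one class label per segment
```

In a setup like the one the abstract describes, each resulting class label would select which class-specific observation and action space a lower-level agent uses. Real features would normally be scaled before clustering so that no single dimension dominates the distance.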
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.