{"title":"Reinforcement Learning for Robust Advisories Under Driving Compliance Errors","authors":"Jeongyun Kim;Jung-Hoon Cho;Cathy Wu","doi":"10.1109/TITS.2025.3550418","DOIUrl":null,"url":null,"abstract":"There has been considerable interest in recent years regarding how a small fraction of autonomous vehicles (AVs) can mitigate traffic congestion. However, the reality of vehicle-based congestion mitigation remains elusive, due to challenges of cost, technology maturity, and regulation. As a result, recent works have investigated the necessity of autonomy by exploring driving advisory systems. Such early works have made simplifying assumptions such as perfect driver compliance. This work relaxes this assumption, focusing on compliance errors caused by physical limitations of human drivers, in particular, response delay and speed deviation. These compliance errors introduce significant unpredictability into traffic systems, complicating the design of real-time driving advisories aimed at stabilizing traffic flow. Our analysis reveals that performance degradation increases sharply under compliance errors, highlighting the associated difficulties. To address this challenge, we develop a reinforcement learning (RL) framework based on an action-persistent Markov decision process (MDP) combined with domain randomization, designed for robust coarse-grained driving policies. This approach allows driving policies to effectively manage the cumulative impacts of compliance errors by generating various scenarios and corresponding traffic conditions during training. We show that in comparison to prior RL-based work which did not consider compliance errors, our policies achieve up to 2.2 times improvement in average speed over non-robust training. In addition, analytical results validate the experiment results, highlighting the benefits of the proposed framework. 
Overall, this paper advocates the necessity of incorporating human driver compliance errors in the development of RL-based advisory systems, achieving more effective and resilient traffic management solutions.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 6","pages":"7780-7791"},"PeriodicalIF":8.4000,"publicationDate":"2025-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Intelligent Transportation Systems","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10974413/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, CIVIL","Score":null,"Total":0}
Citation count: 0
Abstract
There has been considerable interest in recent years in how a small fraction of autonomous vehicles (AVs) can mitigate traffic congestion. However, vehicle-based congestion mitigation remains elusive in practice, due to challenges of cost, technology maturity, and regulation. As a result, recent works have questioned the necessity of full autonomy by exploring driving advisory systems. These early works made simplifying assumptions such as perfect driver compliance. This work relaxes that assumption, focusing on compliance errors caused by the physical limitations of human drivers, in particular response delay and speed deviation. These compliance errors introduce significant unpredictability into traffic systems, complicating the design of real-time driving advisories aimed at stabilizing traffic flow. Our analysis reveals that performance degrades sharply under compliance errors, highlighting the associated difficulties. To address this challenge, we develop a reinforcement learning (RL) framework based on an action-persistent Markov decision process (MDP) combined with domain randomization, designed to produce robust coarse-grained driving policies. By generating varied compliance-error scenarios and the corresponding traffic conditions during training, this approach allows driving policies to effectively manage the cumulative impacts of compliance errors. We show that, in comparison to prior RL-based work that did not consider compliance errors, our policies achieve up to a 2.2-fold improvement in average speed over non-robust training. In addition, analytical results validate the experimental results, highlighting the benefits of the proposed framework. Overall, this paper advocates incorporating human driver compliance errors into the development of RL-based advisory systems, achieving more effective and resilient traffic management solutions.
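The abstract's core idea (training against randomized response delays and speed deviations so that the learned advisory policy is robust to imperfect compliance) can be illustrated with a minimal sketch. Everything below is a hypothetical toy environment, not the paper's simulator: the dynamics, the parameter ranges for delay and deviation, and the equilibrium speed of 12 m/s are all assumptions made purely for illustration.

```python
import random

class NoisyComplianceEnv:
    """Toy speed-advisory environment: a single driver tracks an advised
    speed but executes it with a randomized response delay (action
    persistence) and a multiplicative speed deviation. Parameter ranges
    are illustrative assumptions, not values from the paper."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def reset(self):
        # Domain randomization: resample the compliance-error parameters
        # at the start of every episode.
        self.delay = self.rng.randint(0, 3)          # steps before advice takes effect
        self.deviation = self.rng.uniform(0.8, 1.2)  # fraction of advised speed realized
        self.speed = 10.0                            # current speed (m/s)
        self.pending = []                            # delay buffer of queued advisories
        return self.speed

    def step(self, advised_speed):
        # Queue the advisory; the driver only reacts after `delay` steps.
        self.pending.append(advised_speed)
        if len(self.pending) > self.delay:
            target = self.pending.pop(0) * self.deviation
        else:
            target = self.speed  # driver holds current speed meanwhile
        # First-order lag toward the (imperfectly executed) target speed.
        self.speed += 0.5 * (target - self.speed)
        # Reward: penalize distance from a desired equilibrium speed.
        reward = -abs(self.speed - 12.0)
        return self.speed, reward

def rollout(env, policy, steps=50):
    """Run one episode and return the total reward."""
    speed = env.reset()
    total = 0.0
    for _ in range(steps):
        speed, reward = env.step(policy(speed))
        total += reward
    return total

# A fixed policy that always advises the equilibrium speed; repeated
# rollouts expose it to many sampled delay/deviation combinations.
env = NoisyComplianceEnv(seed=42)
returns = [rollout(env, lambda s: 12.0) for _ in range(5)]
```

In an actual RL setup, the policy would be trained (e.g. by a policy-gradient method) across these randomized episodes so that it performs well under the full distribution of compliance errors rather than under one fixed error profile.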
About the journal
The journal covers the theoretical, experimental, and operational aspects of electrical and electronics engineering and information technologies as applied to Intelligent Transportation Systems (ITS). Intelligent Transportation Systems are defined as those systems utilizing synergistic technologies and systems engineering concepts to develop and improve transportation systems of all kinds. The scope of this interdisciplinary activity includes the promotion, consolidation, and coordination of ITS technical activities among IEEE entities, and the provision of a focus for cooperative activities, both internally and externally.