Reinforcement Learning for Robust Advisories Under Driving Compliance Errors

IF 8.4 | JCR Q1 (ENGINEERING, CIVIL) | CAS Tier 1 (Engineering & Technology)
Jeongyun Kim;Jung-Hoon Cho;Cathy Wu
DOI: 10.1109/TITS.2025.3550418
Journal: IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 6, pp. 7780-7791
Published: 2025-04-22 (Journal Article)
Full text: https://ieeexplore.ieee.org/document/10974413/
Citations: 0

Abstract

There has been considerable interest in recent years regarding how a small fraction of autonomous vehicles (AVs) can mitigate traffic congestion. However, the reality of vehicle-based congestion mitigation remains elusive, due to challenges of cost, technology maturity, and regulation. As a result, recent works have investigated the necessity of autonomy by exploring driving advisory systems. Such early works have made simplifying assumptions such as perfect driver compliance. This work relaxes this assumption, focusing on compliance errors caused by physical limitations of human drivers, in particular, response delay and speed deviation. These compliance errors introduce significant unpredictability into traffic systems, complicating the design of real-time driving advisories aimed at stabilizing traffic flow. Our analysis reveals that performance degradation increases sharply under compliance errors, highlighting the associated difficulties. To address this challenge, we develop a reinforcement learning (RL) framework based on an action-persistent Markov decision process (MDP) combined with domain randomization, designed for robust coarse-grained driving policies. This approach allows driving policies to effectively manage the cumulative impacts of compliance errors by generating various scenarios and corresponding traffic conditions during training. We show that in comparison to prior RL-based work which did not consider compliance errors, our policies achieve up to 2.2 times improvement in average speed over non-robust training. In addition, analytical results validate the experiment results, highlighting the benefits of the proposed framework. Overall, this paper advocates the necessity of incorporating human driver compliance errors in the development of RL-based advisory systems, achieving more effective and resilient traffic management solutions.
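The abstract's central training idea — domain randomization over driver compliance errors, here response delay and speed deviation — can be illustrated with a toy simulation. The model below is a hypothetical sketch for intuition only: the function names, the delay/deviation parameterization, and all numeric values are illustrative assumptions, not the paper's simulator or training setup.

```python
import random

def simulate_compliance(advisory_speed, steps, delay_range, deviation_std, seed=0):
    """Toy model of a driver tracking a constant advisory speed under
    compliance errors: a randomly sampled response delay (steps before
    the driver reacts) and a per-step Gaussian deviation around the
    advised speed. Randomizing delay_range and deviation_std across
    training episodes is the domain-randomization idea in miniature."""
    rng = random.Random(seed)
    delay = rng.randint(*delay_range)  # sampled response delay, in steps
    speed = 0.0                        # driver starts at rest
    trace = []
    for t in range(steps):
        if t >= delay:
            # after the delay, the driver follows the advisory,
            # but with zero-mean Gaussian speed deviation
            speed = advisory_speed + rng.gauss(0.0, deviation_std)
        trace.append(speed)
    return trace

def average_speed(trace):
    """Mean realized speed over the episode."""
    return sum(trace) / len(trace)
```

With a zero delay and zero deviation (perfect compliance), the realized average equals the advisory; sampling a nonzero delay drags it below the advisory, which mirrors the performance gap the paper attributes to compliance errors. A robust policy would be trained over many such randomized draws rather than the perfect-compliance case alone.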
Source journal: IEEE Transactions on Intelligent Transportation Systems (Engineering: Electrical & Electronic)
CiteScore: 14.80
Self-citation rate: 12.90%
Articles per year: 1872
Review time: 7.5 months
Journal scope: The theoretical, experimental and operational aspects of electrical and electronics engineering and information technologies as applied to Intelligent Transportation Systems (ITS). Intelligent Transportation Systems are defined as those systems utilizing synergistic technologies and systems engineering concepts to develop and improve transportation systems of all kinds. The scope of this interdisciplinary activity includes the promotion, consolidation and coordination of ITS technical activities among IEEE entities, and providing a focus for cooperative activities, both internally and externally.