Intelligent Eco-Driving Control for Urban CAVs Using a Model-Based Controller Assisted Deep Reinforcement Learning

IF 8.4 | CAS Tier 1 (Engineering & Technology) | JCR Q1 (ENGINEERING, CIVIL)
Jie Li;Xiaodong Wu;Xianxu Bai;Yonggang Liu;Min Xu
DOI: 10.1109/TITS.2025.3559916
Journal: IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 6, pp. 7624-7639
Publication date: 2025-04-22 (Journal Article)
Open access: No
Citations: 0

Abstract

Eco-driving control for connected and automated vehicles (CAVs) aims to co-optimize energy efficiency, ride comfort, and travel time while adhering to safety regulations. Model-based eco-driving strategies have proven robust and effective in simplified traffic scenarios. However, their application to complex tasks incurs high computational costs because they rely on precise nonlinear models that accurately reflect real-world physical systems. Model-free deep reinforcement learning (DRL) methods show potential for handling the high-dimensional state/action spaces encountered in real-time CAV control. Nevertheless, they require extensive training data and time, and are susceptible to getting stuck in suboptimal solutions, especially in complex urban traffic scenarios. To leverage the advantages of both model-based controllers and DRL algorithms, this study develops a novel model-based controller online-assisted twin delayed deep deterministic policy gradient (MCOA-TD3) algorithm. The proposed algorithm integrates imitation learning into the vanilla TD3 agent: during training, the MCOA-TD3 agent can learn from demonstrations generated by a model predictive control (MPC)-based expert controller. The performance of the proposed strategy is evaluated through simulations in a dynamic traffic scenario replicating the test field of Hamburg, Germany. The results show that the proposed strategy improves energy efficiency and ride comfort while maintaining driving times comparable to the vanilla TD3 strategy. Notably, compared with the vanilla TD3 strategy, the proposed strategy demonstrates superior adaptability and online fine-tuning ability. These improvements make it more suitable for complex and dynamic real-world scenarios.
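The core idea the abstract describes, augmenting the TD3 actor objective with an imitation-learning term on demonstrations from a model-based expert, can be sketched as follows. This is a minimal toy illustration of that training signal, not the authors' implementation: the linear actor, the fixed quadratic stand-in critic, the synthetic "expert" demonstrations, and the `bc_weight` coefficient are all assumptions made here for the sketch.

```python
# Sketch of a combined actor loss: TD3 policy term + behavior-cloning term
# on (state, action) demonstrations from a model-based (e.g. MPC) expert.
# Toy stand-ins throughout; not the paper's MCOA-TD3 code.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3) * 0.1          # toy linear actor: a = w . s

def actor(s, w):
    return float(np.dot(w, s))

def critic_q(s, a):
    # Stand-in critic: in this toy, the best action is 0.5 for every state.
    return -(a - 0.5) ** 2

def combined_actor_loss(states, demo, w, bc_weight=1.0):
    # TD3 policy loss: maximize Q(s, pi(s)), i.e. minimize -Q.
    rl = -np.mean([critic_q(s, actor(s, w)) for s in states])
    # Imitation term: squared error against the expert's demonstrated actions.
    bc = np.mean([(actor(s, w) - a) ** 2 for s, a in demo])
    return rl + bc_weight * bc

states = [rng.normal(size=3) for _ in range(16)]
# Hypothetical expert demonstrations: here the "expert" also targets 0.5.
demo = [(s, 0.5) for s in states]

def step(w, lr=0.05, eps=1e-5):
    # Crude finite-difference gradient descent on the combined loss.
    g = np.zeros_like(w)
    for i in range(len(w)):
        wp, wm = w.copy(), w.copy()
        wp[i] += eps
        wm[i] -= eps
        g[i] = (combined_actor_loss(states, demo, wp)
                - combined_actor_loss(states, demo, wm)) / (2 * eps)
    return w - lr * g

loss0 = combined_actor_loss(states, demo, w)
for _ in range(50):
    w = step(w)
loss1 = combined_actor_loss(states, demo, w)
```

The design point is that the behavior-cloning term gives the agent a dense, model-informed learning signal early in training, which is how an expert controller can steer a DRL agent away from the suboptimal solutions the abstract warns about; in a real TD3 setup both terms would be backpropagated through a neural actor rather than approximated by finite differences.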
Source Journal
IEEE Transactions on Intelligent Transportation Systems (Engineering: Electrical & Electronic)
CiteScore: 14.80
Self-citation rate: 12.90%
Articles per year: 1872
Review time: 7.5 months
Aims and scope: The journal covers the theoretical, experimental, and operational aspects of electrical and electronics engineering and information technologies as applied to Intelligent Transportation Systems (ITS). Intelligent Transportation Systems are defined as those systems utilizing synergistic technologies and systems engineering concepts to develop and improve transportation systems of all kinds. The scope of this interdisciplinary activity includes the promotion, consolidation, and coordination of ITS technical activities among IEEE entities, and providing a focus for cooperative activities, both internally and externally.