Curiosity-driven reinforcement learning with graph transformers for decision-making in connected and autonomous vehicles

IF 7.6 · Zone 1 (Engineering & Technology) · JCR Q1 (Transportation Science & Technology)

Transportation Research Part C-Emerging Technologies · Pub Date: 2025-06-03 · DOI: 10.1016/j.trc.2025.105183

Qi Liu, Yujie Tang, Xueyuan Li, Kaifeng Wang, Fan Yang, Zirui Li

Volume 177, Article 105183 · Journal Article · Full text: https://www.sciencedirect.com/science/article/pii/S0968090X25001871

Citations: 0

Abstract


Cooperative decision-making for connected and autonomous vehicles (CAVs) in mixed-autonomy traffic is critical to the advancement of modern intelligent transportation systems. Graph reinforcement learning (GRL) approaches have recently shown remarkable success on decision-making problems by leveraging graph-based technologies. However, existing GRL-based research still struggles to generate accurate feature embeddings that enhance driving policies, to explore the driving environment thoroughly, and to train models efficiently. To address these challenges, this paper proposes a graph transformer reinforcement learning method with a distributional curiosity mechanism that improves feature generation efficiency and environment exploration, ultimately boosting the decision-making performance of CAVs. First, an improved transformed graph convolutional network (ITransGCN) is proposed; it integrates a graph convolutional network (GCN), rotary position encoding (RoPE), and a temporal prior attention mechanism to strengthen sequential modeling and thereby generate informative spatial–temporal feature embeddings. Then, a curiosity mechanism based on distributional random network distillation (DRND) is proposed to enhance the exploratory capabilities of CAVs in driving environments. Additionally, a temporal integrated deep reinforcement learning (TI-DRL) model is developed, incorporating an auxiliary loss that integrates spatial–temporal information to improve the model's ability to capture spatial–temporal dependencies. Finally, a cooperation-aware reward function is constructed to further evaluate the performance of CAVs. Comprehensive experiments across three representative traffic scenarios validate the proposed method. The results show that it outperforms the baselines in driving safety, efficiency, and model stability, highlighting the effectiveness of its core components and its generalization capability.
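The abstract names a curiosity mechanism built on random network distillation but does not describe its internals. As a rough illustration of the underlying idea only, the sketch below implements plain random network distillation (not the distributional DRND variant, and not the authors' code) in NumPy: a frozen, randomly initialized target network is imitated by a trainable predictor, and the predictor's error on a state serves as an exploration bonus. The network sizes, learning rate, and all function names (`init_mlp`, `intrinsic_reward`, `train_predictor_step`) are assumptions made for this example.

```python
import numpy as np

def init_mlp(in_dim, hidden, out_dim, rng):
    """Random two-layer MLP weights (hypothetical sizes, no biases)."""
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(in_dim), (in_dim, hidden)),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, out_dim)),
    }

def forward(params, x):
    """Return the network output and the hidden activations."""
    h = np.tanh(x @ params["W1"])
    return h @ params["W2"], h

def intrinsic_reward(target, predictor, x):
    """Per-state curiosity bonus: mean squared predictor error vs. the target."""
    t, _ = forward(target, x)
    p, _ = forward(predictor, x)
    return np.mean((p - t) ** 2, axis=-1)

def train_predictor_step(target, predictor, x, lr=0.1):
    """One full-batch gradient step on the predictor; the target stays frozen."""
    t, _ = forward(target, x)
    p, h = forward(predictor, x)
    err = p - t
    g_out = 2.0 * err / err.size            # d(mean squared error) / d(output)
    g_w2 = h.T @ g_out
    g_pre = (g_out @ predictor["W2"].T) * (1.0 - h ** 2)  # back through tanh
    g_w1 = x.T @ g_pre
    predictor["W1"] -= lr * g_w1
    predictor["W2"] -= lr * g_w2

# Usage: the bonus on repeatedly visited states shrinks as the predictor trains.
rng = np.random.default_rng(0)
target = init_mlp(8, 32, 16, rng)           # frozen random network
predictor = init_mlp(8, 32, 16, rng)        # trainable imitator
visited = rng.normal(size=(64, 8))          # stand-in for observed state features
bonus_before = intrinsic_reward(target, predictor, visited).mean()
for _ in range(300):
    train_predictor_step(target, predictor, visited)
bonus_after = intrinsic_reward(target, predictor, visited).mean()
print(bonus_before, bonus_after)
```

In a reinforcement learning loop, a bonus of this kind is typically scaled and added to the environment reward, so that states the predictor has not yet learned to imitate are more attractive to visit. How the paper's distributional variant shapes this bonus is not specified in the abstract.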


Source journal

Transportation Research Part C-Emerging Technologies (Engineering & Technology: Transportation Science & Technology)

CiteScore: 15.80

Self-citation rate: 12.00%

Articles published: 332

Review time: 64 days

About the journal: Transportation Research: Part C (TR_C) is dedicated to showcasing high-quality, scholarly research that delves into the development, applications, and implications of transportation systems and emerging technologies. Our focus lies not solely on individual technologies, but rather on their broader implications for the planning, design, operation, control, maintenance, and rehabilitation of transportation systems, services, and components. In essence, the intellectual core of the journal revolves around the transportation aspect rather than the technology itself. We actively encourage the integration of quantitative methods from diverse fields such as operations research, control systems, complex networks, computer science, and artificial intelligence. Join us in exploring the intersection of transportation systems and emerging technologies to drive innovation and progress in the field.