Performance comparison of explainable DQN and DDPG models for cooperative lane change decision-making in multi-intelligent industrial IoT vehicles

Impact Factor: 6.0 · JCR Q1 (Computer Science, Information Systems) · CAS Tier 3 (Computer Science)
Hao-bai ZHAN
{"title":"Performance comparison of explainable DQN and DDPG models for cooperative lane change decision-making in multi-intelligent industrial IoT vehicles","authors":"Hao-bai ZHAN","doi":"10.1016/j.iot.2025.101552","DOIUrl":null,"url":null,"abstract":"<div><div>With the rapid advancement of intelligent connected vehicles (ICVs) technology, efficient and safe vehicular lane-changing decisions have become a focal point of interest for intelligent transportation systems (ITS). This paper investigates the application of explainable artificial intelligence (XAI) techniques to deep reinforcement learning algorithms, specifically deep Q-networks (DQN) and deep deterministic policy gradient (DDPG), for lane-changing decisions in industrial internet of things (IIoT) vehicles. By integrating innovative reward functions, the study assesses the performance differences between these models under various traffic densities and ICV counts in a three-lane highway scenario. The use of XAI feature representations enhances the transparency and interpretability of the models, providing insights into the decision-making process. XAI helps to elucidate how the models arrive at their decisions, improving trust and reliability in automated systems. The research reveals that although the DQN model demonstrates initial superior performance in the early phases of experimentation, the DDPG model outperforms in crucial performance metrics such as average fleet speed, headway, and stability during later stages of training. The DDPG model maintains better control over fleet speed and vehicle spacing in both low-density and high-density traffic environments, showcasing its superior adaptability and efficiency. These findings highlight the DDPG model's enhanced capability to manage dynamic and complex driving environments, attributed to its refined policy learning approach which adeptly balances exploration and exploitation. The novel reward function significantly promotes cooperative lane-changing behaviors among ICVs, optimizing lane change decisions and improving overall traffic flow efficiency. This study not only provides valuable technical support for lane-changing decisions in smart vehicular networks but also lays a theoretical and empirical foundation for the advancement of future ITS. The insights gained from comparing DQN and DDPG models contribute to the ongoing discussion on effective deep learning strategies for real-world ITS applications, potentially guiding future developments in autonomous driving technologies.</div></div>","PeriodicalId":29968,"journal":{"name":"Internet of Things","volume":"31 ","pages":"Article 101552"},"PeriodicalIF":6.0000,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Internet of Things","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2542660525000654","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

With the rapid advancement of intelligent connected vehicle (ICV) technology, efficient and safe lane-changing decisions have become a focal point for intelligent transportation systems (ITS). This paper investigates the application of explainable artificial intelligence (XAI) techniques to deep reinforcement learning algorithms, specifically deep Q-networks (DQN) and deep deterministic policy gradient (DDPG), for lane-changing decisions in industrial Internet of Things (IIoT) vehicles. By integrating innovative reward functions, the study assesses the performance differences between these models under varying traffic densities and ICV counts in a three-lane highway scenario. The use of XAI feature representations enhances the transparency and interpretability of the models, clarifying how they arrive at their decisions and thereby improving trust in and the reliability of automated systems. The research reveals that although the DQN model performs better in the early phases of experimentation, the DDPG model surpasses it on key performance metrics such as average fleet speed, headway, and stability during later stages of training. The DDPG model maintains better control over fleet speed and vehicle spacing in both low-density and high-density traffic, demonstrating superior adaptability and efficiency. These findings highlight the DDPG model's enhanced capability to manage dynamic and complex driving environments, attributable to its refined policy learning approach, which balances exploration and exploitation. The novel reward function significantly promotes cooperative lane-changing behaviors among ICVs, optimizing lane-change decisions and improving overall traffic flow efficiency. This study not only provides valuable technical support for lane-changing decisions in smart vehicular networks but also lays a theoretical and empirical foundation for the advancement of future ITS. The insights gained from comparing the DQN and DDPG models contribute to the ongoing discussion on effective deep learning strategies for real-world ITS applications, potentially guiding future developments in autonomous driving technologies.
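The abstract contrasts DQN, which selects among discrete lane-change actions, with DDPG, which emits continuous control commands, but does not show what that difference looks like in practice. The following is a minimal sketch of the two action interfaces, not the paper's implementation: the names LANE_ACTIONS, dqn_act, and ddpg_act are hypothetical, and the continuous command layout (values clipped to [-1, 1]) is an assumption.

```python
import numpy as np

# DQN side: a discrete action set. The Q-network outputs one Q-value
# per action, and the agent picks an index epsilon-greedily.
LANE_ACTIONS = ["keep_lane", "change_left", "change_right"]

def dqn_act(q_values: np.ndarray, epsilon: float, rng: np.random.Generator) -> int:
    """Epsilon-greedy selection over the discrete lane-change actions."""
    if rng.random() < epsilon:
        return int(rng.integers(len(LANE_ACTIONS)))  # explore: random action index
    return int(np.argmax(q_values))                  # exploit: highest Q-value

# DDPG side: a continuous command (e.g. [steering, acceleration]).
# Exploration is additive noise on the actor output, not action switching.
def ddpg_act(actor_output: np.ndarray, noise_scale: float,
             rng: np.random.Generator) -> np.ndarray:
    """Actor output plus Gaussian exploration noise, clipped to [-1, 1]."""
    noisy = actor_output + noise_scale * rng.standard_normal(actor_output.shape)
    return np.clip(noisy, -1.0, 1.0)
```

This structural difference is one plausible reason for the reported result that DQN learns quickly at first while DDPG ultimately achieves finer control over speed and spacing: a continuous policy can make small corrective adjustments that a three-action policy cannot express.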
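The paper's stated novelty is a reward function that promotes cooperative lane changes, judged against average fleet speed, headway, and stability. Since the abstract gives no formula, the sketch below shows one plausible shape such a reward could take; every name, weight, and penalty term (w_speed, w_headway, w_coop, forced_neighbor_braking) is an illustrative assumption, not the paper's actual reward.

```python
import numpy as np

def cooperative_reward(fleet_speeds: np.ndarray, v_limit: float,
                       headway: float, desired_headway: float,
                       lane_change: bool, forced_neighbor_braking: bool,
                       w_speed: float = 1.0, w_headway: float = 0.5,
                       w_coop: float = 1.0) -> float:
    """Reward high average fleet speed and headway near the target;
    penalize lane changes that force neighboring vehicles to brake."""
    r_speed = w_speed * float(np.mean(fleet_speeds)) / v_limit   # fleet efficiency
    r_headway = -w_headway * abs(headway - desired_headway)      # spacing stability
    r_coop = -w_coop if (lane_change and forced_neighbor_braking) else 0.0
    return r_speed + r_headway + r_coop

# Example: a lane change that did not disturb neighbors, with headway
# slightly below target, yields a positive but discounted reward.
r = cooperative_reward(np.array([28.0, 30.0, 29.5]), v_limit=33.0,
                       headway=1.8, desired_headway=2.0,
                       lane_change=True, forced_neighbor_braking=False)
```

The key design idea such a shape captures is that the ego vehicle is scored partly on fleet-level quantities, so a lane change that benefits the ego but degrades neighbors' progress is net-penalized, which is what makes the learned behavior "cooperative".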
Source journal
Internet of Things
CiteScore: 3.60
Self-citation rate: 5.10%
Annual articles: 115
Review time: 37 days
Journal description: Internet of Things: Engineering Cyber Physical Human Systems is a comprehensive journal encouraging cross-collaboration between researchers, engineers, and practitioners in the field of IoT and cyber-physical human systems. The journal offers a unique platform to exchange scientific information on the entire breadth of the technology, science, and societal applications of the IoT. The journal places a high priority on timely publication and provides a home for high-quality work. Furthermore, the journal is interested in publishing topical special issues on any aspect of the IoT.