SGLPER: A safe end-to-end autonomous driving decision framework combining deep reinforcement learning and expert demonstrations via prioritized experience replay and the Gipps model

IF 3.7 · Region 2 (Engineering Technology) · Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Jianping Cui, Liang Yuan, Wendong Xiao, Teng Ran, Li He, Jianbo Zhang
DOI: 10.1016/j.displa.2025.103041
Journal: Displays, Volume 88, Article 103041
Published: 2025-04-07 (Journal Article)
Full text: https://www.sciencedirect.com/science/article/pii/S0141938225000782
Citations: 0

Abstract

Despite significant advancements in deep reinforcement learning (DRL), existing methods for autonomous driving often suffer from the cold-start problem, requiring extensive training to converge, and fail to fully address safety concerns in dynamic driving environments. To address these limitations, we propose an efficient DRL framework, SGLPER, which integrates Prioritized Experience Replay (PER), expert demonstrations, and a safe-speed calculation model to improve learning efficiency and decision-making safety. Specifically, PER mitigates the cold-start problem by prioritizing high-value experiences, accelerating training convergence. A Long Short-Term Memory (LSTM) network also captures spatiotemporal information from observed states, enabling the agent to make informed decisions based on past experiences in complex, dynamic traffic scenarios. The safety strategy incorporates the Gipps model, introducing relatively safe speed limits into the reinforcement learning (RL) process to enhance driving safety. Moreover, Kullback–Leibler (KL) divergence combines RL with expert demonstrations, enabling the agent to learn human-like driving behaviors effectively. Experimental results in two simulated driving scenarios validate the robustness and effectiveness of the proposed framework. Compared to traditional DRL methods, SGLPER demonstrates safer strategies, higher success rates, and faster convergence. This study presents a promising approach for developing safer, more efficient autonomous driving systems.
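The "relatively safe speed limit" referenced above comes from Gipps' 1981 car-following model. The abstract does not give the paper's exact parameterization, so the following is a minimal sketch of the standard Gipps braking constraint; the deceleration limits and reaction time used here are illustrative assumptions, not the authors' settings.

```python
import math

def gipps_safe_speed(gap, v_follow, v_lead, d=3.0, d_hat=3.0, tau=1.0):
    """Gipps braking-constraint speed (m/s): the highest speed the follower
    can adopt and still stop safely if the leader brakes hard.
    gap: front-to-rear distance to the leader (m)
    v_follow, v_lead: current speeds (m/s)
    d: follower's maximum deceleration (m/s^2, positive)
    d_hat: estimated leader deceleration (m/s^2, positive)
    tau: reaction time (s)"""
    disc = d * d * tau * tau + d * (2.0 * gap - v_follow * tau + v_lead ** 2 / d_hat)
    if disc < 0:  # no non-negative speed satisfies the constraint: brake fully
        return 0.0
    return max(0.0, -d * tau + math.sqrt(disc))
```

Used as a speed cap on the RL agent's action, this bound grows with the gap and shrinks as the leader slows, which is what makes it usable as a safety filter.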
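Prioritized Experience Replay, as named in the abstract, follows Schaul et al.'s proportional scheme: transitions are sampled with probability proportional to their TD error raised to a power alpha, with importance-sampling weights correcting the bias. The abstract does not describe SGLPER's buffer internals, so this is a hedged flat-array sketch of the core idea (a production version would use a sum-tree); the alpha and beta values are common defaults, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

class PrioritizedReplay:
    """Minimal proportional prioritized replay buffer."""
    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.buffer, self.priorities, self.pos = [], [], 0

    def add(self, transition):
        # New transitions get the current max priority so they are seen at least once.
        p = max(self.priorities, default=1.0)
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
            self.priorities.append(p)
        else:
            self.buffer[self.pos] = transition
            self.priorities[self.pos] = p
            self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        probs = np.array(self.priorities) ** self.alpha
        probs /= probs.sum()
        idx = rng.choice(len(self.buffer), size=batch_size, p=probs)
        # Importance-sampling weights, normalized by the max for stability.
        weights = (len(self.buffer) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return idx, [self.buffer[i] for i in idx], weights

    def update_priorities(self, idx, td_errors):
        for i, delta in zip(idx, td_errors):
            self.priorities[i] = abs(delta) + self.eps
```

High-TD-error transitions dominate sampling, which is the mechanism the abstract credits for easing the cold start: informative (e.g. expert or near-collision) experiences are replayed early and often.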
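The KL-divergence coupling to expert demonstrations is, in its simplest discrete-action form, a penalty D_KL(expert || agent) added to the RL loss. The abstract specifies neither the direction of the divergence nor the penalty weight used in SGLPER, so both are assumptions in this sketch, as is the helper name `imitation_regularized_loss`.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-8):
    """D_KL(p || q) between two discrete action distributions.
    eps guards against log(0) and division by zero."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def imitation_regularized_loss(rl_loss, expert_probs, agent_probs, lam=0.1):
    """RL objective plus a KL penalty pulling the agent's action
    distribution toward the expert's (lam is a hypothetical weight)."""
    return rl_loss + lam * kl_divergence(expert_probs, agent_probs)
```

Because the penalty vanishes when the two distributions match and grows as they diverge, minimizing the combined loss nudges the policy toward human-like behavior without replacing the reward signal.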
Source journal: Displays
Category: Engineering Technology - Engineering: Electrical & Electronic
CiteScore: 4.60
Self-citation rate: 25.60%
Articles per year: 138
Review time: 92 days
Journal description: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including the display-human interface. Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the displays community. Original research papers solving ergonomics issues at the display-human interface advance the effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human-factors engineers new to the field, will also occasionally be featured.