Jie Fan, Xudong Zhang, Yuan Zou, Yuanyuan Li, Yingqun Liu, Wenjing Sun
Applied Soft Computing, Volume 167, Article 112386. Published 2024-10-29. DOI: 10.1016/j.asoc.2024.112386. URL: https://www.sciencedirect.com/science/article/pii/S1568494624011608
Improving policy training for autonomous driving through randomized ensembled double Q-learning with Transformer encoder feature evaluation
In the burgeoning field of autonomous driving, reinforcement learning (RL) has gained prominence for its adaptability and intelligent decision-making. However, conventional RL methods struggle to extract relevant features efficiently from high-dimensional inputs and to make full use of environment-agent interaction data. To overcome these obstacles, this paper introduces a novel RL-based approach that integrates randomized ensembled double Q-learning (REDQ) with a Transformer encoder. The Transformer encoder's attention mechanism dynamically evaluates features according to their relevance in different driving scenarios. At the same time, REDQ, which is characterized by a high update-to-data (UTD) ratio, improves the utilization of interaction data during policy training. In particular, the ensemble strategy and in-target minimization in REDQ significantly improve training stability under high UTD conditions. Ablation studies indicate that the Transformer encoder extracts features significantly better than conventional network architectures, yielding a 13.6% to 21.4% increase in success rate on the MetaDrive autonomous driving task. Additionally, compared with standard RL methods, the proposed approach acquires reward faster and achieves a 67.5% to 69% improvement in success rate.
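As a rough illustration of the in-target minimization mentioned in the abstract, the sketch below computes a REDQ-style bootstrapped target by taking the minimum over a randomly chosen subset of ensemble Q-estimates. The function name `redq_target`, the array shapes, and the default values (`subset_size=2`, `gamma=0.99`) are assumptions made for illustration and are not taken from the paper; the entropy term used by soft actor-critic style variants is omitted for brevity.

```python
import numpy as np


def redq_target(rewards, dones, next_q_values, rng, subset_size=2, gamma=0.99):
    """Compute a REDQ-style target for a batch of transitions.

    rewards, dones: arrays of shape (B,)
    next_q_values:  array of shape (N, B) holding the target-network
                    Q-estimates of N ensemble members for the next state-action
    rng:            a numpy.random.Generator used to draw the random subset
    """
    n_ensemble = next_q_values.shape[0]
    # Randomized ensemble: pick a subset of members without replacement.
    idx = rng.choice(n_ensemble, size=subset_size, replace=False)
    # In-target minimization: element-wise minimum over the chosen subset.
    min_q = next_q_values[idx].min(axis=0)
    # Standard one-step bootstrapped target; terminal states do not bootstrap.
    return rewards + gamma * (1.0 - dones) * min_q


# Hypothetical batch of two transitions and an ensemble of three Q-functions.
rng = np.random.default_rng(0)
rewards = np.array([1.0, 0.0])
dones = np.array([0.0, 1.0])
next_q = np.array([[1.0, 5.0], [2.0, 4.0], [3.0, 3.0]])
targets = redq_target(rewards, dones, next_q, rng)
```

Under a high UTD ratio, a target of this kind would be recomputed with a freshly drawn subset for each of the several gradient updates performed per environment step, which is the setting in which the abstract reports the stability benefit.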
About the journal:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.