Continuous Decision-Making in Lane Changing and Overtaking Maneuvers for Unmanned Vehicles: A Risk-Aware Reinforcement Learning Approach With Task Decomposition

Impact Factor: 14.0 · CAS Tier 1 (Engineering) · JCR Q1, Computer Science, Artificial Intelligence
Sifan Wu;Daxin Tian;Xuting Duan;Jianshan Zhou;Dezong Zhao;Dongpu Cao
DOI: 10.1109/TIV.2024.3380074
Journal: IEEE Transactions on Intelligent Vehicles, vol. 9, no. 4, pp. 4657-4674
Published: 2024-03-25 (Journal Article)
URL: https://ieeexplore.ieee.org/document/10477452/
Citations: 0

Abstract

Reinforcement learning methods have shown the ability to solve challenging scenarios in unmanned systems. However, solving long-time decision-making sequences in a highly complex environment, such as continuous lane change and overtaking in dense scenarios, remains challenging. Although existing unmanned vehicle systems have made considerable progress, minimizing driving risk is the first consideration. Risk-aware reinforcement learning is crucial for addressing potential driving risks. However, the variability of the risks posed by several risk sources is not considered by existing reinforcement learning algorithms applied in unmanned vehicles. Based on the above analysis, this study proposes a risk-aware reinforcement learning method with driving task decomposition to minimize the risk of various sources. Specifically, risk potential fields are constructed and combined with reinforcement learning to decompose the driving task. The proposed reinforcement learning framework uses different risk-branching networks to learn the driving task. Furthermore, a low-risk episodic sampling augmentation method for different risk branches is proposed to solve the shortage of high-quality samples and further improve sampling efficiency. Also, an intervention training strategy is employed wherein the artificial potential field (APF) is combined with reinforcement learning to speed up training and further ensure safety. Finally, the complete intervention risk classification twin delayed deep deterministic policy gradient-task decompose (IDRCTD3-TD) algorithm is proposed. Two scenarios with different difficulties are designed to validate the superiority of this framework. Results show that the proposed framework has remarkable improvements in performance.
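The abstract describes an intervention training strategy in which an artificial potential field (APF) is combined with the reinforcement learning policy to ensure safety. The paper's equations are not given here, so the sketch below uses a common APF formulation as an illustration: each surrounding vehicle contributes an anisotropic Gaussian risk field, and when the total risk at the ego position exceeds a threshold, the policy's action is overridden by a step down the risk gradient. The field shape parameters (`sigma_x`, `sigma_y`), the amplitude, and the intervention `threshold` are illustrative assumptions, not values from the paper.

```python
import numpy as np

def risk_potential(ego_pos, obstacles, sigma_x=8.0, sigma_y=2.0, amplitude=1.0):
    """Total risk at the ego position from surrounding vehicles.

    Each obstacle contributes an anisotropic Gaussian field (elongated
    along the lane direction). Parameter values are illustrative, not
    taken from the paper.
    """
    ego = np.asarray(ego_pos, dtype=float)
    total = 0.0
    for obs in obstacles:
        dx, dy = ego - np.asarray(obs, dtype=float)
        total += amplitude * np.exp(-(dx**2 / (2 * sigma_x**2)
                                      + dy**2 / (2 * sigma_y**2)))
    return total

def intervene(rl_action, ego_pos, obstacles, threshold=0.5):
    """Intervention rule: if the field risk at the ego position exceeds the
    threshold, replace the policy's action with a unit step down the risk
    gradient (away from high-risk regions); otherwise keep the RL action.
    Returns (action, intervened_flag)."""
    ego = np.asarray(ego_pos, dtype=float)
    risk = risk_potential(ego, obstacles)
    if risk <= threshold:
        return np.asarray(rl_action, dtype=float), False
    # Numerical (central-difference) gradient of the field at the ego position.
    eps = 1e-3
    grad = np.array([
        (risk_potential(ego + np.array([eps, 0.0]), obstacles)
         - risk_potential(ego - np.array([eps, 0.0]), obstacles)) / (2 * eps),
        (risk_potential(ego + np.array([0.0, eps]), obstacles)
         - risk_potential(ego - np.array([0.0, eps]), obstacles)) / (2 * eps),
    ])
    safe_action = -grad / (np.linalg.norm(grad) + 1e-9)  # unit step downhill
    return safe_action, True
```

In the paper's framework this intervention runs during training, so the TD3-based learner collects experience without entering high-risk states; the sketch above only illustrates the override logic, not the full IDRCTD3-TD algorithm.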
Source journal: IEEE Transactions on Intelligent Vehicles (Mathematics - Control and Optimization)
CiteScore: 12.10
Self-citation rate: 13.40%
Articles per year: 177
About the journal: The IEEE Transactions on Intelligent Vehicles (T-IV) is a premier platform for publishing peer-reviewed articles that present innovative research concepts, application results, significant theoretical findings, and application case studies in the field of intelligent vehicles. With a particular emphasis on automated vehicles within roadway environments, T-IV aims to raise awareness of pressing research and application challenges. Our focus is on providing critical information to the intelligent vehicle community, serving as a dissemination vehicle for IEEE ITS Society members and others interested in learning about the state-of-the-art developments and progress in research and applications related to intelligent vehicles. Join us in advancing knowledge and innovation in this dynamic field.