Modeling framework of human driving behavior based on Deep Maximum Entropy Inverse Reinforcement Learning

IF 2.8 · CAS Tier 3 (Physics & Astronomy) · Q2 PHYSICS, MULTIDISCIPLINARY
{"title":"基于深度最大熵反强化学习的人类驾驶行为建模框架","authors":"","doi":"10.1016/j.physa.2024.130052","DOIUrl":null,"url":null,"abstract":"<div><p>Driving behavior modeling is extremely crucial for designing safe, intelligent, and personalized autonomous driving systems. In this paper, a modeling framework based on Markov Decision Processes (MDPs) is introduced that emulates drivers’ decision-making processes. The framework combines the Deep Maximum Entropy Inverse Reinforcement Learning (Deep MEIRL) and a reinforcement learning algorithm-proximal strategy optimization (PPO). A neural network structure is customized for Deep MEIRL, which uses the velocity of the ego vehicle, the pedestrian position, the velocity of surrounding vehicles, the lateral distance, the surrounding vehicles’ type, and the distance to the crosswalk to recover the nonlinear reward function. The dataset of drone-based video footage is collected in Xi’an (China) to train and validate the framework. The outcomes demonstrate that Deep MEIRL-PPO outperforms traditional modeling frameworks (Maximum Entropy Inverse Reinforcement Learning (MEIRL) - PPO) in modeling and predicting human driving behavior. Specifically, in predicting human driving behavior, Deep MEIRL-PPO outperforms MEIRL-PPO by 50.71% and 43.90% on the basis of the MAE and HD, respectively. Furthermore, it is discovered that Deep MEIRL-PPO accurately learns the behavior of human drivers avoiding potential conflicts when lines of sight are occluded. This research can contribute to aiding self-driving vehicles in learning human driving behavior and avoiding unforeseen risks.</p></div>","PeriodicalId":20152,"journal":{"name":"Physica A: Statistical Mechanics and its Applications","volume":null,"pages":null},"PeriodicalIF":2.8000,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Modeling framework of human driving behavior based on Deep Maximum Entropy Inverse Reinforcement Learning\",\"authors\":\"\",\"doi\":\"10.1016/j.physa.2024.130052\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Driving behavior modeling is extremely crucial for designing safe, intelligent, and personalized autonomous driving systems. In this paper, a modeling framework based on Markov Decision Processes (MDPs) is introduced that emulates drivers’ decision-making processes. The framework combines the Deep Maximum Entropy Inverse Reinforcement Learning (Deep MEIRL) and a reinforcement learning algorithm-proximal strategy optimization (PPO). A neural network structure is customized for Deep MEIRL, which uses the velocity of the ego vehicle, the pedestrian position, the velocity of surrounding vehicles, the lateral distance, the surrounding vehicles’ type, and the distance to the crosswalk to recover the nonlinear reward function. The dataset of drone-based video footage is collected in Xi’an (China) to train and validate the framework. The outcomes demonstrate that Deep MEIRL-PPO outperforms traditional modeling frameworks (Maximum Entropy Inverse Reinforcement Learning (MEIRL) - PPO) in modeling and predicting human driving behavior. Specifically, in predicting human driving behavior, Deep MEIRL-PPO outperforms MEIRL-PPO by 50.71% and 43.90% on the basis of the MAE and HD, respectively. Furthermore, it is discovered that Deep MEIRL-PPO accurately learns the behavior of human drivers avoiding potential conflicts when lines of sight are occluded. 
This research can contribute to aiding self-driving vehicles in learning human driving behavior and avoiding unforeseen risks.</p></div>\",\"PeriodicalId\":20152,\"journal\":{\"name\":\"Physica A: Statistical Mechanics and its Applications\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2024-08-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Physica A: Statistical Mechanics and its Applications\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0378437124005612\",\"RegionNum\":3,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"PHYSICS, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Physica A: Statistical Mechanics and its Applications","FirstCategoryId":"101","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0378437124005612","RegionNum":3,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PHYSICS, MULTIDISCIPLINARY","Score":null,"Total":0}
Citations: 0

Abstract


Driving behavior modeling is crucial for designing safe, intelligent, and personalized autonomous driving systems. In this paper, a modeling framework based on Markov Decision Processes (MDPs) is introduced to emulate drivers’ decision-making processes. The framework combines Deep Maximum Entropy Inverse Reinforcement Learning (Deep MEIRL) with a reinforcement learning algorithm, Proximal Policy Optimization (PPO). A neural network structure customized for Deep MEIRL recovers the nonlinear reward function from the ego vehicle’s velocity, the pedestrian position, the velocities of surrounding vehicles, the lateral distance, the surrounding vehicles’ types, and the distance to the crosswalk. A dataset of drone-based video footage collected in Xi’an (China) is used to train and validate the framework. The results demonstrate that Deep MEIRL-PPO outperforms the traditional modeling framework (Maximum Entropy Inverse Reinforcement Learning (MEIRL)-PPO) in modeling and predicting human driving behavior. Specifically, in predicting human driving behavior, Deep MEIRL-PPO outperforms MEIRL-PPO by 50.71% and 43.90% in terms of MAE and HD, respectively. Furthermore, Deep MEIRL-PPO is found to accurately learn how human drivers avoid potential conflicts when lines of sight are occluded. This research can help self-driving vehicles learn human driving behavior and avoid unforeseen risks.
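
The paper itself includes no code. As an illustration of the reward-recovery step it describes, the sketch below shows one plausible shape for the Deep MEIRL reward network, taking the six state features named in the abstract as input; the layer sizes, activations, and names are assumptions, not the paper’s actual architecture.

```python
import torch
import torch.nn as nn

class RewardNet(nn.Module):
    """Nonlinear reward r(s) over hand-picked state features (sketch).

    The six inputs mirror the abstract: ego-vehicle velocity, pedestrian
    position, surrounding-vehicle velocity, lateral distance,
    surrounding-vehicle type, and distance to the crosswalk.
    Hidden sizes are illustrative assumptions.
    """

    def __init__(self, n_features: int = 6, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar reward for one state
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features).squeeze(-1)
```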
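The MEIRL objective maximizes the likelihood of the expert demonstrations under a maximum-entropy trajectory distribution; its gradient reduces to the gap between the reward accumulated on expert-visited states and the reward accumulated on states visited by the current policy (here, the PPO policy trained on the learned reward). Below is a minimal, sample-based update step under that formulation, assuming expert and learner states are already encoded as feature tensors; the paper’s exact estimator may differ.

```python
import torch

def deep_meirl_step(reward_net: torch.nn.Module,
                    optimizer: torch.optim.Optimizer,
                    expert_states: torch.Tensor,
                    learner_states: torch.Tensor) -> float:
    """One Deep MEIRL reward update (sample-based sketch).

    `expert_states` would come from the drone-footage demonstrations;
    `learner_states` from rollouts of the current PPO policy, which
    approximate the expected state-visitation term of the
    maximum-entropy gradient.
    """
    gap = reward_net(expert_states).mean() - reward_net(learner_states).mean()
    loss = -gap  # ascend the expert-vs-learner reward gap
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the full framework the two stages would alternate: PPO re-optimizes its policy against the current reward, fresh rollouts are collected, and the reward network takes another gradient step, until the learner’s behavior matches the demonstrations.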
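Evaluation compares predicted and observed behavior via MAE and HD. Assuming HD denotes the symmetric Hausdorff distance, as is common in trajectory-prediction work (the abstract does not expand the acronym), both metrics can be computed as follows; the time-aligned pairing and 2-D trajectory layout are assumptions.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def mae(pred: np.ndarray, truth: np.ndarray) -> float:
    """Mean absolute error over time-aligned trajectory samples."""
    return float(np.mean(np.abs(pred - truth)))

def hausdorff(pred: np.ndarray, truth: np.ndarray) -> float:
    """Symmetric Hausdorff distance between two (N, 2) point sets."""
    return max(directed_hausdorff(pred, truth)[0],
               directed_hausdorff(truth, pred)[0])
```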

Source journal
Physica A: Statistical Mechanics and its Applications
CiteScore: 7.20 · Self-citation rate: 9.10% · Articles published: 852 · Review time: 6.6 months

About the journal: Physica A: Statistical Mechanics and its Applications, recognized by the European Physical Society, publishes research in the field of statistical mechanics and its applications. Statistical mechanics sets out to explain the behaviour of macroscopic systems by studying the statistical properties of their microscopic constituents. Applications of the techniques of statistical mechanics are widespread and include: applications to physical systems such as solids, liquids and gases; applications to chemical and biological systems (colloids, interfaces, complex fluids, polymers and biopolymers, cell physics); and other interdisciplinary applications to, for instance, biological, economic, and sociological systems.