{"title":"用于驱动行为预测的蒸馏路由变压器","authors":"Jun Gao, Jiangang Yi, Yi Lu Murphey","doi":"10.4271/09-12-01-0003","DOIUrl":null,"url":null,"abstract":"<div>The uncertainty of a driver’s state, the variability of the traffic environment, and the complexity of road conditions have made driving behavior a critical factor affecting traffic safety. Accurate predicting of driving behavior is therefore crucial for ensuring safe driving. In this research, an efficient framework, distilled routing transformer (DRTR), is proposed for driving behavior prediction using multiple modality data, i.e., front view video frames and vehicle signals. First, a cross-modal attention distiller is introduced, which distills the cross-modal attention knowledge of a fusion-encoder transformer to guide the training of our DRTR and learn deep interactions between different modalities. Second, since the multi-modal learning usually requires information from the macro view to the micro view, a self-attention (SA)-routing module is custom-designed for SA layers in DRTR for dynamic scheduling of global and local attentions for each input instance. Finally, a Mogrifier long short-term memory (Mogrifier LSTM) network is employed for DRTR to predict driving behaviors. We applied our approach to real-world data collected during drives in both urban and freeway environments by an instrumented vehicle. The experimental results demonstrate that the DRTR can predict the imminent driving behavior effectively while enjoying a faster inference speed than other state-of-the-art (SOTA) baselines.</div>","PeriodicalId":42847,"journal":{"name":"SAE International Journal of Transportation Safety","volume":null,"pages":null},"PeriodicalIF":0.7000,"publicationDate":"2023-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Distilled Routing Transformer for Driving Behavior Prediction\",\"authors\":\"Jun Gao, Jiangang Yi, Yi Lu Murphey\",\"doi\":\"10.4271/09-12-01-0003\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div>The uncertainty of a driver’s state, the variability of the traffic environment, and the complexity of road conditions have made driving behavior a critical factor affecting traffic safety. Accurate predicting of driving behavior is therefore crucial for ensuring safe driving. In this research, an efficient framework, distilled routing transformer (DRTR), is proposed for driving behavior prediction using multiple modality data, i.e., front view video frames and vehicle signals. First, a cross-modal attention distiller is introduced, which distills the cross-modal attention knowledge of a fusion-encoder transformer to guide the training of our DRTR and learn deep interactions between different modalities. Second, since the multi-modal learning usually requires information from the macro view to the micro view, a self-attention (SA)-routing module is custom-designed for SA layers in DRTR for dynamic scheduling of global and local attentions for each input instance. Finally, a Mogrifier long short-term memory (Mogrifier LSTM) network is employed for DRTR to predict driving behaviors. We applied our approach to real-world data collected during drives in both urban and freeway environments by an instrumented vehicle. 
The experimental results demonstrate that the DRTR can predict the imminent driving behavior effectively while enjoying a faster inference speed than other state-of-the-art (SOTA) baselines.</div>\",\"PeriodicalId\":42847,\"journal\":{\"name\":\"SAE International Journal of Transportation Safety\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.7000,\"publicationDate\":\"2023-10-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"SAE International Journal of Transportation Safety\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.4271/09-12-01-0003\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"TRANSPORTATION SCIENCE & TECHNOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"SAE International Journal of Transportation Safety","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4271/09-12-01-0003","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"TRANSPORTATION SCIENCE & TECHNOLOGY","Score":null,"Total":0}
Distilled Routing Transformer for Driving Behavior Prediction
The uncertainty of a driver’s state, the variability of the traffic environment, and the complexity of road conditions make driving behavior a critical factor in traffic safety. Accurately predicting driving behavior is therefore crucial for safe driving. In this research, an efficient framework, the distilled routing transformer (DRTR), is proposed for driving behavior prediction from multi-modal data, i.e., front-view video frames and vehicle signals. First, a cross-modal attention distiller is introduced, which distills the cross-modal attention knowledge of a fusion-encoder transformer to guide the training of our DRTR and to learn deep interactions between the modalities. Second, since multi-modal learning usually requires information ranging from the macro view to the micro view, a self-attention (SA)-routing module is custom-designed for the SA layers in DRTR, dynamically scheduling global and local attention for each input instance. Finally, a Mogrifier long short-term memory (Mogrifier LSTM) network is employed in DRTR to predict driving behaviors. We applied our approach to real-world data collected by an instrumented vehicle during drives in both urban and freeway environments. The experimental results demonstrate that DRTR predicts imminent driving behavior effectively while achieving faster inference than other state-of-the-art (SOTA) baselines.
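To illustrate the first component, below is a minimal sketch of attention-map distillation: the student's cross-modal attention distributions are pulled toward those of a frozen fusion-encoder teacher via KL divergence. This is a generic knowledge-distillation formulation assumed for illustration; the paper's exact loss and layer pairing are not given here, and `attention_distill_loss` is a hypothetical name.

```python
# Sketch of cross-modal attention distillation: match the student's
# attention distributions to a frozen teacher's using KL divergence.
import torch
import torch.nn.functional as F

def attention_distill_loss(student_logits: torch.Tensor,
                           teacher_logits: torch.Tensor) -> torch.Tensor:
    # Both tensors: (batch, heads, query_len, key_len) raw attention
    # scores from a matched pair of cross-modal attention layers.
    log_p_student = F.log_softmax(student_logits, dim=-1)
    p_teacher = F.softmax(teacher_logits.detach(), dim=-1)  # no teacher grads
    # KL(teacher || student); 'batchmean' divides the summed KL by batch size.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean")
```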
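The SA-routing module is described only at a high level in the abstract. The following is a speculative sketch of one way to schedule global and local attention per input instance: a lightweight learned gate mixes the outputs of the two branches. The `SARouter` name and the pooled softmax gating scheme are assumptions, not the authors' design.

```python
import torch
import torch.nn as nn

class SARouter(nn.Module):
    """Per-instance soft routing between global and local attention branches."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, 2), nn.Softmax(dim=-1))

    def forward(self, x, global_attn: nn.Module, local_attn: nn.Module):
        # x: (batch, seq, dim). Mean-pool the sequence so the routing
        # weights are computed once per input instance.
        w = self.gate(x.mean(dim=1))                  # (batch, 2)
        out_g = global_attn(x)                        # full self-attention
        out_l = local_attn(x)                         # e.g., windowed attention
        return w[:, 0, None, None] * out_g + w[:, 1, None, None] * out_l
```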
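The Mogrifier LSTM (Melis et al., ICLR 2020) used as the prediction head modifies a standard LSTM by letting the input and hidden state gate each other for several alternating rounds before the cell update. A minimal single-cell sketch, with the round count and shapes chosen for illustration rather than taken from the paper:

```python
import torch
import torch.nn as nn

class MogrifierLSTMCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int, rounds: int = 5):
        super().__init__()
        self.lstm = nn.LSTMCell(input_size, hidden_size)
        self.rounds = rounds
        # Odd rounds gate x with h (Q); even rounds gate h with x (R).
        self.q = nn.ModuleList(nn.Linear(hidden_size, input_size, bias=False)
                               for _ in range((rounds + 1) // 2))
        self.r = nn.ModuleList(nn.Linear(input_size, hidden_size, bias=False)
                               for _ in range(rounds // 2))

    def forward(self, x, state):
        h, c = state
        for i in range(1, self.rounds + 1):  # alternating "mogrification"
            if i % 2 == 1:
                x = 2 * torch.sigmoid(self.q[i // 2](h)) * x
            else:
                h = 2 * torch.sigmoid(self.r[i // 2 - 1](x)) * h
        return self.lstm(x, (h, c))
```

In a behavior-prediction pipeline of this kind, the updated hidden state would then feed a classifier over the driving-behavior classes.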