Distracted driving recognition using Vision Transformer for human-machine co-driving
Authors: Huiqin Chen, Hao Liu, Xiexing Feng, Hailong Chen
Venue: 2021 5th CAA International Conference on Vehicular Control and Intelligence (CVCI)
Publication date: 2021-10-29
DOI: https://doi.org/10.1109/CVCI54083.2021.9661254
Citations: 2
Abstract
Accurate and timely recognition of driving behavior is the first problem to solve in achieving human-machine co-driving. Many traffic accidents are caused by distracted driving, which makes distracted driving recognition an important topic in traffic research. Existing approaches suffer from low accuracy due to insufficient data, or from poor real-time performance due to very deep neural networks. To overcome these shortcomings, we propose a distracted driving recognition model based on a fine-tuned Vision Transformer, called DDR-ViT-finetuned. The model was trained and tested on the State Farm dataset and compared with other methods. The experimental results show that the proposed model achieved the highest accuracy of the methods compared, 97.5%.
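The abstract reports a single accuracy figure without defining it. The sketch below shows one plausible reading, assuming the figure is top-1 classification accuracy over the ten class IDs (c0 to c9) of the State Farm Distracted Driver Detection dataset. The function names and the sample labels are hypothetical and for illustration only; they are not the authors' evaluation code or results.

```python
from collections import Counter

# Assumption: the ten class IDs of the State Farm Distracted Driver Detection
# dataset (c0 = safe driving, c1..c9 = distracted behaviors).
CLASSES = [f"c{i}" for i in range(10)]


def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("no samples to evaluate")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall per class: correct predictions for a class / samples of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Classes with no samples in y_true are skipped to avoid dividing by zero.
    return {c: hits[c] / totals[c] for c in CLASSES if totals[c]}


# Hypothetical labels, for illustration only (not the paper's results).
y_true = ["c0", "c0", "c1", "c2", "c2", "c2", "c9", "c9"]
y_pred = ["c0", "c1", "c1", "c2", "c2", "c0", "c9", "c9"]

accuracy = top1_accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
print(f"top-1 accuracy: {accuracy:.3f}")
```

Reporting per-class recall alongside the overall figure can show whether a high aggregate accuracy hides weak performance on individual behaviors.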