Prior knowledge-guided multi-information graph convolutional network for driver drowsiness detection

IF 7.5 | Zone 1 (Computer Science) | Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Expert Systems with Applications | Pub Date: 2025-03-03 | DOI: 10.1016/j.eswa.2025.127028

Feng Wei, Jucheng Yang, Yuan Wang, Liang Lin, Haibin Zhang

Volume 275, Article 127028. Full text: https://www.sciencedirect.com/science/article/pii/S0957417425006505

Citations: 0

Abstract



Driver drowsiness detection has recently received significant research attention, primarily because of the rising number of road accidents caused by drowsy driving. To tackle this issue, computer vision has been employed to detect drowsy states by analyzing drivers’ facial expressions. However, existing computer-vision methods often rely on only one or two facial regions (i.e., the eyes and mouth) to detect drowsiness, and therefore fail to account for individual differences among drivers. Moreover, although facial regions are highly structured, existing methods model the drowsiness feature space with non-structural architectures, which lack guidance from prior knowledge and lose high-level detail features. To address these issues, we propose a prior knowledge-guided multi-information graph convolutional network (MIGCN) for driver drowsiness detection. Compared with CNN-based driver drowsiness detection methods, the structural MIGCN can effectively learn spatial facial features, enhancing feature representation. The core of the proposed MIGCN consists of three modules: the multi-source features extraction module (MSFE), the multi-information representation module (MIRM), and the multi-information GCN and fusion module (MIGCN-F). The MSFE extracts multi-source features from five prior-knowledge regions and the entire facial region to enhance the drowsiness feature space. The prior-knowledge regions provide detailed information, and the entire facial region offers global high-level information. The MIRM injects class, attention, and temporal information into the multi-source features and also provides mean features from the same class, resulting in enriched multi-information features. Together, the MSFE and MIRM address the issue of individual differences among drivers. In addition, we design a novel task-driven MIGCN-F module whose nodes are composed of multi-information features. This design not only preserves the multi-information of each node but also models the spatial relationships among multi-information features, thereby extracting more discriminative features. Experimental results on the DROZY and UTA-RLDD datasets show that MIGCN achieves accuracies of 94.78% and 97.52%, respectively, outperforming state-of-the-art methods by 1.68% and 2.84% and demonstrating its effectiveness.
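To make the idea of graph nodes built from per-region features more concrete, here is a minimal sketch of a single graph-convolution step over six nodes (five prior-knowledge regions plus the whole face). It is not the authors' MIGCN-F module: the abstract does not name the five regions, the graph topology, or the feature sizes, so the node labels, the star-shaped adjacency, the dimensions, the random weights, and the mean-pool fusion below are all assumptions chosen only for illustration. The propagation rule is the standard symmetric-normalized GCN layer, ReLU(D^-1/2 (A + I) D^-1/2 X W).

```python
import math
import random

# Placeholder node labels (assumption): the abstract does not name the five regions.
NODES = ["region_1", "region_2", "region_3", "region_4", "region_5", "full_face"]


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a symmetric adjacency matrix A."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(a_norm, features, weights):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    out = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]


def mean_pool(node_features):
    """Fuse node features by averaging over nodes (a stand-in for a fusion step)."""
    n = len(node_features)
    return [sum(col) / n for col in zip(*node_features)]


rng = random.Random(0)
n, in_dim, out_dim = len(NODES), 8, 4  # illustrative sizes, not from the paper
face = NODES.index("full_face")

# Assumed star topology: every region node is linked to the whole-face node.
adj = [[0.0] * n for _ in range(n)]
for i in range(n):
    if i != face:
        adj[i][face] = adj[face][i] = 1.0

x = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(n)]          # stand-in node features
w = [[rng.gauss(0, 0.1) for _ in range(out_dim)] for _ in range(in_dim)]  # untrained weights

a_norm = normalized_adjacency(adj)
h = gcn_layer(a_norm, x, w)  # updated per-node features
fused = mean_pool(h)         # single vector that a classifier head could consume
print(len(h), len(h[0]), len(fused))
```

In this sketch each region node mixes its own features with those of the whole-face node, which loosely mirrors the abstract's pairing of detailed regional information with global information. A trained model would learn `w` and would likely use a richer fusion than a plain mean.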


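Source journal: Expert Systems with Applications (Engineering Technology - Engineering: Electrical & Electronic)

CiteScore: 13.80

Self-citation rate: 10.60%

Articles published: 2045

Review time: 8.7 months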

About the journal: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.