Feng Wei , Jucheng Yang , Yuan Wang , Liang Lin , Haibin Zhang
Expert Systems with Applications, Volume 275, Article 127028. Published 2025-03-03. DOI: 10.1016/j.eswa.2025.127028
https://www.sciencedirect.com/science/article/pii/S0957417425006505
Prior knowledge-guided multi-information graph convolutional network for driver drowsiness detection
Driver drowsiness detection has recently received significant research attention, primarily because of the rising number of road accidents caused by drowsy driving. To tackle this issue, computer vision has been employed to detect drowsy states by analyzing drivers' facial expressions. However, existing vision-based methods often rely on only one or two facial regions (i.e., the eyes and mouth) to detect drowsiness, and therefore fail to account for individual differences among drivers. Moreover, although facial regions are highly structured, existing methods employ non-structural architectures to model the drowsiness feature space; they lack guidance from prior knowledge and lose high-level detail features. To this end, we propose a prior knowledge-guided multi-information graph convolutional network (MIGCN) for driver drowsiness detection. Compared with drowsiness detection methods based on CNN models, the structural MIGCN can effectively learn spatial facial features, enhancing feature representation. The core of the proposed MIGCN consists of three modules: the multi-source features extraction module (MSFE), the multi-information representation module (MIRM), and the multi-information GCN and fusion module (MIGCN-F). The MSFE extracts multi-source features from five sources of prior knowledge and from the entire facial region to enhance the drowsiness feature space: the prior knowledge provides detailed information, and the entire facial region offers global high-level information. The MIRM injects class, attention, and temporal information into the multi-source features and also provides mean features from the same class, resulting in enriched multi-information features. Together, the MSFE and MIRM address the issue of drivers' individual differences. Additionally, we design a novel task-driven MIGCN-F module whose nodes are composed of multi-information features. This design not only preserves the multi-information of each node but also models the spatial relationships among multi-information features, thereby extracting more discriminative features. Experimental results on the DROZY and UTA-RLDD datasets show that MIGCN achieves accuracies of 94.78% and 97.52%, respectively, outperforming state-of-the-art methods by 1.68% and 2.84%, thus demonstrating its effectiveness.
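To make the idea of a graph whose nodes carry per-source features more concrete, the following is a minimal sketch of one generic graph convolution step followed by a simple fusion. It is not the authors' MIGCN-F: the abstract does not specify the edge structure, feature dimensions, or fusion rule, so the fully connected graph, the six nodes (five prior-knowledge sources plus the whole face), the dimensions, the random features, and the mean-pooling fusion are all assumptions chosen for illustration. The layer itself is the standard symmetric-normalized graph convolution.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN normalization."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # node degrees including self-loop
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(features, adj_norm, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ features @ weight, 0.0)


rng = np.random.default_rng(0)

# Hypothetical sizes: 6 nodes (5 prior-knowledge sources + whole face).
num_nodes, in_dim, out_dim = 6, 8, 4

# Assumption: every node is connected to every other node.
adj = np.ones((num_nodes, num_nodes)) - np.eye(num_nodes)

# Stand-ins for the per-node multi-information features and learned weights.
node_features = rng.normal(size=(num_nodes, in_dim))
weight = rng.normal(size=(in_dim, out_dim))

adj_norm = normalized_adjacency(adj)
node_out = gcn_layer(node_features, adj_norm, weight)

# Assumption: mean pooling over nodes as a placeholder for the fusion step.
fused = node_out.mean(axis=0)
print(node_out.shape, fused.shape)
```

In this sketch each output row still corresponds to one node, so per-node information is kept until the final pooling, while the multiplication by the normalized adjacency mixes information across nodes. A real model would learn `weight` and would likely use a sparser, anatomically motivated graph.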
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.