AI-enabled driver assistance: monitoring head and gaze movements for enhanced safety

IF 5.0 · CAS Zone 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Sayyed Mudassar Shah, Gan Zengkang, Zhaoyun Sun, Tariq Hussain, Khalid Zaman, Abdullah Alwabli, Amar Y. Jaffar, Farman Ali
{"title":"AI-enabled driver assistance: monitoring head and gaze movements for enhanced safety","authors":"Sayyed Mudassar Shah, Gan Zengkang, Zhaoyun Sun, Tariq Hussain, Khalid Zaman, Abdullah Alwabli, Amar Y. Jaffar, Farman Ali","doi":"10.1007/s40747-025-01897-7","DOIUrl":null,"url":null,"abstract":"<p>This paper introduces a real-time head-pose detection and eye-gaze estimation system for Automatic Driver Assistance Technology (ADAT) aimed at enhancing driver safety by accurately collecting and transmitting data on the driver’s head position and eye gaze to mitigate potential risks. Existing methods are constrained by significant limitations, including reduced accuracy under challenging conditions such as varying head orientations and lighting, higher latency in real-time applications (e.g., Faster-RCNN and TPH-YOLOv5), and computational inefficiency, which hinders their deployment in resource-constrained environments. To address these challenges, we propose a novel framework using the Transformer Detection of Gaze Head - YOLOv7 (TDGH-YOLOv7) object detector. The key contributions of this work include the development of a reference image dataset encompassing diverse vertical and horizontal gaze positions alongside the implementation of an optimized detection system that achieves state-of-the-art performance in terms of accuracy and latency. The proposed system achieves superior precision, with a weighted accuracy of 95.02% and Root Mean Square Errors of 2.23 and 1.68 for vertical and horizontal gaze estimation, respectively, validated on the MPII-Gaze and DG-Unicamp datasets. A comprehensive comparative analysis with existing models, such as CNN, SSD, Faster-RCNN, and YOLOv8, underscores the robustness and efficiency of the proposed approach. Finally, the implications of these findings are discussed, and potential avenues for future research are outlined.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"133 1","pages":""},"PeriodicalIF":5.0000,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Complex & Intelligent Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s40747-025-01897-7","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

This paper introduces a real-time head-pose detection and eye-gaze estimation system for Automatic Driver Assistance Technology (ADAT) aimed at enhancing driver safety by accurately collecting and transmitting data on the driver’s head position and eye gaze to mitigate potential risks. Existing methods are constrained by significant limitations, including reduced accuracy under challenging conditions such as varying head orientations and lighting, higher latency in real-time applications (e.g., Faster-RCNN and TPH-YOLOv5), and computational inefficiency, which hinders their deployment in resource-constrained environments. To address these challenges, we propose a novel framework using the Transformer Detection of Gaze Head - YOLOv7 (TDGH-YOLOv7) object detector. The key contributions of this work include the development of a reference image dataset encompassing diverse vertical and horizontal gaze positions alongside the implementation of an optimized detection system that achieves state-of-the-art performance in terms of accuracy and latency. The proposed system achieves superior precision, with a weighted accuracy of 95.02% and Root Mean Square Errors of 2.23 and 1.68 for vertical and horizontal gaze estimation, respectively, validated on the MPII-Gaze and DG-Unicamp datasets. A comprehensive comparative analysis with existing models, such as CNN, SSD, Faster-RCNN, and YOLOv8, underscores the robustness and efficiency of the proposed approach. Finally, the implications of these findings are discussed, and potential avenues for future research are outlined.
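The paper's implementation is not reproduced on this page, so the sketches below are illustrative only. The first shows what the real-time detection stage of such a system typically looks like; since TDGH-YOLOv7 is not publicly released, a stock YOLOv8 model from the `ultralytics` package stands in for the paper's detector, and the camera index and confidence threshold are assumptions.

```python
# Minimal sketch of a driver-facing detection loop. The model here is a
# generic YOLOv8 stand-in, NOT the paper's TDGH-YOLOv7, which is unreleased.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # stand-in detector (assumption)

cap = cv2.VideoCapture(0)  # hypothetical in-cabin driver-facing camera
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Detect objects in the current frame; in the paper's pipeline, detected
    # head/eye regions would feed the downstream gaze-angle estimation.
    for result in model(frame, conf=0.5, verbose=False):
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0].tolist())
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.imshow("driver monitoring (sketch)", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```

The abstract's reported figures (weighted accuracy of 95.02%; RMSEs of 2.23 and 1.68 for vertical and horizontal gaze) suggest the standard metric definitions below. The angle values in the example are invented purely for illustration, and the per-class weighting scheme is an assumption about how "weighted accuracy" is computed.

```python
import numpy as np

def gaze_rmse(pred_angles, true_angles):
    """Root Mean Square Error over gaze angles (e.g., in degrees)."""
    pred = np.asarray(pred_angles, dtype=float)
    true = np.asarray(true_angles, dtype=float)
    return float(np.sqrt(np.mean((pred - true) ** 2)))

def weighted_accuracy(per_class_acc, class_counts):
    """Per-class accuracies averaged with weights proportional to class size."""
    acc = np.asarray(per_class_acc, dtype=float)
    n = np.asarray(class_counts, dtype=float)
    return float(np.sum(acc * n) / np.sum(n))

# Invented predictions vs. ground truth for vertical/horizontal gaze (degrees).
v_rmse = gaze_rmse([1.0, -3.5, 10.2], [0.0, -2.0, 12.0])
h_rmse = gaze_rmse([5.1, 0.4, -7.9], [4.0, 1.0, -8.5])
print(f"vertical RMSE: {v_rmse:.2f}, horizontal RMSE: {h_rmse:.2f}")
```

In a deployed ADAT pipeline, the per-frame latency of the detection loop matters as much as these accuracy figures, which is the trade-off the abstract highlights against Faster-RCNN and TPH-YOLOv5.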

Source journal

Complex & Intelligent Systems (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
CiteScore: 9.60
Self-citation rate: 10.30%
Articles published: 297
Journal introduction: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.