Neural Networks With Linear Adaptive Batch Normalization and Swarm Intelligence Calibration for Real-Time Gaze Estimation on Smartphones

IF 5; Zone 2 (Computer Science); Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

International Journal of Intelligent Systems. Pub Date: 2024-11-21. DOI: 10.1155/2024/2644725

Gancheng Zhu, Yongkai Li, Shuai Zhang, Xiaoting Duan, Zehao Huang, Zhaomin Yao, Rong Wang, Zhiguo Wang


Citations: 0

Abstract



Eye tracking has emerged as a valuable tool for both research and clinical applications. However, traditional eye-tracking systems are often bulky and expensive, limiting their widespread adoption in various fields. Smartphone eye tracking has become feasible with advanced deep learning and edge computing technologies. However, the field still faces practical challenges related to large-scale datasets, model inference speed, and gaze estimation accuracy. The present study created a new dataset that contains over 3.2 million face images collected with recent phone models and presents a comprehensive smartphone eye-tracking pipeline comprising a deep neural network framework (MGazeNet), a personalized model calibration method, and a heuristic gaze signal filter. The MGazeNet model introduced a linear adaptive batch normalization module to efficiently combine eye and face features, achieving the state-of-the-art gaze estimation accuracy of 1.59 cm on the GazeCapture dataset and 1.48 cm on our custom dataset. In addition, an algorithm that utilizes multiverse optimization to optimize the hyperparameters of support vector regression (MVO–SVR) was proposed to improve eye-tracking calibration accuracy with 13 or fewer ground-truth gaze points, further improving gaze estimation accuracy to 0.89 cm. This integrated approach allows for eye tracking with accuracy comparable to that of research-grade eye trackers, offering new application possibilities for smartphone eye tracking.
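Research-grade eye trackers usually report accuracy in degrees of visual angle, so the centimetre errors quoted above can be converted for a rough comparison. The sketch below is illustrative only: the 30 cm viewing distance is an assumption not given in the abstract, and the function name `cm_error_to_degrees` is hypothetical.

```python
import math


def cm_error_to_degrees(error_cm: float, viewing_distance_cm: float) -> float:
    """Convert an on-screen gaze error (cm) to visual angle (degrees).

    Treats the true gaze target as lying on the line of sight and the
    estimate as offset by error_cm in the screen plane.
    """
    if viewing_distance_cm <= 0:
        raise ValueError("viewing distance must be positive")
    return math.degrees(math.atan(error_cm / viewing_distance_cm))


# ASSUMPTION: 30 cm is an illustrative phone viewing distance,
# not a value reported in the abstract.
ASSUMED_DISTANCE_CM = 30.0

# Error values taken from the abstract.
reported_errors_cm = {
    "GazeCapture (uncalibrated)": 1.59,
    "custom dataset (uncalibrated)": 1.48,
    "after MVO-SVR calibration": 0.89,
}

for label, error_cm in reported_errors_cm.items():
    angle = cm_error_to_degrees(error_cm, ASSUMED_DISTANCE_CM)
    print(f"{label}: {error_cm:.2f} cm is about {angle:.2f} deg")
```

The angular values scale with the assumed distance, so a different viewing distance gives different figures; the abstract does not state the distance used in the study.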


Source journal

International Journal of Intelligent Systems (Engineering and Technology: Computer Science, Artificial Intelligence)

CiteScore

11.30

Self-citation rate

14.30%

Articles published

304

Review time

9 months

About the journal: The International Journal of Intelligent Systems serves as a forum for individuals interested in tapping into the vast theories based on intelligent systems construction. With its peer-reviewed format, the journal explores several fascinating editorials written by today's experts in the field. Because new developments are being introduced each day, there's much to be learned: examination, analysis creation, information retrieval, man–computer interactions, and more. The International Journal of Intelligent Systems uses charts and illustrations to demonstrate these ground-breaking issues, and encourages readers to share their thoughts and experiences.