Neural Networks With Linear Adaptive Batch Normalization and Swarm Intelligence Calibration for Real-Time Gaze Estimation on Smartphones

Impact factor: 5 | Zone 2 (Computer Science) | JCR Q1 (Computer Science, Artificial Intelligence)
Gancheng Zhu, Yongkai Li, Shuai Zhang, Xiaoting Duan, Zehao Huang, Zhaomin Yao, Rong Wang, Zhiguo Wang
DOI: 10.1155/2024/2644725
Published: 2024-11-21, International Journal of Intelligent Systems, vol. 2024
Full text: https://onlinelibrary.wiley.com/doi/10.1155/2024/2644725
Citations: 0

Abstract

Eye tracking has emerged as a valuable tool for both research and clinical applications. However, traditional eye-tracking systems are often bulky and expensive, which limits their adoption. Advances in deep learning and edge computing have made smartphone eye tracking feasible, but the field still faces practical challenges related to large-scale datasets, model inference speed, and gaze estimation accuracy. This study introduces a new dataset of more than 3.2 million face images collected with recent phone models and presents a comprehensive smartphone eye-tracking pipeline comprising a deep neural network framework (MGazeNet), a personalized model calibration method, and a heuristic gaze signal filter. MGazeNet introduces a linear adaptive batch normalization module that efficiently combines eye and face features, achieving state-of-the-art gaze estimation accuracy of 1.59 cm on the GazeCapture dataset and 1.48 cm on our custom dataset. In addition, an algorithm that uses multiverse optimization to tune the hyperparameters of support vector regression (MVO–SVR) is proposed to improve eye-tracking calibration accuracy with 13 or fewer ground-truth gaze points, further improving gaze estimation accuracy to 0.89 cm. This integrated approach achieves accuracy comparable to that of research-grade eye trackers, opening new application possibilities for smartphone eye tracking.
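
The accuracy figures in the abstract are given in centimeters. The sketch below shows one way such a figure could be computed, assuming it is the mean Euclidean distance between predicted and ground-truth on-screen gaze points. That is a common convention for GazeCapture-style evaluation, but the abstract does not spell out the metric. The function name and the sample coordinates are hypothetical and are not taken from the paper.

```python
import math


def mean_euclidean_error_cm(predicted, actual):
    """Mean Euclidean distance (cm) between predicted and true gaze points.

    Assumption: both inputs are equal-length sequences of (x, y)
    on-screen coordinates already expressed in centimeters.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty point lists of equal length")
    distances = [math.dist(p, a) for p, a in zip(predicted, actual)]
    return sum(distances) / len(distances)


# Hypothetical coordinates in cm (illustrative values, not from the paper).
predicted = [(3.0, 4.0), (1.0, 1.0), (0.0, 2.0)]
actual = [(0.0, 0.0), (1.0, 1.0), (0.0, 0.0)]

error = mean_euclidean_error_cm(predicted, actual)
print(f"{error:.2f} cm")
```

Under this assumption, a lower value means better accuracy, which matches the abstract's description of calibration improving the result from 1.48 cm to 0.89 cm.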

Source journal
International Journal of Intelligent Systems (Engineering and Technology: Computer Science, Artificial Intelligence)
CiteScore: 11.30
Self-citation rate: 14.30%
Articles published: 304
Review time: 9 months
About the journal: The International Journal of Intelligent Systems serves as a forum for individuals interested in tapping into the vast theories based on intelligent systems construction. With its peer-reviewed format, the journal explores several fascinating editorials written by today's experts in the field. Because new developments are being introduced each day, there's much to be learned: examination, analysis creation, information retrieval, man–computer interactions, and more. The International Journal of Intelligent Systems uses charts and illustrations to demonstrate these ground-breaking issues, and encourages readers to share their thoughts and experiences.