Residual feature learning with hierarchical calibration for gaze estimation

Zhengdan Yin, Sanping Zhou, Le Wang, Tao Dai, Gang Hua, Nanning Zheng

Machine Vision and Applications, published 2024-05-05. DOI: 10.1007/s00138-024-01545-z
Citations: 0
Abstract
Gaze estimation aims to predict accurate gaze direction from natural eye images, an extremely challenging task due to both random variations in head pose and person-specific biases. Existing works often learn features from binocular images independently and directly concatenate them for gaze estimation. In this paper, we propose a simple yet effective two-stage framework for gaze estimation, in which residual feature learning (RFL) and hierarchical gaze calibration (HGC) networks are designed to consistently improve estimation performance. Specifically, the RFL network extracts informative features by jointly exploring the symmetric and asymmetric factors between the left and right eyes, producing initial predictions that are as accurate as possible. The HGC network then cascades a person-specific transform module that refines the distribution of gaze points from coarse to fine, effectively compensating for the subjective bias in the initial predictions. Extensive experiments on both the EVE and MPIIGaze datasets show that our method outperforms state-of-the-art approaches.
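To make the two-stage design described above more concrete, here is a minimal PyTorch sketch. The module names (RFLNet, HGCModule), the layer sizes, the mean/residual split used to stand in for symmetric and asymmetric binocular factors, and the cascade of residual corrections are all illustrative assumptions based only on the abstract; the paper's actual architecture may differ.

```python
import torch
import torch.nn as nn

class RFLNet(nn.Module):
    """Stage 1 (assumed form): encode both eyes with a shared backbone, treat
    the mean of the two features as the symmetric component and the per-eye
    deviations from that mean as the asymmetric residuals."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(  # toy CNN stand-in for the real encoder
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.head = nn.Linear(3 * feat_dim, 2)  # coarse gaze (pitch, yaw)

    def forward(self, left_eye, right_eye):
        f_l = self.backbone(left_eye)
        f_r = self.backbone(right_eye)
        sym = 0.5 * (f_l + f_r)              # shared (symmetric) component
        res_l, res_r = f_l - sym, f_r - sym  # per-eye (asymmetric) residuals
        return self.head(torch.cat([sym, res_l, res_r], dim=1))

class HGCModule(nn.Module):
    """Stage 2 (assumed form): a cascade of small person-specific corrections
    that refine the coarse prediction step by step, e.g. fitted from a few
    calibration samples per subject."""
    def __init__(self, num_stages=3):
        super().__init__()
        self.stages = nn.ModuleList(nn.Linear(2, 2) for _ in range(num_stages))
        for s in self.stages:  # zero-init so each stage starts as the identity
            nn.init.zeros_(s.weight)
            nn.init.zeros_(s.bias)

    def forward(self, gaze):
        for stage in self.stages:
            gaze = gaze + stage(gaze)  # residual coarse-to-fine refinement
        return gaze

# Usage: coarse prediction from RFL, then person-specific calibration by HGC.
rfl, hgc = RFLNet(), HGCModule()
left = torch.randn(4, 3, 36, 60)   # batch of left-eye crops
right = torch.randn(4, 3, 36, 60)  # batch of right-eye crops
refined = hgc(rfl(left, right))    # final gaze direction, shape (4, 2)
```

The zero-initialized residual stages make the calibration cascade start out as a no-op, so fitting it to a handful of per-subject samples can only move the coarse prediction, never destroy it; this is one plausible way to realize the coarse-to-fine bias compensation the abstract describes.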
About the journal
Machine Vision and Applications publishes high-quality technical contributions in machine vision research and development. Specifically, the editors encourage submissions on all applications and engineering aspects of image-related computing. In particular, original contributions dealing with scientific, commercial, industrial, military, and biomedical applications of machine vision are all within the scope of the journal.
Particular emphasis is placed on engineering and technology aspects of image processing and computer vision.
The following aspects of machine vision applications are of interest: algorithms, architectures, VLSI implementations, AI techniques and expert systems for machine vision, front-end sensing, multidimensional and multisensor machine vision, real-time techniques, image databases, virtual reality and visualization. Papers must include a significant experimental validation component.