{"title":"基于自监督学习的测试时间自适应注视估计","authors":"Pengwei Yin;Jingjing Wang;Xiaojun Wu","doi":"10.1109/TCE.2024.3523486","DOIUrl":null,"url":null,"abstract":"Gaze estimation plays a significant role in consumer electronics, particularly in the realm of user interface and interactive technology. While existing methods rely on either few-shot adaptation requiring annotated samples or unsupervised domain adaptation necessitating source domain data, these approaches face limitations due to the high cost of annotation and data privacy concerns. This paper addresses this critical gap by introducing a novel test-time adaptation framework for gaze estimation that operates without the need for source domain data or annotated samples for adaptation. Here, we present a dual-objective training strategy that combines supervised and self-supervised learning on the source domain, with a particular focus on a face and eye reconstruction task designed to enhance the learning of head pose and eye direction features crucial for gaze estimation. At test time, our model undergoes adaptation solely through fine-tuning with the self-supervised objective, optimizing the model’s ability to estimate gaze in new, unseen scenarios. Our extensive experiments on benchmarks validate the effectiveness of our approach, demonstrating improved generalization capabilities without the dependency on expensive annotations or sensitive source domain data.","PeriodicalId":13208,"journal":{"name":"IEEE Transactions on Consumer Electronics","volume":"71 1","pages":"75-89"},"PeriodicalIF":4.3000,"publicationDate":"2024-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Test-Time Adaptation With Self-Supervised Learning for Gaze Estimation\",\"authors\":\"Pengwei Yin;Jingjing Wang;Xiaojun Wu\",\"doi\":\"10.1109/TCE.2024.3523486\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Gaze estimation plays a significant role in consumer electronics, particularly in the realm of user interface and interactive technology. While existing methods rely on either few-shot adaptation requiring annotated samples or unsupervised domain adaptation necessitating source domain data, these approaches face limitations due to the high cost of annotation and data privacy concerns. This paper addresses this critical gap by introducing a novel test-time adaptation framework for gaze estimation that operates without the need for source domain data or annotated samples for adaptation. Here, we present a dual-objective training strategy that combines supervised and self-supervised learning on the source domain, with a particular focus on a face and eye reconstruction task designed to enhance the learning of head pose and eye direction features crucial for gaze estimation. At test time, our model undergoes adaptation solely through fine-tuning with the self-supervised objective, optimizing the model’s ability to estimate gaze in new, unseen scenarios. 
Our extensive experiments on benchmarks validate the effectiveness of our approach, demonstrating improved generalization capabilities without the dependency on expensive annotations or sensitive source domain data.\",\"PeriodicalId\":13208,\"journal\":{\"name\":\"IEEE Transactions on Consumer Electronics\",\"volume\":\"71 1\",\"pages\":\"75-89\"},\"PeriodicalIF\":4.3000,\"publicationDate\":\"2024-12-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Consumer Electronics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10817535/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Consumer Electronics","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10817535/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Test-Time Adaptation With Self-Supervised Learning for Gaze Estimation
Abstract: Gaze estimation plays a significant role in consumer electronics, particularly in the realm of user interface and interactive technology. While existing methods rely on either few-shot adaptation requiring annotated samples or unsupervised domain adaptation necessitating source domain data, these approaches face limitations due to the high cost of annotation and data privacy concerns. This paper addresses this critical gap by introducing a novel test-time adaptation framework for gaze estimation that operates without the need for source domain data or annotated samples for adaptation. Here, we present a dual-objective training strategy that combines supervised and self-supervised learning on the source domain, with a particular focus on a face and eye reconstruction task designed to enhance the learning of head pose and eye direction features crucial for gaze estimation. At test time, our model undergoes adaptation solely through fine-tuning with the self-supervised objective, optimizing the model’s ability to estimate gaze in new, unseen scenarios. Our extensive experiments on benchmarks validate the effectiveness of our approach, demonstrating improved generalization capabilities without the dependency on expensive annotations or sensitive source domain data.
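The abstract outlines a two-phase recipe: dual-objective training on the source domain (supervised gaze regression plus a self-supervised face and eye reconstruction task), followed by test-time adaptation that fine-tunes only the self-supervised objective on unlabeled target data. The sketch below is a minimal, hypothetical PyTorch illustration of that recipe, not the authors' implementation: the encoder, the autoencoder-style reconstruction head, the L1/MSE losses, and the weighting factor `lam` are assumptions made purely for illustration.

```python
# Minimal sketch (not the paper's released code) of the dual-objective idea:
# supervised gaze regression plus self-supervised reconstruction during source
# training, then fine-tuning with the reconstruction loss alone at test time.
# All architectural details and loss weights below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GazeModel(nn.Module):
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        # Shared encoder over face crops (any input resolution works here
        # because of the adaptive pooling layer).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        # Supervised head: 2D gaze direction (pitch, yaw).
        self.gaze_head = nn.Linear(feat_dim, 2)
        # Self-supervised head: reconstruct a downsampled face image.
        self.recon_head = nn.Sequential(
            nn.Linear(feat_dim, 64 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (64, 8, 8)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.gaze_head(z), self.recon_head(z)


def source_training_step(model, optimizer, images, gaze_labels, lam=0.1):
    """Dual-objective step on labeled source data: gaze L1 + reconstruction MSE."""
    gaze_pred, recon = model(images)
    target = F.interpolate(images, size=recon.shape[-2:])
    loss = F.l1_loss(gaze_pred, gaze_labels) + lam * F.mse_loss(recon, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


def test_time_adaptation_step(model, optimizer, images):
    """At test time only the self-supervised reconstruction objective is optimized."""
    _, recon = model(images)
    target = F.interpolate(images, size=recon.shape[-2:])
    loss = F.mse_loss(recon, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch one would run `source_training_step` over labeled source batches, then, at deployment, apply a few `test_time_adaptation_step` updates on unlabeled target-domain batches before predicting gaze with `model(x)[0]`.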
Journal Introduction:
The main focus of the IEEE Transactions on Consumer Electronics is the engineering and research aspects of the theory, design, construction, manufacture, or end use of mass-market electronics, systems, software, and services for consumers.