{"title":"Domain Adaptation for Head Pose Estimation Using Relative Pose Consistency","authors":"Felix Kuhnke;Jörn Ostermann","doi":"10.1109/TBIOM.2023.3237039","DOIUrl":null,"url":null,"abstract":"Head pose estimation plays a vital role in biometric systems related to facial and human behavior analysis. Typically, neural networks are trained on head pose datasets. Unfortunately, manual or sensor-based annotation of head pose is impractical. A solution is synthetic training data generated from 3D face models, which can provide an infinite number of perfect labels. However, computer generated images only provide an approximation of real-world images, leading to a performance gap between training and application domain. Therefore, there is a need for strategies that allow simultaneous learning on labeled synthetic data and unlabeled real-world data to overcome the domain gap. In this work we propose relative pose consistency, a semi-supervised learning strategy for head pose estimation based on consistency regularization. Consistency regularization enforces consistent network predictions under random image augmentations, including pose-preserving and pose-altering augmentations. We propose a strategy to exploit the relative pose introduced by pose-altering augmentations between augmented image pairs, to allow the network to benefit from relative pose labels during training on unlabeled data. We evaluate our approach in a domain-adaptation scenario and in a commonly used cross-dataset scenario. Furthermore, we reproduce related works to enforce consistent evaluation protocols and show that for both scenarios we outperform SOTA.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 3","pages":"348-359"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8423754/10210132/10021684.pdf","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on biometrics, behavior, and identity science","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10021684/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
Head pose estimation plays a vital role in biometric systems related to facial and human behavior analysis. Typically, neural networks are trained on head pose datasets. Unfortunately, manual or sensor-based annotation of head pose is impractical. One solution is synthetic training data generated from 3D face models, which can provide an infinite number of perfect labels. However, computer-generated images only approximate real-world images, leading to a performance gap between the training and application domains. Therefore, there is a need for strategies that allow simultaneous learning on labeled synthetic data and unlabeled real-world data to overcome the domain gap. In this work, we propose relative pose consistency, a semi-supervised learning strategy for head pose estimation based on consistency regularization. Consistency regularization enforces consistent network predictions under random image augmentations, including pose-preserving and pose-altering augmentations. We propose a strategy that exploits the relative pose introduced by pose-altering augmentations between augmented image pairs, allowing the network to benefit from relative pose labels during training on unlabeled data. We evaluate our approach in a domain-adaptation scenario and in a commonly used cross-dataset scenario. Furthermore, we reproduce related works to enforce consistent evaluation protocols and show that for both scenarios we outperform the state of the art (SOTA).
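Since the abstract describes the mechanism only at a high level, the following is a minimal sketch of how a relative pose consistency loss could look. It is not the authors' implementation: the names `model`, `relative_pose_consistency_loss`, and `max_rot_deg` are hypothetical, PyTorch and torchvision are assumed, and an in-plane rotation stands in for the paper's pose-altering augmentations. The key idea it illustrates is that rotating an image by a known angle changes the roll component of the head pose by (approximately) that angle, so two augmented views of the same unlabeled image come with a free relative pose label.

```python
# Hedged sketch of relative pose consistency on unlabeled images.
# Assumptions (not from the paper's code): `model` maps a (B, 3, H, W)
# image batch to (B, 3) Euler angles (yaw, pitch, roll) in degrees.
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF


def relative_pose_consistency_loss(model, images, max_rot_deg=30.0):
    """Consistency loss over a batch of unlabeled real-world face crops."""
    # Pose-preserving view: photometric perturbation only (here, additive
    # noise as a simple stand-in); the head pose is unchanged.
    view_a = images + 0.05 * torch.randn_like(images)

    # Pose-altering view: random in-plane rotation per image. The rotation
    # angle is a known relative pose between the two views; only roll is
    # affected (up to the sign convention chosen for roll).
    angles = (torch.rand(images.size(0)) * 2 - 1) * max_rot_deg
    view_b = torch.stack(
        [TF.rotate(img, float(a)) for img, a in zip(images, angles)]
    )

    pred_a = model(view_a)  # (B, 3): yaw, pitch, roll
    pred_b = model(view_b)

    # Expected relative pose between the two views: zero for yaw and
    # pitch, the rotation angle for roll.
    rel = torch.zeros_like(pred_a)
    rel[:, 2] = angles.to(pred_a.device)

    # Enforce consistency up to the known relative pose.
    return F.l1_loss(pred_b, pred_a + rel)
```

In a domain-adaptation training loop, a term like this would plausibly be added to the supervised loss on labeled synthetic images, e.g. `loss = supervised_loss + lam * relative_pose_consistency_loss(model, unlabeled_batch)`, where the weighting `lam` is likewise a hypothetical hyperparameter.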