{"title":"基于相对姿态一致性的头部姿态估计域自适应","authors":"Felix Kuhnke;Jörn Ostermann","doi":"10.1109/TBIOM.2023.3237039","DOIUrl":null,"url":null,"abstract":"Head pose estimation plays a vital role in biometric systems related to facial and human behavior analysis. Typically, neural networks are trained on head pose datasets. Unfortunately, manual or sensor-based annotation of head pose is impractical. A solution is synthetic training data generated from 3D face models, which can provide an infinite number of perfect labels. However, computer generated images only provide an approximation of real-world images, leading to a performance gap between training and application domain. Therefore, there is a need for strategies that allow simultaneous learning on labeled synthetic data and unlabeled real-world data to overcome the domain gap. In this work we propose relative pose consistency, a semi-supervised learning strategy for head pose estimation based on consistency regularization. Consistency regularization enforces consistent network predictions under random image augmentations, including pose-preserving and pose-altering augmentations. We propose a strategy to exploit the relative pose introduced by pose-altering augmentations between augmented image pairs, to allow the network to benefit from relative pose labels during training on unlabeled data. We evaluate our approach in a domain-adaptation scenario and in a commonly used cross-dataset scenario. Furthermore, we reproduce related works to enforce consistent evaluation protocols and show that for both scenarios we outperform SOTA.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 3","pages":"348-359"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8423754/10210132/10021684.pdf","citationCount":"1","resultStr":"{\"title\":\"Domain Adaptation for Head Pose Estimation Using Relative Pose Consistency\",\"authors\":\"Felix Kuhnke;Jörn Ostermann\",\"doi\":\"10.1109/TBIOM.2023.3237039\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Head pose estimation plays a vital role in biometric systems related to facial and human behavior analysis. Typically, neural networks are trained on head pose datasets. Unfortunately, manual or sensor-based annotation of head pose is impractical. A solution is synthetic training data generated from 3D face models, which can provide an infinite number of perfect labels. However, computer generated images only provide an approximation of real-world images, leading to a performance gap between training and application domain. Therefore, there is a need for strategies that allow simultaneous learning on labeled synthetic data and unlabeled real-world data to overcome the domain gap. In this work we propose relative pose consistency, a semi-supervised learning strategy for head pose estimation based on consistency regularization. Consistency regularization enforces consistent network predictions under random image augmentations, including pose-preserving and pose-altering augmentations. We propose a strategy to exploit the relative pose introduced by pose-altering augmentations between augmented image pairs, to allow the network to benefit from relative pose labels during training on unlabeled data. We evaluate our approach in a domain-adaptation scenario and in a commonly used cross-dataset scenario. Furthermore, we reproduce related works to enforce consistent evaluation protocols and show that for both scenarios we outperform SOTA.\",\"PeriodicalId\":73307,\"journal\":{\"name\":\"IEEE transactions on biometrics, behavior, and identity science\",\"volume\":\"5 3\",\"pages\":\"348-359\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/iel7/8423754/10210132/10021684.pdf\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on biometrics, behavior, and identity science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10021684/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on biometrics, behavior, and identity science","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10021684/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Domain Adaptation for Head Pose Estimation Using Relative Pose Consistency
Head pose estimation plays a vital role in biometric systems related to facial and human behavior analysis. Typically, neural networks are trained on head pose datasets. Unfortunately, manual or sensor-based annotation of head pose is impractical. A solution is synthetic training data generated from 3D face models, which can provide an infinite number of perfect labels. However, computer generated images only provide an approximation of real-world images, leading to a performance gap between training and application domain. Therefore, there is a need for strategies that allow simultaneous learning on labeled synthetic data and unlabeled real-world data to overcome the domain gap. In this work we propose relative pose consistency, a semi-supervised learning strategy for head pose estimation based on consistency regularization. Consistency regularization enforces consistent network predictions under random image augmentations, including pose-preserving and pose-altering augmentations. We propose a strategy to exploit the relative pose introduced by pose-altering augmentations between augmented image pairs, to allow the network to benefit from relative pose labels during training on unlabeled data. We evaluate our approach in a domain-adaptation scenario and in a commonly used cross-dataset scenario. Furthermore, we reproduce related works to enforce consistent evaluation protocols and show that for both scenarios we outperform SOTA.