Meng-Tzu Chiu;Hsun-Ying Cheng;Chien-Yi Wang;Shang-Hong Lai
{"title":"基于身份风格解缠和深度增强的RGB-D人脸识别","authors":"Meng-Tzu Chiu;Hsun-Ying Cheng;Chien-Yi Wang;Shang-Hong Lai","doi":"10.1109/TBIOM.2022.3233769","DOIUrl":null,"url":null,"abstract":"Deep learning approaches achieve highly accurate face recognition by training the models with huge face image datasets. Unlike 2D face image datasets, there is a lack of large 3D face datasets available to the public. Existing public 3D face datasets were usually collected with few subjects, leading to the over-fitting problem. This paper proposes two CNN models to improve the RGB-D face recognition task. The first is a segmentation-aware depth estimation network, called DepthNet, which estimates depth maps from RGB face images by exploiting semantic segmentation for more accurate face region localization. The other is a novel segmentation-guided RGB-D face recognition model that contains an RGB recognition branch, a depth map recognition branch, and an auxiliary segmentation mask branch. In our multi-modality face recognition model, a feature disentanglement scheme is employed to factorize the feature representation into identity-related and style-related components. DepthNet is applied to augment a large 2D face image dataset to a large RGB-D face dataset, which is used for training our RGB-D face recognition model. Our experimental results show that DepthNet can produce more reliable depth maps from face images with the segmentation mask. Our multi-modality face recognition model fully exploits the depth map and outperforms state-of-the-art methods on several public 3D face datasets with challenging variations.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 3","pages":"334-347"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"RGB-D Face Recognition With Identity-Style Disentanglement and Depth Augmentation\",\"authors\":\"Meng-Tzu Chiu;Hsun-Ying Cheng;Chien-Yi Wang;Shang-Hong Lai\",\"doi\":\"10.1109/TBIOM.2022.3233769\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep learning approaches achieve highly accurate face recognition by training the models with huge face image datasets. Unlike 2D face image datasets, there is a lack of large 3D face datasets available to the public. Existing public 3D face datasets were usually collected with few subjects, leading to the over-fitting problem. This paper proposes two CNN models to improve the RGB-D face recognition task. The first is a segmentation-aware depth estimation network, called DepthNet, which estimates depth maps from RGB face images by exploiting semantic segmentation for more accurate face region localization. The other is a novel segmentation-guided RGB-D face recognition model that contains an RGB recognition branch, a depth map recognition branch, and an auxiliary segmentation mask branch. In our multi-modality face recognition model, a feature disentanglement scheme is employed to factorize the feature representation into identity-related and style-related components. DepthNet is applied to augment a large 2D face image dataset to a large RGB-D face dataset, which is used for training our RGB-D face recognition model. Our experimental results show that DepthNet can produce more reliable depth maps from face images with the segmentation mask. Our multi-modality face recognition model fully exploits the depth map and outperforms state-of-the-art methods on several public 3D face datasets with challenging variations.\",\"PeriodicalId\":73307,\"journal\":{\"name\":\"IEEE transactions on biometrics, behavior, and identity science\",\"volume\":\"5 3\",\"pages\":\"334-347\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on biometrics, behavior, and identity science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10011574/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on biometrics, behavior, and identity science","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10011574/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
RGB-D Face Recognition With Identity-Style Disentanglement and Depth Augmentation
Deep learning approaches achieve highly accurate face recognition by training the models with huge face image datasets. Unlike 2D face image datasets, there is a lack of large 3D face datasets available to the public. Existing public 3D face datasets were usually collected with few subjects, leading to the over-fitting problem. This paper proposes two CNN models to improve the RGB-D face recognition task. The first is a segmentation-aware depth estimation network, called DepthNet, which estimates depth maps from RGB face images by exploiting semantic segmentation for more accurate face region localization. The other is a novel segmentation-guided RGB-D face recognition model that contains an RGB recognition branch, a depth map recognition branch, and an auxiliary segmentation mask branch. In our multi-modality face recognition model, a feature disentanglement scheme is employed to factorize the feature representation into identity-related and style-related components. DepthNet is applied to augment a large 2D face image dataset to a large RGB-D face dataset, which is used for training our RGB-D face recognition model. Our experimental results show that DepthNet can produce more reliable depth maps from face images with the segmentation mask. Our multi-modality face recognition model fully exploits the depth map and outperforms state-of-the-art methods on several public 3D face datasets with challenging variations.