Dense 3D Face Reconstruction from a Single RGB Image
Authors: Jianxu Mao, Yifeng Zhang, Caiping Liu, Ziming Tao, Junfei Yi, Yaonan Wang
Published in: 2022 IEEE 25th International Conference on Computational Science and Engineering (CSE), 2022-12-01
DOI: 10.1109/CSE57773.2022.00013 (https://doi.org/10.1109/CSE57773.2022.00013)
Monocular 3D face reconstruction is a computer vision problem of extraordinary difficulty. Poor handling of large poses and of facial details (such as wrinkles, moles, and beards) is a common deficiency of most existing monocular 3D face reconstruction methods. To resolve these two defects, we propose an end-to-end system that produces detailed 3D face reconstructions and remains robust under varied backgrounds, pose rotations, and occlusions. To obtain facial detail information, we use an image-to-image translation network (p2p-net for short) to estimate a depth map from the input RGB image pixel by pixel. This per-pixel estimation provides depth values for facial details, and we use a procedure similar to image inpainting to recover missing details. At the same time, to handle pose rotation and occlusion, we use CNNs to estimate a basic facial model based on the 3D Morphable Model (3DMM), which compensates for the facial regions unseen in the input image and, by being fitted to the dense depth map, reduces the deviation of the final 3D model. We propose an Identity Shape Loss function to enhance the basic facial model, and we add a Multi-view Identity Loss that compares the features of the 3D face fusion and the ground truth from multiple viewing angles. The training data for p2p-net come from a 3D scanning system, and we augment the dataset to a larger size for more general training. We evaluate our method on in-the-wild face images and compare it with other state-of-the-art 3D face reconstruction methods. The qualitative and quantitative comparisons show that our method performs well in both robustness and accuracy, especially on non-frontal poses.
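The abstract names a 3DMM-based basic facial model and a Multi-view Identity Loss but gives no formulas for either. The following is a minimal sketch under stated assumptions, not the authors' implementation. It uses the standard linear 3DMM form (mean shape plus a weighted sum of basis vectors) and, purely as an assumption, treats the multi-view loss as the mean cosine distance between per-view feature vectors. The function names `morphable_shape`, `cosine_distance`, and `multi_view_identity_loss` and the toy values are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def morphable_shape(mean_shape: Sequence[float],
                    basis: Sequence[Sequence[float]],
                    coeffs: Sequence[float]) -> Vector:
    """Linear 3DMM: flattened vertices = mean + sum_k coeffs[k] * basis[k]."""
    if len(basis) != len(coeffs):
        raise ValueError("one coefficient is needed per basis vector")
    shape = list(mean_shape)
    for weight, component in zip(coeffs, basis):
        if len(component) != len(shape):
            raise ValueError("basis vector length must match the mean shape")
        for i, value in enumerate(component):
            shape[i] += weight * value
    return shape


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0 means the two feature vectors are aligned."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return 1.0 - dot / norm


def multi_view_identity_loss(pred_feats: Sequence[Sequence[float]],
                             gt_feats: Sequence[Sequence[float]]) -> float:
    """ASSUMED form of the loss: mean cosine distance over matching views.

    pred_feats[v] and gt_feats[v] are feature vectors of the reconstructed
    face and the ground truth rendered from the same view v.
    """
    if not pred_feats or len(pred_feats) != len(gt_feats):
        raise ValueError("need the same non-zero number of views on both sides")
    total = sum(cosine_distance(p, g) for p, g in zip(pred_feats, gt_feats))
    return total / len(pred_feats)


# Toy example: two vertices stored as (x, y, z, x, y, z) and two basis vectors.
mean = [0.0, 0.0, 1.0, 1.0, 0.0, 1.0]
basis = [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.5, 0.0, 0.0, 0.5]]
shape = morphable_shape(mean, basis, [2.0, -1.0])

# Two views: features agree in the first view and are orthogonal in the second.
loss = multi_view_identity_loss([[1.0, 0.0], [0.0, 2.0]],
                                [[1.0, 0.0], [2.0, 0.0]])
```

In a real pipeline the coefficients would come from the CNN regressor and the features from a face recognition network. Both are replaced here by hand-written toy values so that the sketch stays self-contained.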