Dense 3D Face Reconstruction from a Single RGB Image

Jianxu Mao, Yifeng Zhang, Caiping Liu, Ziming Tao, Junfei Yi, Yaonan Wang
{"title":"Dense 3D Face Reconstruction from a Single RGB Image","authors":"Jianxu Mao, Yifeng Zhang, Caiping Liu, Ziming Tao, Junfei Yi, Yaonan Wang","doi":"10.1109/CSE57773.2022.00013","DOIUrl":null,"url":null,"abstract":"Monocular 3D face reconstruction is a computer vision problem of extraordinary difficulty. Restrictions of large poses and facial details(such as wrinkles, moles, beards etc.) are the common deficiencies of the most existing monocular 3D face reconstruction methods. To resolve the two defects, we propose an end-to-end system to provide 3D reconstructions of faces with details which express robustly under various backgrounds, pose rotations and occlusions. To obtain the facial detail informations, we leverage the image-to-image translation network (we call it p2p-net for short) to make pixel to pixel estimation from the input RGB image to depth map. This precise per-pixel estimation can provide depth value for facial details. And we use a procedure similar to image inpainting to recover the missing details. Simultaneously, for adapting pose rotation and resolving occlusions, we use CNNs to estimate a basic facial model based on 3D Morphable Model(3DMM), which can compensate the unseen facial part in the input image and decrease the deviation of final 3D model by fitting with the dense depth map. We propose an Identity Shape Loss function to enhance the basic facial model and we add a Multi-view Identity Loss that compare the features of the 3D face fusion and the ground truth from multi-view angles. The training data for p2p-net is from 3D scanning system, and we augment the dataset to a larger magnitude for a more generic training. Comparing to other state-of-the-art methods of 3D face reconstruction, we evaluate our method on in-the-wild face images. the qualitative and quantitative comparison show that our method performs both well on robustness and accuracy especially when facing non-frontal pose problems.","PeriodicalId":165085,"journal":{"name":"2022 IEEE 25th International Conference on Computational Science and Engineering (CSE)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE 25th International Conference on Computational Science and Engineering (CSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CSE57773.2022.00013","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Monocular 3D face reconstruction is a computer vision problem of extraordinary difficulty. Poor handling of large poses and of facial details (such as wrinkles, moles, and beards) are common deficiencies of most existing monocular 3D face reconstruction methods. To resolve these two defects, we propose an end-to-end system that produces detailed 3D face reconstructions which remain robust under varied backgrounds, pose rotations, and occlusions. To obtain facial detail information, we leverage an image-to-image translation network (p2p-net for short) to make a pixel-to-pixel estimation from the input RGB image to a depth map. This precise per-pixel estimation provides depth values for facial details, and we use a procedure similar to image inpainting to recover missing details. Simultaneously, to accommodate pose rotation and resolve occlusions, we use CNNs to estimate a basic facial model based on the 3D Morphable Model (3DMM), which compensates for the unseen facial parts in the input image and, by fitting to the dense depth map, decreases the deviation of the final 3D model. We propose an Identity Shape Loss to enhance the basic facial model, and we add a Multi-view Identity Loss that compares features of the fused 3D face and the ground truth from multiple view angles. The training data for p2p-net comes from a 3D scanning system, and we augment the dataset to a larger magnitude for more generic training. We evaluate our method against other state-of-the-art 3D face reconstruction methods on in-the-wild face images; the qualitative and quantitative comparisons show that our method performs well in both robustness and accuracy, especially on non-frontal poses.
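The abstract names two loss terms but gives no closed forms. The following PyTorch sketch shows one plausible reading, assuming the Identity Shape Loss is an L2 penalty on the identity-only 3DMM shape and the Multi-view Identity Loss compares face-recognition embeddings of multi-view renders; `render_fn`, `feat_net`, and the yaw angles are hypothetical stand-ins, not the authors' implementation:

```python
import torch
import torch.nn.functional as F

def identity_shape_loss(pred_id_shape, gt_id_shape):
    """Hypothetical Identity Shape Loss: mean squared distance between the
    predicted identity-only 3DMM shape and the ground-truth identity shape."""
    return F.mse_loss(pred_id_shape, gt_id_shape)

def multi_view_identity_loss(render_fn, feat_net, fused_face, gt_face,
                             yaw_angles=(-45.0, 0.0, 45.0)):
    """Hypothetical Multi-view Identity Loss: render the fused 3D face and the
    ground truth from several yaw angles (assumed viewpoints), embed each render
    with a frozen face-recognition network `feat_net`, and penalize the cosine
    distance between the corresponding identity embeddings."""
    losses = []
    for yaw in yaw_angles:
        pred_img = render_fn(fused_face, yaw)  # (B, 3, H, W), assumed differentiable render
        gt_img = render_fn(gt_face, yaw)
        f_pred = F.normalize(feat_net(pred_img), dim=-1)  # unit-norm identity embedding
        f_gt = F.normalize(feat_net(gt_img), dim=-1)
        losses.append(1.0 - (f_pred * f_gt).sum(dim=-1).mean())  # cosine distance
    return torch.stack(losses).mean()
```

Under this reading, the two terms play complementary roles: the shape loss anchors the coarse 3DMM geometry, while the multi-view term checks that the fused detailed face still reads as the same person from non-frontal viewpoints.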