Face reconstruction with detailed skin features via three selfie images

IF 3.1, Zone 4 (Computer Science), JCR Q2 (COMPUTER SCIENCE, INFORMATION SYSTEMS)

Journal of Visual Communication and Image Representation. Pub Date: 2025-07-26. DOI: 10.1016/j.jvcir.2025.104529

Yakun Ju, Bandara Dissanayake, Rachel Ang, Ling Li, Dennis Sng, Alex Kot

Volume 111, Article 104529. Publication type: Journal Article. https://www.sciencedirect.com/science/article/pii/S1047320325001439

Citations: 0

Abstract

Accurate 3D reconstruction of facial skin features, such as acne, pigmentation, and wrinkles, is essential for digital facial analysis, virtual aesthetics, and dermatological diagnostics. However, high-fidelity reconstruction of skin detail from limited, in-the-wild inputs such as selfie images remains largely underexplored. The Hierarchical Representation Network (HRN) excels at reconstructing facial geometry from limited images but struggles with skin detail fidelity and multi-view matching. In this work, we present a lightweight and deployable system that reconstructs detailed 3D face models from only three guided portrait images. We address HRN's limitations by increasing its output resolution, improving skin detail precision, and introducing a novel multi-view texture map fusion framework that uses illumination normalization and linear blending to improve texture clarity. To correct eye direction inconsistencies, we integrate a segmentation network that refines the eye regions. We further develop a mobile prototype application that guides users through video-based face capture and enables real-time model generation. The system has been successfully applied in real-world settings. Our dataset, featuring annotated portraits of fair-skinned Asian females with visible skin conditions, serves as a benchmark for evaluation. This is the first benchmark focusing on skin-level 3D reconstruction from selfie-level inputs. We validated our method through ablation, comparison, and perception studies, all of which demonstrated clear improvements in texture fidelity and fine detail. These results indicate the method's practical value for 3D facial skin reconstruction.
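
The abstract names illumination normalization and linear blending as the two steps of the multi-view texture fusion but does not say how either is computed. The following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the three views have already been unwrapped into aligned UV texture maps with values in [0, 1], that normalization is per-channel mean and standard deviation matching against a reference view, and that each view comes with a per-texel visibility weight map. The function names (normalize_illumination, blend_textures, fuse_views) are hypothetical.

```python
import numpy as np

def normalize_illumination(texture, mask, target_mean, target_std):
    """Shift and scale each colour channel so that the valid region
    matches a target mean and standard deviation (assumed scheme)."""
    out = texture.astype(np.float64)  # astype returns a copy
    valid = mask > 0
    for c in range(out.shape[2]):
        vals = out[..., c][valid]
        if vals.size == 0:
            continue  # nothing visible in this view
        mean, std = vals.mean(), vals.std()
        scale = target_std[c] / std if std > 1e-8 else 1.0
        out[..., c] = (out[..., c] - mean) * scale + target_mean[c]
    return np.clip(out, 0.0, 1.0)

def blend_textures(textures, weights):
    """Linear blending: per-texel weighted average over the views."""
    tex = np.stack(textures).astype(np.float64)          # (V, H, W, 3)
    w = np.stack(weights).astype(np.float64)[..., None]  # (V, H, W, 1)
    total = np.maximum(w.sum(axis=0), 1e-8)              # avoid divide by zero
    return (tex * w).sum(axis=0) / total

def fuse_views(textures, weights, ref_index=0):
    """Normalize every view to the reference view's statistics, then blend.
    Assumes the reference view has at least one texel with positive weight."""
    ref = textures[ref_index].astype(np.float64)
    ref_valid = weights[ref_index] > 0
    n_channels = ref.shape[2]
    target_mean = [ref[..., c][ref_valid].mean() for c in range(n_channels)]
    target_std = [ref[..., c][ref_valid].std() for c in range(n_channels)]
    normalized = [
        normalize_illumination(t, w, target_mean, target_std)
        for t, w in zip(textures, weights)
    ]
    return blend_textures(normalized, weights)
```

In this sketch the weight maps do double duty as visibility masks, so a texel seen by only one view takes that view's colour unchanged, while texels seen by several views are averaged in proportion to their weights. With a frontal, left, and right capture, the frontal view would be a natural choice for ref_index.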

Source journal

Journal of Visual Communication and Image Representation (Engineering & Technology: Computer Science, Software Engineering)

CiteScore: 5.40
Self-citation rate: 11.50%
Publication volume: 188
Review time: 9.9 months

About the journal: The Journal of Visual Communication and Image Representation publishes papers on state-of-the-art visual communication and image representation, with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems.