Geometry-Enhanced Implicit Function for Detailed Clothed Human Reconstruction With RGB-D Input

Pengpeng Liu, Zhi Zeng, Qisheng Wang, Min Chen, Guixuan Zhang

CAAI Transactions on Intelligence Technology, vol. 10, no. 3, pp. 858-870, published 2025-04-03. DOI: 10.1049/cit2.70009
Full text: https://onlinelibrary.wiley.com/doi/10.1049/cit2.70009
Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70009
Abstract
Realistic human reconstruction finds an ever-wider range of applications as depth sensors advance. However, current state-of-the-art methods with RGB-D input still suffer from artefacts such as noisy surfaces, non-human shapes, and depth ambiguity, especially on invisible parts. The authors observe that the main cause is a lack of geometric semantics: the priors carried by the depth input are not fully exploited. This paper focuses on improving the representation ability of the implicit function, exploring a method that utilises depth-related semantics both effectively and efficiently. The proposed geometry-enhanced implicit function enriches the geometric semantics with extra voxel-aligned features extracted from the point cloud, promoting the completion of missing parts in unseen regions while preserving the local details of the input. To incorporate multi-scale pixel-aligned and voxel-aligned features, the authors use Squeeze-and-Excitation attention to capture and fully exploit channel interdependencies. For multi-view reconstruction, the proposed depth-enhanced attention explicitly drives the network to "sense" the geometric structure, yielding a more reasonable feature aggregation. Experiments show that the method outperforms current RGB- and depth-based state-of-the-art methods on challenging data from Twindom and THuman3.0, achieving detailed and complete human reconstructions while balancing performance and efficiency.
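The abstract gives no implementation details, but the core idea of combining pixel-aligned (2D image) and voxel-aligned (3D volume) features at query points is common to PIFu-style pipelines. The PyTorch sketch below illustrates that general pattern; all names, tensor shapes, and layer sizes are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pixel_aligned(feat2d, xy):
    """Sample 2D image features at projected query points.
    feat2d: (B, C, H, W); xy: (B, N, 2), normalised to [-1, 1]."""
    grid = xy.unsqueeze(1)                                  # (B, 1, N, 2)
    out = F.grid_sample(feat2d, grid, align_corners=True)   # (B, C, 1, N)
    return out.squeeze(2)                                   # (B, C, N)

def voxel_aligned(feat3d, xyz):
    """Sample 3D volume features (e.g. from a point-cloud encoder) at 3D points.
    feat3d: (B, C, D, H, W); xyz: (B, N, 3), normalised to [-1, 1]."""
    b, c = feat3d.shape[:2]
    grid = xyz.view(b, 1, 1, -1, 3)                         # (B, 1, 1, N, 3)
    out = F.grid_sample(feat3d, grid, align_corners=True)   # (B, C, 1, 1, N)
    return out.view(b, c, -1)                               # (B, C, N)

class ImplicitHead(nn.Module):
    """MLP mapping fused per-point features to occupancy in [0, 1]."""
    def __init__(self, in_dim, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, feats):                               # feats: (B, C, N)
        return self.mlp(feats.transpose(1, 2)).squeeze(-1)  # (B, N)
```

The surface is then extracted by evaluating the occupancy on a dense grid and running marching cubes, as is standard for implicit-function methods.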
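Squeeze-and-Excitation attention is a generic, published block (Hu et al., CVPR 2018). A minimal sketch of how it could re-weight the channels of concatenated pixel- and voxel-aligned point features follows; the reduction ratio and fusion layout are assumptions, not the paper's exact module.

```python
class SEFusion(nn.Module):
    """Squeeze-and-Excitation over the channels of concatenated
    pixel-aligned and voxel-aligned features (B, C, N)."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, pixel_feat, voxel_feat):
        x = torch.cat([pixel_feat, voxel_feat], dim=1)  # (B, C, N)
        s = x.mean(dim=2)                               # squeeze: pool over points
        w = self.excite(s).unsqueeze(-1)                # excite: (B, C, 1) channel weights
        return x * w                                    # channel-wise re-weighting
```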
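The depth-enhanced attention for multi-view aggregation is described only at a high level. One plausible reading, sketched below, computes a per-view, per-point score from the features plus a depth-derived geometric cue and fuses views by softmax-weighted sum; the score network and the form of the cue are hypothetical.

```python
class DepthEnhancedFusion(nn.Module):
    """Aggregate per-point features from V views with attention weights
    conditioned on a per-view geometric cue (e.g. depth-based visibility).
    Purely illustrative; the paper's formulation may differ."""
    def __init__(self, channels):
        super().__init__()
        # +1 input channel for the geometric cue appended to each view's features
        self.score = nn.Sequential(
            nn.Linear(channels + 1, channels), nn.ReLU(inplace=True),
            nn.Linear(channels, 1),
        )

    def forward(self, view_feats, depth_cue):
        # view_feats: (V, B, C, N); depth_cue: (V, B, 1, N)
        x = torch.cat([view_feats, depth_cue], dim=2)    # (V, B, C+1, N)
        s = self.score(x.permute(0, 1, 3, 2))            # (V, B, N, 1) per-view scores
        w = torch.softmax(s, dim=0).permute(0, 1, 3, 2)  # (V, B, 1, N) across views
        return (view_feats * w).sum(dim=0)               # (B, C, N) fused features
```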
About the Journal
CAAI Transactions on Intelligence Technology is a leading venue for original research on the theoretical and experimental aspects of artificial intelligence technology. It is a fully open-access journal co-published by the Institution of Engineering and Technology (IET) and the Chinese Association for Artificial Intelligence (CAAI), making its research freely accessible to read and share worldwide.