Geometry-Enhanced Implicit Function for Detailed Clothed Human Reconstruction With RGB-D Input

Impact Factor 8.4 · CAS Region 2 (Computer Science) · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Pengpeng Liu, Zhi Zeng, Qisheng Wang, Min Chen, Guixuan Zhang
DOI: 10.1049/cit2.70009
Journal: CAAI Transactions on Intelligence Technology, vol. 10, no. 3, pp. 858-870
Published: 2025-04-03 (Journal Article)
Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70009
Article page: https://onlinelibrary.wiley.com/doi/10.1049/cit2.70009
Citations: 0

Abstract

Realistic human reconstruction embraces an extensive range of applications as depth sensors advance. However, current state-of-the-art methods with RGB-D input still suffer from artefacts such as noisy surfaces, non-human shapes, and depth ambiguity, especially for the invisible parts. The authors observe that the main issue is a lack of geometric semantics arising from not fully exploiting the depth input priors. This paper focuses on improving the representation ability of the implicit function, exploring a method to utilise depth-related semantics effectively and efficiently. The proposed geometry-enhanced implicit function enriches the geometric semantics with extra voxel-aligned features from point clouds, promoting the completion of missing parts in unseen regions while preserving the local details of the input. To incorporate multi-scale pixel-aligned and voxel-aligned features, the authors use Squeeze-and-Excitation attention to capture and fully exploit channel interdependencies. For multi-view reconstruction, the proposed depth-enhanced attention explicitly drives the network to "sense" the geometric structure for more reasonable feature aggregation. Experiments show that the method outperforms current RGB- and depth-based state-of-the-art methods on challenging data from Twindom and Thuman3.0, achieving a detailed and complete human reconstruction while balancing performance and efficiency well.
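The core idea described above can be illustrated with a minimal, NumPy-only sketch: query-point features sampled from an image (pixel-aligned) and from a point-cloud voxel grid (voxel-aligned) are concatenated, reweighted channel-wise by Squeeze-and-Excitation (SE) style attention, and fed to a small MLP that predicts occupancy. All names, dimensions, weights, and the tiny MLP here are hypothetical placeholders; this is not the paper's actual architecture, only a sketch of the general technique.

```python
# Hypothetical sketch of SE-fused pixel/voxel-aligned features feeding an
# implicit occupancy function. Dimensions and weights are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_attention(feats, w1, w2):
    """SE-style channel reweighting of an (N, C) feature batch.

    Squeeze: average over the N query points gives a C-dim descriptor.
    Excite: a two-layer bottleneck MLP yields per-channel gates in (0, 1).
    """
    z = feats.mean(axis=0)                          # squeeze -> (C,)
    gates = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))   # excite  -> (C,)
    return feats * gates                            # rescale channels

# N query points with hypothetical feature widths.
N, C_pix, C_vox = 4, 8, 8
pixel_feats = rng.standard_normal((N, C_pix))  # sampled from image feature maps
voxel_feats = rng.standard_normal((N, C_vox))  # sampled from a voxelised point cloud

C = C_pix + C_vox
fused = np.concatenate([pixel_feats, voxel_feats], axis=1)  # (N, C)

# SE bottleneck with reduction ratio 4 (random stand-in weights).
r = 4
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
fused = se_attention(fused, w1, w2)

# Implicit function: a tiny MLP mapping fused features to occupancy in (0, 1).
w_h = rng.standard_normal((32, C)) * 0.1
w_o = rng.standard_normal((1, 32)) * 0.1
occupancy = sigmoid(w_o @ np.maximum(w_h @ fused.T, 0.0)).ravel()  # (N,)

print(occupancy.shape)  # one occupancy value per query point
```

In the paper's setting, the voxel-aligned branch is what injects the depth-derived geometric semantics; the SE gates let the network decide, per channel, how much to trust image appearance versus point-cloud geometry before the occupancy query.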


Source journal: CAAI Transactions on Intelligence Technology (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
CiteScore: 11.00
Self-citation rate: 3.90%
Annual articles: 134
Review time: 35 weeks
Journal description: CAAI Transactions on Intelligence Technology is a leading venue for original research on the theoretical and experimental aspects of artificial intelligence technology. It is a fully open access journal co-published by the Institution of Engineering and Technology (IET) and the Chinese Association for Artificial Intelligence (CAAI), providing research that is openly accessible to read and share worldwide.