Representing Animatable Avatar via Factorized Neural Fields

IF 2.9 · CAS Tier 4 (Computer Science) · JCR Q2 (Computer Science, Software Engineering)
Chunjin Song, Zhijie Wu, Bastian Wandt, Leonid Sigal, Helge Rhodin
{"title":"通过分解神经场表示可动画的化身","authors":"Chunjin Song,&nbsp;Zhijie Wu,&nbsp;Bastian Wandt,&nbsp;Leonid Sigal,&nbsp;Helge Rhodin","doi":"10.1111/cgf.70192","DOIUrl":null,"url":null,"abstract":"<p>For reconstructing high-fidelity human 3D models from monocular videos, it is crucial to maintain consistent large-scale body shapes along with finely matched subtle wrinkles. This paper explores how per-frame rendering results can be factorized into a pose-independent component and a corresponding pose-dependent counterpart to facilitate frame consistency at multiple scales. Pose adaptive texture features are further improved by restricting the frequency bands of these two components. Pose-independent outputs are expected to be low-frequency, while high-frequency information is linked to pose-dependent factors. We implement this with a dual-branch network. The first branch takes coordinates in the canonical space as input, while the second one additionally considers features outputted by the first branch and pose information of each frame. A final network integrates the information predicted by both branches and utilizes volume rendering to generate photo-realistic 3D human images. Through experiments, we demonstrate that our method consistently surpasses all state-of-the-art methods in preserving high-frequency details and ensuring consistent body contours. Our code is accessible at https://github.com/ChunjinSong/facavatar.</p>","PeriodicalId":10687,"journal":{"name":"Computer Graphics Forum","volume":"44 5","pages":""},"PeriodicalIF":2.9000,"publicationDate":"2025-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/cgf.70192","citationCount":"0","resultStr":"{\"title\":\"Representing Animatable Avatar via Factorized Neural Fields\",\"authors\":\"Chunjin Song,&nbsp;Zhijie Wu,&nbsp;Bastian Wandt,&nbsp;Leonid Sigal,&nbsp;Helge Rhodin\",\"doi\":\"10.1111/cgf.70192\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>For reconstructing high-fidelity human 3D models from monocular videos, it is crucial to maintain consistent large-scale body shapes along with finely matched subtle wrinkles. This paper explores how per-frame rendering results can be factorized into a pose-independent component and a corresponding pose-dependent counterpart to facilitate frame consistency at multiple scales. Pose adaptive texture features are further improved by restricting the frequency bands of these two components. Pose-independent outputs are expected to be low-frequency, while high-frequency information is linked to pose-dependent factors. We implement this with a dual-branch network. The first branch takes coordinates in the canonical space as input, while the second one additionally considers features outputted by the first branch and pose information of each frame. A final network integrates the information predicted by both branches and utilizes volume rendering to generate photo-realistic 3D human images. Through experiments, we demonstrate that our method consistently surpasses all state-of-the-art methods in preserving high-frequency details and ensuring consistent body contours. 
Our code is accessible at https://github.com/ChunjinSong/facavatar.</p>\",\"PeriodicalId\":10687,\"journal\":{\"name\":\"Computer Graphics Forum\",\"volume\":\"44 5\",\"pages\":\"\"},\"PeriodicalIF\":2.9000,\"publicationDate\":\"2025-08-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1111/cgf.70192\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer Graphics Forum\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1111/cgf.70192\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Graphics Forum","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/cgf.70192","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Citations: 0

Abstract

For reconstructing high-fidelity human 3D models from monocular videos, it is crucial to maintain consistent large-scale body shapes along with finely matched subtle wrinkles. This paper explores how per-frame rendering results can be factorized into a pose-independent component and a corresponding pose-dependent counterpart to facilitate frame consistency at multiple scales. Pose-adaptive texture features are further improved by restricting the frequency bands of these two components: pose-independent outputs are expected to be low-frequency, while high-frequency information is linked to pose-dependent factors. We implement this with a dual-branch network. The first branch takes coordinates in the canonical space as input, while the second additionally considers features output by the first branch and the pose information of each frame. A final network integrates the information predicted by both branches and utilizes volume rendering to generate photo-realistic 3D human images. Through experiments, we demonstrate that our method consistently surpasses all state-of-the-art methods in preserving high-frequency details and ensuring consistent body contours. Our code is accessible at https://github.com/ChunjinSong/facavatar.
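The abstract outlines the whole pipeline: a pose-independent branch over canonical coordinates restricted to low frequencies, a pose-dependent branch that adds high-frequency detail, and a final head that fuses both for volume rendering. The authors' actual implementation is in the linked repository; the sketch below is only a minimal illustration of that factorization, assuming Fourier positional encodings as the band-limiting mechanism and an SMPL-style 72-D pose vector. All module names, layer sizes, and the `volume_render` helper are hypothetical, not taken from the paper.

```python
import torch
import torch.nn as nn

def positional_encoding(x, num_bands, start_band=0):
    """Fourier features covering frequency bands [start_band, start_band + num_bands)."""
    freqs = 2.0 ** torch.arange(start_band, start_band + num_bands, device=x.device)
    angles = x[..., None] * freqs                      # (..., dim, num_bands)
    enc = torch.cat([angles.sin(), angles.cos()], dim=-1)
    return enc.flatten(start_dim=-2)                   # (..., dim * num_bands * 2)

class FactorizedAvatarField(nn.Module):
    """Illustrative dual-branch field: low-frequency pose-independent base,
    high-frequency pose-dependent detail, fused into RGB + density."""
    def __init__(self, pose_dim=72, feat_dim=64, low_bands=4, high_bands=6, hidden=128):
        super().__init__()
        self.low_bands, self.high_bands = low_bands, high_bands
        # Branch 1: never sees the pose; only low-frequency encodings of
        # canonical coordinates -> smooth, frame-consistent base appearance.
        self.base = nn.Sequential(
            nn.Linear(3 * low_bands * 2, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_dim + 1),           # feature + density
        )
        # Branch 2: high-frequency coordinate encodings, the base feature,
        # and the per-frame pose -> wrinkles and other pose-driven detail.
        self.detail = nn.Sequential(
            nn.Linear(3 * high_bands * 2 + feat_dim + pose_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_dim),
        )
        # Final head fuses both branches into RGB for volume rendering.
        self.rgb = nn.Sequential(
            nn.Linear(feat_dim * 2, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),
        )

    def forward(self, x_canonical, pose):
        low = positional_encoding(x_canonical, self.low_bands)
        base_out = self.base(low)
        base_feat, sigma = base_out[..., :-1], base_out[..., -1:]
        high = positional_encoding(x_canonical, self.high_bands, start_band=self.low_bands)
        detail_feat = self.detail(torch.cat([high, base_feat, pose], dim=-1))
        rgb = self.rgb(torch.cat([base_feat, detail_feat], dim=-1))
        return rgb, torch.relu(sigma)

def volume_render(rgb, sigma, deltas):
    """Standard NeRF-style alpha compositing along each ray."""
    alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * deltas)            # (n_rays, n_samples)
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[:, :1]),
                                     1.0 - alpha + 1e-10], dim=-1), dim=-1)[:, :-1]
    weights = alpha * trans
    return (weights[..., None] * rgb).sum(dim=1)                    # (n_rays, 3)

# Example: render one ray with 16 samples (illustrative shapes only).
model = FactorizedAvatarField()
x = torch.randn(1, 16, 3)           # canonical-space sample points
pose = torch.randn(1, 16, 72)       # per-frame pose, broadcast to samples
rgb, sigma = model(x, pose)
deltas = torch.full((1, 16), 0.01)  # distances between successive samples
pixel = volume_render(rgb, sigma, deltas)  # (1, 3)
```

Note how the factorization is enforced structurally rather than by a loss alone: the base branch never receives the pose, so it can only explain frame-consistent, low-frequency appearance, while high-frequency, pose-conditioned detail must come from the second branch.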

Source journal: Computer Graphics Forum (Engineering & Technology; Computer Science: Software Engineering)
CiteScore: 5.80
Self-citation rate: 12.00%
Articles per year: 175
Review time: 3-6 weeks
About the journal: Computer Graphics Forum is the official journal of Eurographics, published in cooperation with Wiley-Blackwell, and is a unique, international source of information for computer graphics professionals interested in graphics developments worldwide. It is now one of the leading journals for researchers, developers and users of computer graphics in both commercial and academic environments. The journal reports on the latest developments in the field throughout the world and covers all aspects of the theory, practice and application of computer graphics.