Representing Animatable Avatar via Factorized Neural Fields
Chunjin Song, Zhijie Wu, Bastian Wandt, Leonid Sigal, Helge Rhodin
Computer Graphics Forum, Vol. 44, No. 5, published 2025-08-28. DOI: 10.1111/cgf.70192 (https://onlinelibrary.wiley.com/doi/10.1111/cgf.70192). Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/cgf.70192
Abstract
When reconstructing high-fidelity 3D human models from monocular videos, it is crucial to maintain consistent large-scale body shapes along with faithfully matched subtle wrinkles. This paper explores how per-frame rendering results can be factorized into a pose-independent component and a corresponding pose-dependent counterpart to facilitate frame consistency at multiple scales. Pose-adaptive texture features are further improved by restricting the frequency bands of these two components: pose-independent outputs are expected to be low-frequency, while high-frequency information is linked to pose-dependent factors. We implement this with a dual-branch network. The first branch takes coordinates in the canonical space as input, while the second additionally considers the features output by the first branch and the pose information of each frame. A final network integrates the information predicted by both branches and uses volume rendering to generate photo-realistic 3D human images. Through experiments, we demonstrate that our method consistently surpasses state-of-the-art methods in preserving high-frequency details and ensuring consistent body contours. Our code is accessible at https://github.com/ChunjinSong/facavatar.
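To make the dual-branch factorization concrete, below is a minimal PyTorch-style sketch of the architecture as described in the abstract. It is not the authors' implementation (see the linked repository for that); the module names, layer widths, frequency counts, and the SMPL-style pose dimension of 72 are illustrative assumptions.

```python
# Sketch of the factorization: a low-frequency pose-independent branch over
# canonical coordinates, a higher-frequency pose-dependent branch conditioned
# on its features plus per-frame pose, and an integration network producing
# density and color for volume rendering. All sizes/names are assumptions.
import torch
import torch.nn as nn


def positional_encoding(x, num_freqs):
    """Sin/cos features of coordinates; num_freqs bounds the frequency band."""
    feats = [x]
    for i in range(num_freqs):
        feats.append(torch.sin((2.0 ** i) * x))
        feats.append(torch.cos((2.0 ** i) * x))
    return torch.cat(feats, dim=-1)


class PoseIndependentBranch(nn.Module):
    """Low-frequency branch: canonical-space coordinates only."""
    def __init__(self, num_freqs=4, hidden=128, feat_dim=64):
        super().__init__()
        self.num_freqs = num_freqs
        in_dim = 3 + 3 * 2 * num_freqs
        self.mlp = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, feat_dim))

    def forward(self, x_canonical):                      # (N, 3)
        return self.mlp(positional_encoding(x_canonical, self.num_freqs))


class PoseDependentBranch(nn.Module):
    """High-frequency branch: first-branch features plus per-frame pose."""
    def __init__(self, feat_dim=64, pose_dim=72, num_freqs=10, hidden=128):
        super().__init__()
        self.num_freqs = num_freqs
        in_dim = 3 + 3 * 2 * num_freqs + feat_dim + pose_dim
        self.mlp = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, feat_dim))

    def forward(self, x_canonical, low_feat, pose):      # pose: (N, pose_dim)
        enc = positional_encoding(x_canonical, self.num_freqs)
        return self.mlp(torch.cat([enc, low_feat, pose], dim=-1))


class IntegrationNet(nn.Module):
    """Fuses both branches into density and color for volume rendering."""
    def __init__(self, feat_dim=64, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * feat_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 4))   # (density, r, g, b)

    def forward(self, low_feat, high_feat):
        out = self.mlp(torch.cat([low_feat, high_feat], dim=-1))
        sigma = torch.relu(out[..., :1])                 # non-negative density
        rgb = torch.sigmoid(out[..., 1:])                # colors in [0, 1]
        return sigma, rgb
```

In a complete pipeline, the predicted density and color would be composited along camera rays with standard volume-rendering weights; the differing positional-encoding frequency counts of the two branches are one simple way to enforce the intended low-/high-frequency split.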
About the journal:
Computer Graphics Forum is the official journal of Eurographics, published in cooperation with Wiley-Blackwell, and is a unique, international source of information for computer graphics professionals interested in graphics developments worldwide. It is now one of the leading journals for researchers, developers and users of computer graphics in both commercial and academic environments. The journal reports on the latest developments in the field throughout the world and covers all aspects of the theory, practice and application of computer graphics.