UV Gaussians: Joint learning of mesh deformation and Gaussian textures for human avatar modeling

IF 7.2 | Zone 1, Computer Science | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Yujiao Jiang, Qingmin Liao, Xiaoyu Li, Li Ma, Qi Zhang, Chaopeng Zhang, Zongqing Lu, Ying Shan
Knowledge-Based Systems, vol. 320, Article 113470. Published 2025-05-08. DOI: 10.1016/j.knosys.2025.113470
https://www.sciencedirect.com/science/article/pii/S0950705125005179
Citations: 0

Abstract

Reconstructing photo-realistic, drivable human avatars from multi-view image sequences is a popular and challenging topic in computer vision and graphics. Existing NeRF-based methods achieve high-quality novel-view rendering of human models, but both training and inference are time-consuming. Recent approaches represent the human body with 3D Gaussians, enabling faster training and rendering. However, they underestimate the importance of mesh guidance and predict Gaussians directly in 3D space with only a coarse mesh as guidance, which hinders the learning of the Gaussians and tends to produce blurry textures. This paper therefore proposes UV Gaussians, which models the 3D human body by jointly learning mesh deformations and 2D UV-space Gaussian textures. The method uses an embedding of the UV map to learn Gaussian textures in 2D space, leveraging powerful 2D networks for feature extraction. In addition, an independent Mesh network optimizes pose-dependent geometric deformations, which guide the Gaussian rendering and significantly improve rendering quality. A new human motion dataset has been collected and processed; it includes multi-view images, scanned models, parametric model registrations, and corresponding texture maps. Experimental results demonstrate that the proposed method achieves state-of-the-art novel-view and novel-pose synthesis. The code and data will be made available as open source.
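
To make the idea of mesh-guided Gaussians stored in UV space more concrete, here is a minimal sketch. It is not the authors' implementation, and the abstract does not say how UV texels are lifted to 3D; barycentric interpolation over a single triangle is an assumption made for illustration. The names `barycentric`, `texel_gaussian_centers`, and `offsets` are hypothetical. Each texel whose center falls inside the triangle's UV footprint receives one Gaussian center: the interpolated mesh position plus a per-texel offset, standing in for one channel of a learned Gaussian texture.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


def barycentric(p: Vec2, a: Vec2, b: Vec2, c: Vec2) -> Optional[Vec3]:
    """Barycentric weights of p in UV triangle (a, b, c); None if p is outside."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if abs(det) < 1e-12:
        return None  # degenerate UV triangle
    w0 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w1 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    w2 = 1.0 - w0 - w1
    if min(w0, w1, w2) < -1e-9:
        return None  # texel center lies outside the triangle
    return (w0, w1, w2)


def texel_gaussian_centers(
    uv_tri: Sequence[Vec2],
    xyz_tri: Sequence[Vec3],
    offsets: List[List[Vec3]],
    resolution: int,
) -> Dict[Tuple[int, int], Vec3]:
    """Map each covered texel (row, col) to a 3D Gaussian center.

    uv_tri:  UV coordinates of the triangle's three vertices.
    xyz_tri: (possibly pose-deformed) 3D positions of the same vertices.
    offsets: hypothetical per-texel 3D offsets, indexed as offsets[row][col].
    """
    centers: Dict[Tuple[int, int], Vec3] = {}
    for row in range(resolution):
        for col in range(resolution):
            # Texel center in [0, 1]^2 UV space.
            uv = ((col + 0.5) / resolution, (row + 0.5) / resolution)
            w = barycentric(uv, uv_tri[0], uv_tri[1], uv_tri[2])
            if w is None:
                continue
            # Point on the mesh surface that this texel corresponds to.
            base = [sum(w[k] * xyz_tri[k][i] for k in range(3)) for i in range(3)]
            off = offsets[row][col]
            centers[(row, col)] = (
                base[0] + off[0],
                base[1] + off[1],
                base[2] + off[2],
            )
    return centers
```

In this sketch, changing `xyz_tri` (for example, after a pose-dependent deformation of the mesh) moves every Gaussian center with the surface while the UV-space texture stays fixed, which is one plausible reading of how the mesh can guide the Gaussians.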


Source journal
Knowledge-Based Systems (Engineering and Technology; Computer Science: Artificial Intelligence)
CiteScore: 14.80
Self-citation rate: 12.50%
Articles published: 1245
Review time: 7.8 months
About the journal: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, to provide balanced coverage of theory and practical study, and to encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.