UV Gaussians: Joint learning of mesh deformation and Gaussian textures for human avatar modeling
Yujiao Jiang, Qingmin Liao, Xiaoyu Li, Li Ma, Qi Zhang, Chaopeng Zhang, Zongqing Lu, Ying Shan
Knowledge-Based Systems, Volume 320, Article 113470. Published 2025-05-08. DOI: 10.1016/j.knosys.2025.113470
Impact factor: 7.2 (JCR Q1, Computer Science, Artificial Intelligence)
https://www.sciencedirect.com/science/article/pii/S0950705125005179
Citations: 0
Abstract
Reconstructing photo-realistic, drivable human avatars from multi-view image sequences is a popular and challenging topic in computer vision and graphics. Existing NeRF-based methods achieve high-quality novel-view rendering of human models, but both training and inference are time-consuming. Recent approaches represent the human body with 3D Gaussians, which enables faster training and rendering. However, they underestimate the importance of mesh guidance and predict Gaussians directly in 3D space from only a coarse mesh, which hinders the learning of the Gaussians and tends to produce blurry textures. This paper therefore proposes UV Gaussians, which models the 3D human body by jointly learning mesh deformations and 2D UV-space Gaussian textures. The method uses an embedding of the UV map to learn Gaussian textures in 2D space, leveraging powerful 2D networks for feature extraction. In addition, an independent Mesh network optimizes pose-dependent geometric deformations, which guide Gaussian rendering and significantly improve rendering quality. A new human motion dataset has been collected and processed; it includes multi-view images, scanned models, parametric model registrations, and corresponding texture maps. Experimental results demonstrate that the proposed method achieves state-of-the-art novel-view and novel-pose synthesis. The code and data will be made available as open source.
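The idea of anchoring Gaussians to texels of a UV map on a deformable mesh can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the brute-force triangle search, and the `offsets` array (a stand-in for a network-predicted Gaussian texture) are all assumptions made for illustration, and UV triangles are assumed to be non-degenerate. Each texel centre is located in a UV triangle, its barycentric weights are computed in 2D, and the same weights interpolate the (possibly deformed) 3D vertices to give a surface point to which the per-texel offset is added.

```python
import numpy as np


def barycentric_2d(p, a, b, c):
    """Barycentric coordinates of 2D point p in triangle (a, b, c)."""
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01  # zero only for degenerate triangles
    w1 = (d11 * d20 - d01 * d21) / denom
    w2 = (d00 * d21 - d01 * d20) / denom
    return np.array([1.0 - w1 - w2, w1, w2])


def texel_to_3d(uv, verts, faces, vert_uvs, eps=1e-9):
    """Lift one UV-space point onto the mesh surface.

    Returns the interpolated 3D point, or None if the point lies outside
    every UV triangle (an unused region of the texture).
    """
    for face in faces:  # brute-force search; fine for a toy mesh
        w = barycentric_2d(uv, *vert_uvs[face])
        if np.all(w >= -eps):
            return w @ verts[face]
    return None


def gaussian_centres(verts, faces, vert_uvs, offsets):
    """One Gaussian centre per valid texel: surface point + per-texel offset.

    offsets has shape (res, res, 3) and is a hypothetical stand-in for a
    learned UV-space texture of position offsets.
    """
    res = offsets.shape[0]
    centres = []
    for i in range(res):
        for j in range(res):
            uv = np.array([(j + 0.5) / res, (i + 0.5) / res])
            p = texel_to_3d(uv, verts, faces, vert_uvs)
            if p is not None:
                centres.append(p + offsets[i, j])
    return np.array(centres)
```

Because the barycentric weights depend only on the fixed UV layout, deforming `verts` moves every Gaussian centre with the surface while the texture itself stays in 2D, which is what allows 2D networks to operate on it.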
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.