Dafei Qin, Hongyang Lin, Qixuan Zhang, Kaichun Qiao, Longwen Zhang, Jun Saito, Zijun Zhao, Jingyi Yu, Lan Xu, Taku Komura
Instant Gaussian Splatting Generation for High-Quality and Real-Time Facial Asset Rendering
IEEE Transactions on Pattern Analysis and Machine Intelligence, published 2025-03-14 (Journal Article)
DOI: 10.1109/TPAMI.2025.3550195 (https://doi.org/10.1109/TPAMI.2025.3550195)
Abstract
Traditional and AI-driven modeling techniques enable high-fidelity 3D asset generation from scans, videos, or text prompts. However, editing and rendering these assets often involves a trade-off between quality and speed. In this paper, we propose GauFace, a novel Gaussian Splatting representation tailored for efficient rendering of textured facial meshes. We then introduce TransGS, a diffusion transformer that instantly generates GauFace assets from mesh, textures, and lighting conditions. Specifically, we adopt a patch-based pipeline to handle the vast number of Gaussian points, and a novel texel-aligned sampling scheme with UV positional encoding to enhance the throughput of generating GauFace assets. Once trained, TransGS can generate GauFace assets in 5 seconds, delivering high-fidelity, real-time facial interaction at 30 fps and 1440p on a Snapdragon 8 Gen 2 mobile platform. The rich conditional modalities further enable editing and animation capabilities reminiscent of traditional CG pipelines. We conduct extensive evaluations and user studies, comparing against traditional renderers as well as recent neural rendering methods; these demonstrate the superior performance of our approach for facial asset rendering. We also showcase diverse applications of facial assets using our TransGS approach and GauFace representation across various platforms, including PCs, phones, and VR headsets.
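To make the terms "texel-aligned sampling", "UV positional encoding", and "patch-based pipeline" more concrete, here is a minimal sketch of what such a scheme could look like. It is not the authors' implementation. It assumes one sample per texel center, a standard sinusoidal encoding of the (u, v) coordinates, and non-overlapping square patches. The function names `texel_centers`, `uv_positional_encoding`, and `split_into_patches`, and the parameters `num_freqs` and `patch`, are hypothetical and chosen for illustration only.

```python
import math


def texel_centers(width, height):
    """Return the (u, v) coordinate at the center of every texel, row-major.

    Assumption: one sample per texel, placed at the texel center.
    """
    return [((x + 0.5) / width, (y + 0.5) / height)
            for y in range(height) for x in range(width)]


def uv_positional_encoding(u, v, num_freqs=4):
    """Sinusoidal encoding of a UV coordinate (assumed form, not from the paper).

    For each frequency 2^k * pi, emit sin and cos of u and of v,
    giving 4 * num_freqs features in total.
    """
    feats = []
    for k in range(num_freqs):
        freq = (2.0 ** k) * math.pi
        for coord in (u, v):
            feats.append(math.sin(freq * coord))
            feats.append(math.cos(freq * coord))
    return feats


def split_into_patches(width, height, patch):
    """Group row-major texel indices into non-overlapping patch x patch tiles."""
    patches = []
    for py in range(0, height, patch):
        for px in range(0, width, patch):
            patches.append([y * width + x
                            for y in range(py, min(py + patch, height))
                            for x in range(px, min(px + patch, width))])
    return patches


# Toy usage on a 4x4 UV map split into 2x2 patches.
centers = texel_centers(4, 4)
encoded = [uv_positional_encoding(u, v) for u, v in centers]
patches = split_into_patches(4, 4, 2)
```

In a sketch like this, each patch is a small, fixed-size group of texel-aligned samples that a generator could process independently, which is one plausible way a patch-based design keeps the number of points handled at once manageable.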