Dafei Qin, Hongyang Lin, Qixuan Zhang, Kaichun Qiao, Longwen Zhang, Jun Saito, Zijun Zhao, Jingyi Yu, Lan Xu, Taku Komura
{"title":"Instant Gaussian Splatting Generation for High-Quality and Real-Time Facial Asset Rendering.","authors":"Dafei Qin, Hongyang Lin, Qixuan Zhang, Kaichun Qiao, Longwen Zhang, Jun Saito, Zijun Zhao, Jingyi Yu, Lan Xu, Taku Komura","doi":"10.1109/TPAMI.2025.3550195","DOIUrl":null,"url":null,"abstract":"<p><p>Traditional and AI-driven modeling techniques enable high-fidelity 3D asset generation from scans, videos, or text prompts. However, editing and rendering these assets often involves a trade-off between quality and speed. In this paper, we propose GauFace, a novel Gaussian Splatting representation, tailored for efficient rendering of facial mesh with textures. Then, we introduce TransGS, a diffusion transformer that instantly generates the GauFace assets from mesh, textures and lightning conditions. Specifically, we adopt a patch-based pipeline to handle the vast number of Gaussian Points, a novel texel-aligned sampling scheme with UV positional encoding to enhance the throughput of generating GauFace assets. Once trained, TransGS can generate GauFace assets in 5 seconds, delivering high fidelity and real-time facial interaction of 30fps@1440p to a Snapdragon 8 Gen 2 mobile platform. The rich conditional modalities further enable editing and animation capabilities reminiscent of traditional CG pipelines. We conduct extensive evaluations and user studies, compared to traditional renderers, as well as recent neural rendering methods. They demonstrate the superior performance of our approach for facial asset rendering. We also showcase diverse applications of facial assets using our TransGS approach and GauFace representation, across various platforms like PCs, phones, and VR headsets.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TPAMI.2025.3550195","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Traditional and AI-driven modeling techniques enable high-fidelity 3D asset generation from scans, videos, or text prompts. However, editing and rendering these assets often involves a trade-off between quality and speed. In this paper, we propose GauFace, a novel Gaussian Splatting representation, tailored for efficient rendering of facial mesh with textures. Then, we introduce TransGS, a diffusion transformer that instantly generates the GauFace assets from mesh, textures and lightning conditions. Specifically, we adopt a patch-based pipeline to handle the vast number of Gaussian Points, a novel texel-aligned sampling scheme with UV positional encoding to enhance the throughput of generating GauFace assets. Once trained, TransGS can generate GauFace assets in 5 seconds, delivering high fidelity and real-time facial interaction of 30fps@1440p to a Snapdragon 8 Gen 2 mobile platform. The rich conditional modalities further enable editing and animation capabilities reminiscent of traditional CG pipelines. We conduct extensive evaluations and user studies, compared to traditional renderers, as well as recent neural rendering methods. They demonstrate the superior performance of our approach for facial asset rendering. We also showcase diverse applications of facial assets using our TransGS approach and GauFace representation, across various platforms like PCs, phones, and VR headsets.