CloCap-GS: Clothed Human Performance Capture With 3D Gaussian Splatting
Kangkan Wang; Chong Wang; Jian Yang; Guofeng Zhang
IEEE Transactions on Image Processing, vol. 34, pp. 5200-5214
Published: 2025-07-30 · DOI: 10.1109/TIP.2025.3592534
https://ieeexplore.ieee.org/document/11104970/
Abstract
Capturing the human body and clothing from videos has seen significant progress in recent years, but several challenges remain. Previous methods either reconstruct 3D bodies and garments from videos of self-rotating human motions, or capture the body and clothing separately using neural implicit fields. However, methods designed for self-rotating motions can produce unstable tracking on dynamic videos with arbitrary human motions, while implicit-field-based methods suffer from inefficient rendering and low-quality synthesis. To address these problems, we propose a new method, called CloCap-GS, for clothed human performance capture with 3D Gaussian Splatting. Specifically, we align 3D Gaussians with the deforming geometries of the body and clothing, and leverage photometric constraints, formed by matching Gaussian renderings against input video frames, to recover temporal deformations of the dense template geometry. The geometry deformations and Gaussian properties of both the body and clothing are optimized jointly, achieving dense geometry tracking and novel-view synthesis simultaneously. In addition, we introduce a physics-aware, material-varying cloth model, pre-trained in a self-supervised manner without requiring prepared training data, to preserve physically plausible cloth dynamics and body-clothing interactions. Compared with existing methods, our method improves the accuracy of dense geometry tracking and the quality of novel-view synthesis for a variety of everyday garment types (e.g., loose clothes). Extensive quantitative and qualitative experiments demonstrate the effectiveness of CloCap-GS on real sparse-view and monocular videos.
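The core optimization loop described above can be illustrated with a toy sketch: a differentiable renderer maps per-Gaussian attributes to pixels, and a photometric loss against the observed frame drives gradient updates. The code below is a minimal, hypothetical stand-in (not the paper's implementation): the "rasterizer" is simplified to a fixed per-pixel blend of Gaussian colors, and only the colors are optimized, whereas CloCap-GS jointly optimizes template deformations and full Gaussian properties with an actual splatting renderer.

```python
import numpy as np

def render(weights, colors):
    # Toy stand-in for a differentiable Gaussian rasterizer:
    # each pixel is a fixed alpha-weighted blend of per-Gaussian colors.
    # weights: (P, G) blending weights per pixel; colors: (G, 3).
    return weights @ colors  # (P, 3) rendered pixel colors

def photometric_step(weights, colors, frame, lr=2.0):
    # One gradient step on the Gaussian colors, minimizing the L2
    # photometric loss between the rendering and the observed frame.
    residual = render(weights, colors) - frame       # (P, 3)
    grad = weights.T @ residual / frame.shape[0]     # (G, 3)
    return colors - lr * grad

rng = np.random.default_rng(0)
G, P = 8, 64  # tiny scene: 8 Gaussians, 64 pixels
weights = rng.random((P, G))
weights /= weights.sum(axis=1, keepdims=True)       # normalized blending

# Synthesize an "observed frame" from ground-truth colors, then
# recover those colors purely from the photometric constraint.
target_colors = rng.random((G, 3))
frame = render(weights, target_colors)

colors = np.zeros((G, 3))
initial_loss = np.mean((render(weights, colors) - frame) ** 2)
for _ in range(500):
    colors = photometric_step(weights, colors, frame)
final_loss = np.mean((render(weights, colors) - frame) ** 2)
print(f"photometric loss: {initial_loss:.4f} -> {final_loss:.6f}")
```

The same principle scales up in the actual method: because the renderer is differentiable, photometric residuals against every video frame back-propagate to the deforming template geometry, not just to appearance.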