MonoClothCap:从单目RGB视频中实现时间连贯的服装捕获

2020 International Conference on 3D Vision (3DV) Pub Date : 2020-09-22 DOI:10.1109/3DV50981.2020.00042

Donglai Xiang, F. Prada, Chenglei Wu, J. Hodgins

{"title":"MonoClothCap:从单目RGB视频中实现时间连贯的服装捕获","authors":"Donglai Xiang, F. Prada, Chenglei Wu, J. Hodgins","doi":"10.1109/3DV50981.2020.00042","DOIUrl":null,"url":null,"abstract":"We present a method to capture temporally coherent dynamic clothing deformation from a monocular RGB video input. In contrast to the existing literature, our method does not require a pre-scanned personalized mesh template, and thus can be applied to in-the-wild videos. To constrain the output to a valid deformation space, we build statistical deformation models for three types of clothing: T- shirt, short pants and long pants. A differentiable renderer is utilized to align our captured shapes to the input frames by minimizing the difference in both silhouette, segmentation, and texture. We develop a UV texture growing method which expands the visible texture region of the clothing sequentially in order to minimize drift in deformation tracking. We also extract fine-grained wrinkle detail from the input videos by fitting the clothed surface to the normal maps estimated by a convolutional neural network. Our method produces temporally coherent reconstruction of body and clothing from monocular video. We demonstrate successful clothing capture results from a variety of challenging videos. Extensive quantitative experiments demonstrate the effectiveness of our method on metrics including body pose error and surface reconstruction error of the clothing.","PeriodicalId":293399,"journal":{"name":"2020 International Conference on 3D Vision (3DV)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"49","resultStr":"{\"title\":\"MonoClothCap: Towards Temporally Coherent Clothing Capture from Monocular RGB Video\",\"authors\":\"Donglai Xiang, F. Prada, Chenglei Wu, J. Hodgins\",\"doi\":\"10.1109/3DV50981.2020.00042\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present a method to capture temporally coherent dynamic clothing deformation from a monocular RGB video input. In contrast to the existing literature, our method does not require a pre-scanned personalized mesh template, and thus can be applied to in-the-wild videos. To constrain the output to a valid deformation space, we build statistical deformation models for three types of clothing: T- shirt, short pants and long pants. A differentiable renderer is utilized to align our captured shapes to the input frames by minimizing the difference in both silhouette, segmentation, and texture. We develop a UV texture growing method which expands the visible texture region of the clothing sequentially in order to minimize drift in deformation tracking. We also extract fine-grained wrinkle detail from the input videos by fitting the clothed surface to the normal maps estimated by a convolutional neural network. Our method produces temporally coherent reconstruction of body and clothing from monocular video. We demonstrate successful clothing capture results from a variety of challenging videos. Extensive quantitative experiments demonstrate the effectiveness of our method on metrics including body pose error and surface reconstruction error of the clothing.\",\"PeriodicalId\":293399,\"journal\":{\"name\":\"2020 International Conference on 3D Vision (3DV)\",\"volume\":\"30 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-09-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"49\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 International Conference on 3D Vision (3DV)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/3DV50981.2020.00042\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on 3D Vision (3DV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/3DV50981.2020.00042","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 49

摘要

我们提出了一种从单目RGB视频输入中捕获时间相干动态服装变形的方法。与现有文献相比，我们的方法不需要预扫描的个性化网格模板，因此可以应用于野外视频。为了将输出约束到一个有效的变形空间，我们建立了三种服装类型的统计变形模型:T恤、短裤和长裤。一个可微分的渲染器被用来通过最小化轮廓、分割和纹理的差异来将我们捕获的形状与输入帧对齐。为了减少变形跟踪中的漂移，提出了一种UV纹理生长方法，该方法对服装的可见纹理区域进行逐次扩展。我们还通过将衣服表面拟合到卷积神经网络估计的法线映射中，从输入视频中提取细粒度的皱纹细节。我们的方法从单目视频中产生时间连贯的身体和衣服重建。我们展示了成功的服装捕获结果从各种具有挑战性的视频。大量的定量实验证明了该方法在人体姿态误差和服装表面重构误差等指标上的有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

MonoClothCap: Towards Temporally Coherent Clothing Capture from Monocular RGB Video

We present a method to capture temporally coherent dynamic clothing deformation from a monocular RGB video input. In contrast to the existing literature, our method does not require a pre-scanned personalized mesh template, and thus can be applied to in-the-wild videos. To constrain the output to a valid deformation space, we build statistical deformation models for three types of clothing: T- shirt, short pants and long pants. A differentiable renderer is utilized to align our captured shapes to the input frames by minimizing the difference in both silhouette, segmentation, and texture. We develop a UV texture growing method which expands the visible texture region of the clothing sequentially in order to minimize drift in deformation tracking. We also extract fine-grained wrinkle detail from the input videos by fitting the clothed surface to the normal maps estimated by a convolutional neural network. Our method produces temporally coherent reconstruction of body and clothing from monocular video. We demonstrate successful clothing capture results from a variety of challenging videos. Extensive quantitative experiments demonstrate the effectiveness of our method on metrics including body pose error and surface reconstruction error of the clothing.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2020 International Conference on 3D Vision (3DV)

自引率

0.00%

发文量