ClipFace: Text-guided Editing of Textured 3D Morphable Models

Shivangi Aneja, Justus Thies, Angela Dai, M. Nießner
{"title":"ClipFace:文本引导的纹理3D变形模型编辑","authors":"Shivangi Aneja, Justus Thies, Angela Dai, M. Nießner","doi":"10.1145/3588432.3591566","DOIUrl":null,"url":null,"abstract":"We propose ClipFace, a novel self-supervised approach for text-guided editing of textured 3D morphable model of faces. Specifically, we employ user-friendly language prompts to enable control of the expressions as well as appearance of 3D faces. We leverage the geometric expressiveness of 3D morphable models, which inherently possess limited controllability and texture expressivity, and develop a self-supervised generative model to jointly synthesize expressive, textured, and articulated faces in 3D. We enable high-quality texture generation for 3D faces by adversarial self-supervised training, guided by differentiable rendering against collections of real RGB images. Controllable editing and manipulation are given by language prompts to adapt texture and expression of the 3D morphable model. To this end, we propose a neural network that predicts both texture and expression latent codes of the morphable model. Our model is trained in a self-supervised fashion by exploiting differentiable rendering and losses based on a pre-trained CLIP model. Once trained, our model jointly predicts face textures in UV-space, along with expression parameters to capture both geometry and texture changes in facial expressions in a single forward pass. We further show the applicability of our method to generate temporally changing textures for a given animation sequence.","PeriodicalId":280036,"journal":{"name":"ACM SIGGRAPH 2023 Conference Proceedings","volume":"44 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":"{\"title\":\"ClipFace: Text-guided Editing of Textured 3D Morphable Models\",\"authors\":\"Shivangi Aneja, Justus Thies, Angela Dai, M. Nießner\",\"doi\":\"10.1145/3588432.3591566\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We propose ClipFace, a novel self-supervised approach for text-guided editing of textured 3D morphable model of faces. Specifically, we employ user-friendly language prompts to enable control of the expressions as well as appearance of 3D faces. We leverage the geometric expressiveness of 3D morphable models, which inherently possess limited controllability and texture expressivity, and develop a self-supervised generative model to jointly synthesize expressive, textured, and articulated faces in 3D. We enable high-quality texture generation for 3D faces by adversarial self-supervised training, guided by differentiable rendering against collections of real RGB images. Controllable editing and manipulation are given by language prompts to adapt texture and expression of the 3D morphable model. To this end, we propose a neural network that predicts both texture and expression latent codes of the morphable model. Our model is trained in a self-supervised fashion by exploiting differentiable rendering and losses based on a pre-trained CLIP model. Once trained, our model jointly predicts face textures in UV-space, along with expression parameters to capture both geometry and texture changes in facial expressions in a single forward pass. 
We further show the applicability of our method to generate temporally changing textures for a given animation sequence.\",\"PeriodicalId\":280036,\"journal\":{\"name\":\"ACM SIGGRAPH 2023 Conference Proceedings\",\"volume\":\"44 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"22\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM SIGGRAPH 2023 Conference Proceedings\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3588432.3591566\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM SIGGRAPH 2023 Conference Proceedings","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3588432.3591566","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 22

Abstract

We propose ClipFace, a novel self-supervised approach for text-guided editing of textured 3D morphable model of faces. Specifically, we employ user-friendly language prompts to enable control of the expressions as well as appearance of 3D faces. We leverage the geometric expressiveness of 3D morphable models, which inherently possess limited controllability and texture expressivity, and develop a self-supervised generative model to jointly synthesize expressive, textured, and articulated faces in 3D. We enable high-quality texture generation for 3D faces by adversarial self-supervised training, guided by differentiable rendering against collections of real RGB images. Controllable editing and manipulation are given by language prompts to adapt texture and expression of the 3D morphable model. To this end, we propose a neural network that predicts both texture and expression latent codes of the morphable model. Our model is trained in a self-supervised fashion by exploiting differentiable rendering and losses based on a pre-trained CLIP model. Once trained, our model jointly predicts face textures in UV-space, along with expression parameters to capture both geometry and texture changes in facial expressions in a single forward pass. We further show the applicability of our method to generate temporally changing textures for a given animation sequence.
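To make the training signal described in the abstract concrete, below is a minimal PyTorch sketch of CLIP-guided mapper training: a frozen, pre-trained CLIP model scores a differentiable rendering of the textured, articulated face against a text prompt, and the gradient updates small mapper networks that predict texture and expression latent codes. This is not the authors' released code; the mapper shapes are hypothetical, and `render_face` is a dummy stand-in for the paper's texture generator and differentiable renderer. Only the CLIP loss wiring follows the abstract.

```python
# Minimal sketch of CLIP-guided mapper training (not the authors' code).
# Assumes: pip install torch git+https://github.com/openai/CLIP.git
import torch
import torch.nn.functional as F
import clip  # OpenAI CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model.float()  # avoid fp16/fp32 mismatch on CUDA
for p in clip_model.parameters():
    p.requires_grad_(False)  # CLIP stays frozen; only the mappers train

# Hypothetical mappers: latent code -> texture offset / expression params.
texture_mapper = torch.nn.Linear(512, 512).to(device)
expression_mapper = torch.nn.Linear(512, 100).to(device)
opt = torch.optim.Adam(
    list(texture_mapper.parameters()) + list(expression_mapper.parameters()),
    lr=1e-4,
)

def render_face(texture_latent, expr_params):
    """Stub for the texture generator + differentiable mesh renderer.
    The real pipeline rasterizes a UV-textured, articulated 3D face; here
    we only return a differentiable dummy image at CLIP's input size."""
    scale = torch.sigmoid(texture_latent.mean() + expr_params.mean())
    return scale * torch.rand(1, 3, 224, 224, device=device)

prompt = clip.tokenize(["a smiling face wearing red lipstick"]).to(device)
with torch.no_grad():
    text_emb = F.normalize(clip_model.encode_text(prompt), dim=-1)

for step in range(200):
    w = torch.randn(1, 512, device=device)  # texture generator latent
    tex_latent = w + texture_mapper(w)      # edited texture code
    expr = expression_mapper(w)             # predicted expression parameters
    image = render_face(tex_latent, expr)
    img_emb = F.normalize(clip_model.encode_image(image), dim=-1)
    loss = 1.0 - (img_emb * text_emb).sum()  # CLIP cosine loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Note that this sketch covers only the CLIP-driven editing term; per the abstract, the full method also trains the texture generation adversarially against collections of real RGB images.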