Double Reference Guided Interactive 2D and 3D Caricature Generation

IF 5.2 3区计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Multimedia Computing Communications and Applications Pub Date : 2024-04-01 DOI:10.1145/3655624

Xin Huang, Dong Liang, Hongrui Cai, Yunfeng Bai, Juyong Zhang, Feng Tian, Jinyuan Jia

{"title":"Double Reference Guided Interactive 2D and 3D Caricature Generation","authors":"Xin Huang, Dong Liang, Hongrui Cai, Yunfeng Bai, Juyong Zhang, Feng Tian, Jinyuan Jia","doi":"10.1145/3655624","DOIUrl":null,"url":null,"abstract":"<p>In this paper, we propose the first geometry and texture (double) referenced interactive 2D and 3D caricature generating and editing method. The main challenge of caricature generation lies in the fact that it not only exaggerates the facial geometry but also refreshes the facial texture. We address this challenge by utilizing the semantic segmentation maps as an intermediary domain, removing the influence of photo texture while preserving the person-specific geometry features. Specifically, our proposed method consists of two main components: 3D-CariNet and CariMaskGAN. 3D-CariNet uses sketches or caricatures to exaggerate the input photo into several types of 3D caricatures. To generate a CariMask, we geometrically exaggerate the photos using the projection of exaggerated 3D landmarks, after which CariMask is converted into a caricature by CariMaskGAN. In this step, users can edit and adjust the geometry of caricatures freely. Moreover, we propose a semantic detail preprocessing approach that considerably increases the details of generated caricatures and allows modification of hair strands, wrinkles, and beards. By rendering high-quality 2D caricatures as textures, we produce 3D caricatures with a variety of texture styles. Extensive experimental results have demonstrated that our method can produce higher-quality caricatures as well as support interactive modification with ease.</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":"79 1","pages":""},"PeriodicalIF":5.2000,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Multimedia Computing Communications and Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3655624","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

In this paper, we propose the first geometry and texture (double) referenced interactive 2D and 3D caricature generating and editing method. The main challenge of caricature generation lies in the fact that it not only exaggerates the facial geometry but also refreshes the facial texture. We address this challenge by utilizing the semantic segmentation maps as an intermediary domain, removing the influence of photo texture while preserving the person-specific geometry features. Specifically, our proposed method consists of two main components: 3D-CariNet and CariMaskGAN. 3D-CariNet uses sketches or caricatures to exaggerate the input photo into several types of 3D caricatures. To generate a CariMask, we geometrically exaggerate the photos using the projection of exaggerated 3D landmarks, after which CariMask is converted into a caricature by CariMaskGAN. In this step, users can edit and adjust the geometry of caricatures freely. Moreover, we propose a semantic detail preprocessing approach that considerably increases the details of generated caricatures and allows modification of hair strands, wrinkles, and beards. By rendering high-quality 2D caricatures as textures, we produce 3D caricatures with a variety of texture styles. Extensive experimental results have demonstrated that our method can produce higher-quality caricatures as well as support interactive modification with ease.

查看原文本刊更多论文

双重参照指导下的交互式二维和三维漫画生成

在本文中，我们首次提出了几何和纹理（双重）参考的交互式二维和三维漫画生成和编辑方法。漫画生成的主要挑战在于，它不仅要夸大面部几何图形，还要刷新面部纹理。我们利用语义分割图作为中间域，消除了照片纹理的影响，同时保留了特定人物的几何特征，从而解决了这一难题。具体来说，我们提出的方法由两个主要部分组成：3D-CariNet 和 CariMaskGAN。3D-CariNet 使用草图或漫画将输入照片夸张成多种类型的 3D 漫画。为了生成 CariMask，我们使用夸张三维地标的投影对照片进行几何夸张，然后通过 CariMaskGAN 将 CariMask 转换为漫画。在这一步骤中，用户可以自由编辑和调整漫画的几何形状。此外，我们还提出了一种语义细节预处理方法，可大大增加生成的漫画的细节，并允许修改发丝、皱纹和胡须。通过将高质量的二维漫画渲染为纹理，我们生成了具有各种纹理风格的三维漫画。广泛的实验结果表明，我们的方法可以制作出更高质量的漫画，并能轻松支持交互式修改。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

ACM Transactions on Multimedia Computing Communications and Applications 工程技术-计算机：理论方法

CiteScore

8.50

自引率

5.90%

发文量

285

审稿时长

7.5 months

期刊介绍： The ACM Transactions on Multimedia Computing, Communications, and Applications is the flagship publication of the ACM Special Interest Group in Multimedia (SIGMM). It is soliciting paper submissions on all aspects of multimedia. Papers on single media (for instance, audio, video, animation) and their processing are also welcome. TOMM is a peer-reviewed, archival journal, available in both print form and digital form. The Journal is published quarterly; with roughly 7 23-page articles in each issue. In addition, all Special Issues are published online-only to ensure a timely publication. The transactions consists primarily of research papers. This is an archival journal and it is intended that the papers will have lasting importance and value over time. In general, papers whose primary focus is on particular multimedia products or the current state of the industry will not be included.