SMFS-GAN: Style-Guided Multi-class Freehand Sketch-to-Image Synthesis

IF 2.7 | Zone 4 (Computer Science) | JCR Q2: Computer Science, Software Engineering

Computer Graphics Forum, vol. 43, no. 6 | Pub Date: 2024-08-07 | DOI: 10.1111/cgf.15190

Zhenwei Cheng, Lei Wu, Xiang Li, Xiangxu Meng

Citations: 0

Abstract

Freehand sketch-to-image (S2I) synthesis is a challenging task because freehand sketches have individualized lines and random shapes. The multi-class freehand sketch-to-image synthesis task, in turn, presents new challenges for this research area. It requires not only addressing the problems posed by freehand sketches but also handling multi-class domain differences within a single model. However, existing methods often have difficulty learning the domain differences between multiple classes, and cannot generate controllable and appropriate textures while maintaining shape stability. In this paper, we propose a style-guided multi-class freehand sketch-to-image synthesis model, SMFS-GAN, which can be trained using only unpaired data. To this end, we introduce a contrast-based style encoder that improves the network's perception of domain disparities by explicitly modelling the differences between classes, and thus extracts style information across domains. Further, to improve the fine-grained texture of the generated results and their shape consistency with the freehand sketches, we propose a local texture refinement discriminator and a Shape Constraint Module, respectively. In addition, to address the class imbalance in the QMUL-Sketch dataset, we add 6K manually drawn images and obtain the QMUL-Sketch+ dataset. Extensive experiments on the SketchyCOCO Object, QMUL-Sketch+ and Pseudosketches datasets demonstrate the effectiveness and superiority of our proposed method.
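
The abstract does not give the loss used by the contrast-based style encoder, so the following is only an illustrative sketch of one common way to "explicitly model the differences between classes": a supervised-contrastive-style loss that pulls together style codes of the same class and pushes apart those of different classes. The function name `class_contrastive_loss`, the `temperature` parameter, and the use of NumPy are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def class_contrastive_loss(style_codes, class_ids, temperature=0.1):
    """Hypothetical class-aware contrastive loss over a batch of style codes.

    Assumes at least two samples, non-zero style codes, and at least one
    class that appears twice in the batch. Not the paper's actual loss.
    """
    z = np.asarray(style_codes, dtype=np.float64)
    labels = np.asarray(class_ids)

    # L2-normalise so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    logits = z @ z.T / temperature

    # Exclude each sample's similarity with itself from the softmax.
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    logits = np.where(not_self, logits, -np.inf)

    # Numerically stable log-softmax over the other samples in the batch.
    row_max = logits.max(axis=1, keepdims=True)
    shifted = logits - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Positives are other samples that share the anchor's class label.
    positives = (labels[:, None] == labels[None, :]) & not_self
    has_pos = positives.any(axis=1)

    # Average log-probability of the positives, per anchor that has any.
    pos_sum = np.where(positives, log_prob, 0.0).sum(axis=1)
    mean_pos = pos_sum[has_pos] / positives.sum(axis=1)[has_pos]
    return float(-mean_pos.mean())


# Toy batch: two tight clusters of 2-D style codes.
codes = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = class_contrastive_loss(codes, [0, 0, 1, 1])
loss_mismatched = class_contrastive_loss(codes, [0, 1, 0, 1])
```

In this toy batch the loss is lower when the class labels agree with the clusters than when they cut across them, which is the behaviour a class-discriminative style encoder would be trained towards.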

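Source journal

Computer Graphics Forum (Engineering and Technology: Computer Science, Software Engineering)

CiteScore: 5.80

Self-citation rate: 12.00%

Articles published: 175

Review time: 3-6 weeks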

Journal description: Computer Graphics Forum is the official journal of Eurographics, published in cooperation with Wiley-Blackwell, and is a unique, international source of information for computer graphics professionals interested in graphics developments worldwide. It is now one of the leading journals for researchers, developers and users of computer graphics in both commercial and academic environments. The journal reports on the latest developments in the field throughout the world and covers all aspects of the theory, practice and application of computer graphics.