GAN-Based Multi-Decomposition Photo Cartoonization

IF 0.9 | Region 4: Computer Science | JCR Q4: COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computer Animation and Virtual Worlds | Pub Date: 2024-05-23 | DOI: 10.1002/cav.2248

Wenqing Zhao, Jianlin Zhu, Jin Huang, Ping Li, Bin Sheng

Volume 35, Issue 3 | https://onlinelibrary.wiley.com/doi/10.1002/cav.2248

Citations: 0


GAN-Based Multi-Decomposition Photo Cartoonization

Background

Cartoon images play a vital role in film production, scientific and educational animation, video games, and other fields, and are one of the key visual forms of artistic creation. Because hand-crafted cartoon images demand a great deal of time and effort from professional artists, methods that automatically transform real-world images into cartoon images of different styles are needed. Although styles vary from artist to artist, cartoon images are generally highly simplified and abstract, with clear edges, smooth color shading, and relatively simple textures. Existing image cartoonization methods, however, tend to introduce two main problems during style transfer: (1) the generated images lack obvious cartoon-style textures; and (2) the generated images are prone to structural confusion, color artifacts, and loss of the original image content. Striking a good balance between style transfer and content preservation therefore remains a major challenge in image cartoonization.

Methods

In this paper, we propose a GAN-based multi-attention mechanism for image cartoonization to address these issues. In the generator, the method combines the residual blocks used to extract deep network features with an attention mechanism; the attention module's adaptive feature correction strengthens the generative model's perception of cartoon images and thereby improves the cartoon features of the generated images. We also introduce the attention mechanism into the convolution blocks of the discriminator to further reduce the visual quality problems caused by the style transfer process. With attention in both the generator and the discriminator of the generative adversarial network, our method produces images with obvious cartoon-style features while effectively improving their visual quality.
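The abstract does not say which attention module is used, so the following is only an illustrative sketch of the general idea of pairing a residual block with attention. It assumes a squeeze-and-excitation style channel attention, which is a swapped-in technique and not necessarily the authors' design. The names `channel_gates` and `attention_residual_block`, the ReLU stand-in for the block's convolutions, and the single-layer gating weights are all hypothetical.

```python
import math
from typing import Callable, List

# A feature map is indexed as [channel][row][col].
FeatureMap = List[List[List[float]]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def relu(x: float) -> float:
    return max(x, 0.0)


def channel_gates(features: FeatureMap, weights: List[List[float]]) -> List[float]:
    """Return one gate in (0, 1) per channel (assumed SE-style attention).

    Squeeze: global average pooling over each channel.
    Excite: one hypothetical linear layer followed by a sigmoid.
    """
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]
    return [
        sigmoid(sum(w * p for w, p in zip(weight_row, pooled)))
        for weight_row in weights
    ]


def attention_residual_block(
    x: FeatureMap,
    weights: List[List[float]],
    transform: Callable[[float], float] = relu,
) -> FeatureMap:
    """Compute y = x + gate_c * F(x) for every channel c.

    F is an elementwise stand-in for the block's convolutions; a real
    generator would use learned convolution layers here.
    """
    fx = [[[transform(v) for v in row] for row in channel] for channel in x]
    gates = channel_gates(fx, weights)
    return [
        [
            [xv + gate * fv for xv, fv in zip(x_row, f_row)]
            for x_row, f_row in zip(x_channel, f_channel)
        ]
        for x_channel, f_channel, gate in zip(x, fx, gates)
    ]


# Two 2x2 channels: one positive, one negative (zeroed by the ReLU stand-in).
x = [[[1.0, 1.0], [1.0, 1.0]], [[-2.0, -2.0], [-2.0, -2.0]]]
identity_weights = [[1.0, 0.0], [0.0, 1.0]]
y = attention_residual_block(x, identity_weights)
```

In this toy input, the second channel passes through unchanged because its transformed features are all zero, while the first channel receives a gated residual update. The same gating idea could in principle be applied to the output of a discriminator convolution block, but the abstract gives no detail on how that is done.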

Results

Extensive quantitative, qualitative, and ablation experiments demonstrate the advantages of our method in image cartoonization and the role of each of its modules.


Source journal

Computer Animation and Virtual Worlds (Engineering and Technology; Computer Science: Software Engineering)

CiteScore

2.20

Self-citation rate

0.00%

Articles published

Review time

6-12 weeks

About the journal: With the advent of very powerful PCs and high-end graphics cards, there has been incredible development in Virtual Worlds, real-time computer animation and simulation, and games. At the same time, new and cheaper Virtual Reality devices have appeared, allowing interaction with these real-time Virtual Worlds and even with real worlds through Augmented Reality. Three-dimensional characters, especially Virtual Humans, are now of exceptional quality, which allows them to be used in the movie industry. But this is only a beginning: with the development of Artificial Intelligence and Agent technology, these characters will become more and more autonomous and even intelligent. They will inhabit the Virtual Worlds in a Virtual Life together with animals and plants.