RacPixGAN: An Enhanced Sketch-to-Face Synthesis GAN Based on Residual modules, Multi-Head Self-Attention Mechanisms, and CLIP Loss
Yuxin Wang, Yuanyuan Xie, Xiangmin Ji, Ziao Liu, Xiaolong Liu
2023 4th International Conference on Electronic Communication and Artificial Intelligence (ICECAI), published 2023-05-12
DOI: 10.1109/ICECAI58670.2023.10176715
Abstract
In this paper, we present an enhanced model that overcomes the limitations of the traditional Pix2pix GAN (Image-to-Image Translation with Conditional Adversarial Networks) in generation quality for sketch-to-face synthesis. The model integrates residual modules and multi-head self-attention mechanisms. Additionally, to strengthen its generative capability on sketch-to-face synthesis tasks, we introduce a new loss function based on CLIP (Contrastive Language-Image Pretraining), referred to as CLIP Loss. We first provide a comprehensive overview of the key theories and techniques underlying our model. We then evaluate the enhanced model empirically and compare it with the traditional Pix2pix GAN. The experimental results show that the new model significantly outperforms the traditional Pix2pix GAN on sketch-to-face synthesis, supporting the conclusion that adding residual modules and multi-head self-attention mechanisms substantially improves the generator's performance on such tasks. The addition of CLIP Loss is also shown to improve the quality of the generated images.
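As a rough illustration of the components named in the abstract, the following is a minimal PyTorch sketch, not the authors' implementation, of (a) a residual block followed by multi-head self-attention such as might be placed in a Pix2pix-style generator, and (b) a CLIP-space loss comparing generated and ground-truth face photos. All module names, channel sizes, normalization choices, and the cosine-similarity form of the loss are assumptions made for illustration.

```python
# Hypothetical sketch of the described components; details differ from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualSelfAttentionBlock(nn.Module):
    """Residual conv block followed by multi-head self-attention over spatial positions."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.InstanceNorm2d(channels),
        )
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual convolution: y = x + F(x)
        x = x + self.conv(x)
        b, c, h, w = x.shape
        # Flatten the spatial grid into a token sequence for self-attention.
        tokens = x.flatten(2).transpose(1, 2)           # (B, H*W, C)
        attn_out, _ = self.attn(tokens, tokens, tokens)
        tokens = self.norm(tokens + attn_out)           # residual connection + norm
        return tokens.transpose(1, 2).reshape(b, c, h, w)


def clip_image_loss(clip_model, generated: torch.Tensor, target: torch.Tensor,
                    clip_size: int = 224) -> torch.Tensor:
    """CLIP-space similarity loss between generated and ground-truth photos.

    Assumes `clip_model` exposes `encode_image` (as in OpenAI CLIP / open_clip);
    the exact formulation used in the paper may differ.
    """
    gen = F.interpolate(generated, size=clip_size, mode="bilinear", align_corners=False)
    tgt = F.interpolate(target, size=clip_size, mode="bilinear", align_corners=False)
    gen_emb = F.normalize(clip_model.encode_image(gen), dim=-1)
    tgt_emb = F.normalize(clip_model.encode_image(tgt), dim=-1)
    # 1 - cosine similarity, averaged over the batch.
    return (1.0 - (gen_emb * tgt_emb).sum(dim=-1)).mean()
```

In a Pix2pix-style training loop, a term of this kind would typically be added to the conditional adversarial loss and the L1 reconstruction loss with its own weighting coefficient; the weighting used in the paper is not stated in the abstract.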