Spectrum Translation for Refinement of Image Generation (STIG) Based on Contrastive Learning and Spectral Filter Profile

ArXiv Pub Date : 2024-03-08 DOI:10.1609/aaai.v38i4.28074

Seokjun Lee, Seung-Won Jung, Hyunseok Seo


Citations: 1

Abstract

Image generation and synthesis have progressed remarkably with generative models. Despite photo-realistic results, intrinsic discrepancies are still observed in the frequency domain. This spectral discrepancy appears not only in generative adversarial networks (GANs) but also in diffusion models. In this study, we propose a framework that effectively mitigates the frequency-domain disparity of generated images to improve the generative performance of both GANs and diffusion models. This is realized by spectrum translation for the refinement of image generation (STIG) based on contrastive learning. We adopt the theoretical logic of frequency components in various generative networks. The key idea is to refine the spectrum of the generated image via image-to-image translation and contrastive learning, viewed in terms of digital signal processing. We evaluate our framework across eight fake image datasets and various cutting-edge models to demonstrate the effectiveness of STIG. Our framework outperforms other cutting-edge methods, showing significant decreases in FID and in the log frequency distance of the spectrum. We further emphasize that STIG improves image quality by decreasing the spectral anomaly. Additionally, validation results show that a frequency-based deepfake detector is more easily confused when fake spectra are manipulated by STIG.
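The abstract refers to discrepancies in the frequency domain and to a "log frequency distance" but does not define either. As a rough illustration only, the sketch below computes a centered log-magnitude spectrum with NumPy and a distance of the form log(mean |F_real - F_fake|^2 + 1). That formula is one commonly used definition and is an assumption here, not taken from this page. The function names, the random test image, and the checkerboard artifact are all hypothetical and are not part of STIG itself.

```python
import numpy as np


def log_magnitude_spectrum(image):
    """Centered log-magnitude spectrum of a 2D grayscale image."""
    # fftshift moves the zero-frequency component to the center of the array.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    # log1p compresses the dynamic range so high frequencies stay visible.
    return np.log1p(np.abs(spectrum))


def log_frequency_distance(real, fake):
    """Assumed definition: log(mean |F_real - F_fake|^2 + 1)."""
    f_real = np.fft.fft2(real)
    f_fake = np.fft.fft2(fake)
    mean_sq_diff = np.mean(np.abs(f_real - f_fake) ** 2)
    return float(np.log(mean_sq_diff + 1.0))


# Hypothetical data: a random "real" image and a "generated" copy that
# carries a faint checkerboard pattern, a typical high-frequency artifact.
rng = np.random.default_rng(0)
real = rng.random((64, 64))
yy, xx = np.indices(real.shape)
fake = real + 0.05 * ((-1.0) ** (yy + xx))

print("distance real vs real:", log_frequency_distance(real, real))
print("distance real vs fake:", log_frequency_distance(real, fake))
```

In this toy setup the checkerboard concentrates its energy at the highest spatial frequency, so the two log-magnitude spectra differ mainly in a single bin. A refinement method of the kind the abstract describes would aim to reduce such a gap.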

