Mastering Deepfake Detection: A Cutting-Edge Approach to Distinguish GAN and Diffusion-Model Images

IF 6.0, CAS Zone 3 (Computer Science), JCR Q1 (Computer Science, Information Systems)

ACM Transactions on Multimedia Computing Communications and Applications, Pub Date: 2024-03-09, DOI: 10.1145/3652027

Luca Guarnera, Oliver Giudice, Sebastiano Battiato


Citations: 0

Abstract

Detecting and recognizing deepfakes is a pressing issue in the digital age. In this study, we first collected a dataset of pristine images and fake ones generated by nine different Generative Adversarial Network (GAN) architectures and four Diffusion Models (DMs). The dataset contained a total of 83,000 images, equally distributed between real and deepfake data. Then, to address different deepfake detection and recognition tasks, we proposed a hierarchical multi-level approach. At the first level, we distinguished real images from AI-generated ones. At the second level, we distinguished between images generated by GANs and those generated by DMs. At the third level (composed of two additional sub-levels), we recognized the specific GAN and DM architectures used to generate the synthetic data. Experimental results demonstrated that our approach achieved more than 97% classification accuracy, outperforming existing state-of-the-art methods. The models obtained at the different levels proved robust to various attacks, such as JPEG compression (with different quality factor values) and resizing, among others. This demonstrates that the framework can be applied in real-world contexts, such as the analysis of multimedia data shared on social platforms, and can support forensic investigations aimed at countering the illicit use of these powerful, modern generative models. We are able to identify the specific GAN and DM architecture used to generate an image, which is critical for tracking down the source of a deepfake. Our hierarchical multi-level approach to deepfake detection and recognition shows promising results in identifying deepfakes: by letting each level focus on its underlying task, it improves on standard flat multiclass detection systems by about 2% on average. The proposed method has the potential to enhance the performance of deepfake detection systems, aid in the fight against the spread of fake images, and safeguard the authenticity of digital media.
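The three-level routing described in the abstract can be illustrated with a minimal Python sketch. It shows only the control flow of a hierarchical cascade: the abstract does not specify the underlying models, so the four classifiers are treated as opaque callables, and every name here (`HierarchicalDetector`, `make_toy_detector`, the labels `"real"`, `"fake"`, `"gan"`, `"dm"`, and architecture labels such as `"gan_arch_3"`) is a hypothetical placeholder rather than the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Assumption: a classifier is any callable that maps an image to a string label.
Classifier = Callable[[object], str]


@dataclass
class HierarchicalDetector:
    """Illustrative cascade; the classifiers themselves are supplied by the caller."""
    real_vs_fake: Classifier      # level 1: "real" or "fake"
    gan_vs_dm: Classifier         # level 2: "gan" or "dm"
    gan_architecture: Classifier  # level 3, GAN sub-level: architecture label
    dm_architecture: Classifier   # level 3, DM sub-level: architecture label

    def predict(self, image: object) -> Dict[str, Optional[str]]:
        result: Dict[str, Optional[str]] = {
            "level1": self.real_vs_fake(image),
            "level2": None,
            "level3": None,
        }
        # Real images stop at level 1; only fakes are routed further down.
        if result["level1"] == "real":
            return result
        result["level2"] = self.gan_vs_dm(image)
        # Each level-3 sub-classifier only ever sees images from its own family.
        if result["level2"] == "gan":
            result["level3"] = self.gan_architecture(image)
        else:
            result["level3"] = self.dm_architecture(image)
        return result


def make_toy_detector() -> HierarchicalDetector:
    """Build a detector from oracle stubs that read labels stored in a dict.

    Stand-ins for trained models, used only to exercise the routing logic.
    """
    return HierarchicalDetector(
        real_vs_fake=lambda img: "real" if img["source"] == "real" else "fake",
        gan_vs_dm=lambda img: img["source"],
        gan_architecture=lambda img: img["arch"],
        dm_architecture=lambda img: img["arch"],
    )


toy = make_toy_detector()
print(toy.predict({"source": "real"}))
print(toy.predict({"source": "gan", "arch": "gan_arch_3"}))
print(toy.predict({"source": "dm", "arch": "dm_arch_2"}))
```

One plausible reading of the abstract's comparison with flat multiclass systems is that a flat system would need a single classifier to separate real images and all thirteen generator classes at once, whereas in a cascade like this each classifier handles a narrower decision.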




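Source journal

ACM Transactions on Multimedia Computing Communications and Applications (Engineering and Technology: Computer Science, Theory and Methods)

CiteScore: 8.50

Self-citation rate: 5.90%

Articles published: 285

Review time: 7.5 months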

About the journal: The ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) is the flagship publication of the ACM Special Interest Group in Multimedia (SIGMM). It solicits paper submissions on all aspects of multimedia. Papers on single media (for instance, audio, video, animation) and their processing are also welcome. TOMM is a peer-reviewed, archival journal, available in both print and digital form. The journal is published quarterly, with roughly seven 23-page articles in each issue. In addition, all Special Issues are published online-only to ensure timely publication. The Transactions consists primarily of research papers. As an archival journal, it is intended that its papers will have lasting importance and value over time. In general, papers whose primary focus is on particular multimedia products or the current state of the industry will not be included.