GAN-ViT-CMFD: A novel framework integrating generative adversarial networks and vision transformers for enhanced copy-move forgery detection and classification with spectral clustering

Intelligent Systems with Applications. Pub Date: 2025-04-17. DOI: 10.1016/j.iswa.2025.200524

Jyothsna Ravula, Nilu Singh

Volume 26, Article 200524. Source: https://www.sciencedirect.com/science/article/pii/S266730532500050X

Citations: 0

Abstract

Copy-move forgery detection (CMFD) is a critical task in digital forensics: it helps ensure the authenticity of visual content at a time when advanced editing tools have made image tampering increasingly easy. Such forgeries can have severe implications in fields like journalism, legal evidence, and cybersecurity. The hybrid Generative Adversarial Network (GAN)-Vision Transformer (ViT) approach is motivated by the need for robust models that can handle complex forgery patterns while maintaining high detection accuracy. This study proposes a hybrid framework, GAN-ViT-CMFD, that integrates GANs and ViTs to address these challenges. GANs generate realistic forged images, creating an augmented dataset that enhances model robustness. ViTs extract feature representations that capture long-range dependencies and intricate patterns in image data. Spectral clustering is then applied to the feature space to separate forged and original image features, which are subsequently fed into a Convolutional Neural Network (CNN)-based classifier for forgery detection and classification.
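The spectral clustering step described above can be illustrated with a minimal sketch. The abstract does not give the affinity function, the number of clusters, or any hyperparameters, so everything here is an assumption made for illustration: toy 2-D vectors stand in for ViT features, the affinity is an RBF kernel with a hypothetical `gamma`, and the data are split into two groups by the sign of the second eigenvector of the normalized Laplacian, found by power iteration. It is not the authors' implementation.

```python
import math

def rbf_affinity(points, gamma=1.0):
    """Dense RBF (Gaussian) affinity matrix with a zero diagonal."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w[i][j] = w[j][i] = math.exp(-gamma * d2)
    return w

def spectral_bipartition(points, gamma=1.0, iters=500):
    """Split points into two clusters (labels 0/1) via the normalized Laplacian.

    Assumes every point has non-zero total affinity to the others.
    """
    n = len(points)
    w = rbf_affinity(points, gamma)
    deg = [sum(row) for row in w]
    inv_sqrt = [1.0 / math.sqrt(d) for d in deg]
    # S = D^{-1/2} W D^{-1/2}; the normalized Laplacian is L_sym = I - S.
    s = [[inv_sqrt[i] * w[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]
    # Trivial eigenvector of L_sym (eigenvalue 0): D^{1/2} 1, normalized.
    total = math.sqrt(sum(deg))
    u = [math.sqrt(d) / total for d in deg]
    # Power iteration on I + S (= 2I - L_sym), projecting out u each step,
    # converges to the eigenvector of L_sym with the second-smallest eigenvalue.
    v = [float(i + 1) for i in range(n)]  # fixed start vector (illustrative choice)
    for _ in range(iters):
        proj = sum(a * b for a, b in zip(u, v))
        v = [a - proj * b for a, b in zip(v, u)]
        v = [v[i] + sum(s[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    # The sign pattern of this vector gives the two-way partition.
    return [1 if x > 0 else 0 for x in v]

# Toy stand-ins for feature vectors: two well-separated groups.
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
            (3.0, 3.1), (3.2, 2.9), (2.9, 3.0)]
labels = spectral_bipartition(features)
print(labels)
```

Which group receives label 0 or 1 is arbitrary; only the partition is meaningful. In a pipeline like the one described, the resulting cluster assignments would accompany the features passed to the downstream classifier.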

The proposed model achieves a training accuracy of 99.62% and a validation accuracy of 99.0%, with training and validation losses of 0.0352 and 0.0269, respectively. On the evaluation metrics it reaches an accuracy of 99.02%, a precision of 97.92%, a recall of 99.89%, an F1-score of 98.95%, and a ROC-AUC of 99.88%. These results underscore the effectiveness of the GAN-ViT method for CMFD and its potential to strengthen image authenticity verification across domains.
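As a reading aid for the reported metrics, the F1-score is the harmonic mean of precision and recall. The short sketch below applies that standard definition to the precision and recall quoted in the abstract; the function name is illustrative, and the averaging scheme used in the paper is not stated in the abstract.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (standard binary F1)."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract, as fractions.
precision, recall = 0.9792, 0.9989
f1 = f1_from_precision_recall(precision, recall)
print(f"F1 = {f1:.4f}")
```

Under this definition the result is close to 98.9%, slightly below the reported 98.95%; the small gap may reflect a different averaging scheme (for example per-class averaging), which the abstract does not specify.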

