GAN-ViT-CMFD: A novel framework integrating generative adversarial networks and vision transformers for enhanced copy-move forgery detection and classification with spectral clustering
{"title":"GAN-ViT-CMFD: A novel framework integrating generative adversarial networks and vision transformers for enhanced copy-move forgery detection and classification with spectral clustering","authors":"Jyothsna Ravula, Nilu Singh","doi":"10.1016/j.iswa.2025.200524","DOIUrl":null,"url":null,"abstract":"<div><div>Copy-move forgery detection (CMFD) is a critical task in digital forensics to ensure the authenticity of visual content, as the prevalence of advanced editing tools has made it increasingly easy to tamper with images. Such forgeries can have severe implications in fields like journalism, legal evidence, and cybersecurity. The motivation for adopting a hybrid Generative Adversarial Network (GAN)-Vision Transformer (ViT) approach arises from the need for robust models capable of handling the complexities of forgery patterns while ensuring high detection accuracy. This study proposes a hybrid framework, GAN-ViT-CMFD, integrating GANs and ViTs to address these challenges. GANs are employed to generate realistic forged images, creating an augmented dataset that enhances model robustness. ViTs extract powerful feature representations, leveraging their competence to capture long-range dependencies and intricate patterns in image data. Spectral clustering is then applied to the feature space to segregate forged and original image features, which are subsequently fed into a Convolutional Neural Network (CNN)-based classifier for forgery detection and classification.</div><div>The proposed model demonstrates superior performance, achieving a training accuracy of 99.62 % and a validation accuracy of 99.0 %, with training and validation losses of 0.0352 and 0.0269, respectively. Evaluation metrics further affirm its effectiveness, with an accuracy of 99.02 %, precision of 97.92 %, recall of 99.89 %, and F1-score of 98.95 %. Additionally, the model achieves an exceptional ROC-AUC score of 99.88 %. 
These outcomes emphasize the ability of the GAN-ViT method in CMFD, highlighting its potential impact in reinforcing the reliability of image authenticity verification across various domains.</div></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"26 ","pages":"Article 200524"},"PeriodicalIF":0.0000,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligent Systems with Applications","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S266730532500050X","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Copy-move forgery detection (CMFD) is a critical task in digital forensics for ensuring the authenticity of visual content, because widely available editing tools have made image tampering increasingly easy. Such forgeries can have severe consequences in journalism, legal evidence, and cybersecurity. A hybrid Generative Adversarial Network (GAN)-Vision Transformer (ViT) approach is motivated by the need for robust models that can handle complex forgery patterns while maintaining high detection accuracy. This study proposes such a hybrid framework, GAN-ViT-CMFD. GANs are used to generate realistic forged images, producing an augmented dataset that improves model robustness. ViTs extract feature representations, exploiting their ability to capture long-range dependencies and intricate patterns in image data. Spectral clustering is then applied to the feature space to separate forged from original image features, which are subsequently fed into a Convolutional Neural Network (CNN)-based classifier for forgery detection and classification.
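The spectral clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the 2-D points are hypothetical stand-ins for ViT feature vectors, the Gaussian (RBF) affinity and the `sigma` value are assumptions, and the helper names `rbf_affinity` and `spectral_bipartition` are invented for this example. It splits the points into two groups using the second eigenvector of the normalized affinity matrix, found by power iteration, and uses only the Python standard library.

```python
import math
import random


def rbf_affinity(points, sigma=1.0):
    """Gaussian (RBF) affinity matrix with a zero diagonal."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w[i][j] = w[j][i] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return w


def spectral_bipartition(points, sigma=1.0, iters=200, seed=0):
    """Two-way split from the second eigenvector of S = D^{-1/2} W D^{-1/2}.

    Assumes every point has a non-zero affinity to at least one other point.
    """
    n = len(points)
    w = rbf_affinity(points, sigma)
    deg = [sum(row) for row in w]
    inv_sqrt = [1.0 / math.sqrt(d) for d in deg]
    # Shift by the identity so all eigenvalues of M = S + I are non-negative;
    # power iteration then converges to the algebraically largest ones.
    m = [[w[i][j] * inv_sqrt[i] * inv_sqrt[j] + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # The top eigenvector of S is known in closed form: sqrt(degree), normalized.
    norm1 = math.sqrt(sum(deg))
    v1 = [math.sqrt(d) / norm1 for d in deg]

    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        # Deflate the known top eigenvector, apply M, then renormalize.
        dot = sum(a * b for a, b in zip(v, v1))
        v = [a - dot * b for a, b in zip(v, v1)]
        v = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    # Rescale by D^{-1/2} and threshold at zero to obtain the two groups.
    return [1 if v[i] * inv_sqrt[i] > 0 else 0 for i in range(n)]


# Hypothetical 2-D "features": three points near the origin, three near (5, 5).
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
            (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
labels = spectral_bipartition(features)
print(labels)
```

Which group receives label 0 or 1 is arbitrary, since an eigenvector is only defined up to sign. For real high-dimensional features, a library routine such as scikit-learn's `SpectralClustering` would be the more practical choice.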
The proposed model achieves a training accuracy of 99.62 % and a validation accuracy of 99.0 %, with training and validation losses of 0.0352 and 0.0269, respectively. Evaluation metrics further confirm its effectiveness, with an accuracy of 99.02 %, precision of 97.92 %, recall of 99.89 %, F1-score of 98.95 %, and ROC-AUC score of 99.88 %. These results demonstrate the effectiveness of the GAN-ViT method for CMFD and its potential to strengthen the reliability of image authenticity verification across domains.
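For readers less familiar with the reported metrics, the sketch below shows how accuracy, precision, recall, and F1-score follow from the four confusion-matrix counts of a binary forged/original classifier. The counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the function name `classification_metrics` is an assumption of this example.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp: forged images correctly flagged      fp: originals wrongly flagged
    fn: forged images missed                 tn: originals correctly passed
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # fraction of flagged images that are forged
    recall = tp / (tp + fn)      # fraction of forged images that are flagged
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for a 200-image test set (not the paper's data).
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

High recall combined with somewhat lower precision, as in the figures reported above, means that few forgeries are missed at the cost of a small number of originals being flagged.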