Deepfake detection using optimized VGG16-based framework enhanced with LIME for secure digital content

IF 4.2 · Zone 3 (Computer Science) · JCR Q2: Computer Science, Artificial Intelligence

Image and Vision Computing Pub Date : 2025-08-28 DOI:10.1016/j.imavis.2025.105696

Asma Aldrees, Nihal Abuzinadah, Muhammad Umer, Dina Abdulaziz AlHammadi, Shtwai Alsubai, Raed Alharthi

Volume 162, Article 105696 · Journal Article · https://www.sciencedirect.com/science/article/pii/S0262885625002847

Citations: 0

Abstract

Deepfake detection using optimized VGG16-based framework enhanced with LIME for secure digital content

The rapid evolution of technologies to manipulate facial images, namely Generative Adversarial Networks (GANs) and those based on Stable Diffusion, has increased the need for effective deepfake detection mechanisms to mitigate their misuse. In this paper, the critical challenge of detecting deepfake images is addressed through a new deep learning-based approach that uses the VGG16 model after applying all necessary preprocessing steps. The VGG16 architecture was chosen for its deep structure and strong ability to capture intricate facial patterns when classifying facial images as real or manipulated. A robust preprocessing pipeline — including normalization, augmentation, facial alignment, and noise reduction — was implemented to optimize input data, improving the detection of subtle manipulations. Additionally, Explainable AI (XAI) techniques, such as the Local Interpretable Model-agnostic Explanations (LIME) framework, were integrated to provide transparent, visual explanations of the model’s predictions, enhancing interpretability and user trust. To further assess generalizability, the evaluation was extended beyond the initial dataset by incorporating three additional benchmark datasets: FaceForensics++, Celeb-DF (v2), and the DFDC Preview Set. These datasets contain a range of manipulation techniques, allowing for comprehensive testing of the model’s robustness across different scenarios. The proposed method outperformed baselines with exceptional performance metrics (accuracy, precision, recall, and F1-score up to 0.99), and maintained strong results across different datasets. These findings demonstrate that combining XAI approaches with a VGG16 model and thorough preprocessing effectively counters advanced deepfake generation techniques, such as StyleGAN2. This research contributes to a safer digital landscape by improving the detection and understanding of manipulated content, providing a practical way to confront the growing threat of deepfakes.
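The four metrics named in the abstract (accuracy, precision, recall, F1-score) can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function name `binary_metrics`, the convention that label 1 means "manipulated", and the toy labels below are all assumptions made for illustration, and the resulting numbers have nothing to do with the reported results.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute binary classification metrics.

    Assumption: label 1 = manipulated (positive class), label 0 = real.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fakes caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # real images passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # real flagged as fake
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fakes missed

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the paper.
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    preds = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(truth, preds))
```

Reporting all four together matters for deepfake detection because accuracy alone can look strong on an imbalanced test set even when many manipulated images are missed; recall exposes that failure directly.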

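Source journal

Image and Vision Computing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 8.50
Self-citation rate: 8.50%
Articles published: 143
Review time: 7.8 months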

About the journal: Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.