{"title":"Compression-Aware Hybrid Framework for Deep Fake Detection in Low-Quality Video","authors":"Lagsoun Abdel Motalib;Oujaoura Mustapha;Hedabou Mustapha","doi":"10.1109/ACCESS.2025.3592358","DOIUrl":null,"url":null,"abstract":"Deep fakes pose a growing threat to digital media integrity by generating highly realistic fake videos that are difficult to detect, especially under the high compression levels commonly used on social media platforms. These compression artifacts often degrade the performance of deep fake detectors, making reliable detection even more challenging. In this paper, we propose a handcrafted deep fake detection framework that integrates wavelet transforms and Conv3D-based spatiotemporal descriptors for feature extraction, followed by a lightweight ResNet-inspired classifier. Unlike end-to-end deep neural networks, our method emphasizes interpretability and computational efficiency, while maintaining high detection accuracy under diverse real-world conditions. We evaluated four configurations based on input modality and attention mechanisms: RGB with attention, RGB without attention, grayscale with attention, and grayscale without attention. Experiments were conducted on the FaceForensics++ dataset (C23 and C40 compression levels) and Celeb-DF v2 (C0 and C40), across intra- and inter-compression settings, as well as cross-dataset scenarios. Results show that RGB inputs without attention achieve the highest accuracy on FaceForensics++, while grayscale inputs without attention perform best in cross-dataset evaluations on Celeb-DF v2, attaining strong AUC scores. Despite its handcrafted nature, our approach matches or surpasses the existing state-of-the-art (SOTA) methods. Grad-CAM visualizations further reveal both strengths and failures (e.g., occlusion and misalignment), offering valuable insights for refinement. These findings underscore the potential of our framework for efficient and effective deep fake detection in low-resource and real-time environments.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"13 ","pages":"131980-131997"},"PeriodicalIF":3.6000,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11095666","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Access","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11095666/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0
Abstract
Deep fakes pose a growing threat to digital media integrity by generating highly realistic fake videos that are difficult to detect, especially under the high compression levels commonly used on social media platforms. The resulting compression artifacts often degrade the performance of deep fake detectors, making reliable detection even more challenging. In this paper, we propose a handcrafted deep fake detection framework that integrates wavelet transforms and Conv3D-based spatiotemporal descriptors for feature extraction, followed by a lightweight ResNet-inspired classifier. Unlike end-to-end deep neural networks, our method emphasizes interpretability and computational efficiency while maintaining high detection accuracy under diverse real-world conditions. We evaluated four configurations based on input modality and attention mechanism: RGB with attention, RGB without attention, grayscale with attention, and grayscale without attention. Experiments were conducted on the FaceForensics++ dataset (C23 and C40 compression levels) and Celeb-DF v2 (C0 and C40), across intra- and inter-compression settings as well as cross-dataset scenarios. Results show that RGB inputs without attention achieve the highest accuracy on FaceForensics++, while grayscale inputs without attention perform best in cross-dataset evaluations on Celeb-DF v2, attaining strong AUC scores. Despite its handcrafted nature, our approach matches or surpasses existing state-of-the-art (SOTA) methods. Grad-CAM visualizations further reveal both strengths and failure cases (e.g., under occlusion and misalignment), offering valuable insights for refinement. These findings underscore the potential of our framework for efficient and effective deep fake detection in low-resource and real-time environments.
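To make the pipeline described in the abstract concrete, the sketch below shows one plausible way to wire per-frame wavelet sub-band features into a Conv3D spatiotemporal descriptor followed by a lightweight residual classifier head. This is a minimal illustration, not the authors' implementation: the names (wavelet_subbands, WaveletConv3DDetector), the Haar wavelet, the channel widths, and the clip dimensions are all assumptions made for the example; the paper's optional attention branch and Grad-CAM analysis are omitted for brevity.

```python
# Minimal sketch (not the authors' code) of a wavelet + Conv3D + residual
# classifier pipeline. All architectural choices here are illustrative.
import numpy as np
import pywt
import torch
import torch.nn as nn
import torch.nn.functional as F

def wavelet_subbands(frame_gray: np.ndarray) -> np.ndarray:
    """Single-level 2-D DWT of one grayscale frame; stack LL, LH, HL, HH as channels."""
    ll, (lh, hl, hh) = pywt.dwt2(frame_gray, "haar")   # wavelet family is an assumption
    return np.stack([ll, lh, hl, hh], axis=0)          # shape: (4, H/2, W/2)

class ResidualBlock(nn.Module):
    """Lightweight ResNet-style block with an identity shortcut."""
    def __init__(self, ch: int):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, x):
        return F.relu(x + self.conv2(F.relu(self.conv1(x))))

class WaveletConv3DDetector(nn.Module):
    """Conv3D over the time axis of wavelet sub-bands, then a residual head."""
    def __init__(self, n_classes: int = 2):
        super().__init__()
        # input: (B, 4 sub-bands, T frames, H, W)
        self.spatiotemporal = nn.Sequential(
            nn.Conv3d(4, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.res = ResidualBlock(32)
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):
        f = self.spatiotemporal(x)   # (B, 32, T, H, W)
        f = f.mean(dim=2)            # collapse time axis -> (B, 32, H, W)
        f = self.res(f)
        f = f.mean(dim=(2, 3))       # global average pool -> (B, 32)
        return self.head(f)

# Usage on a dummy 8-frame grayscale clip of 128x128 frames:
clip = np.random.rand(8, 128, 128).astype(np.float32)
bands = np.stack([wavelet_subbands(fr) for fr in clip], axis=1)  # (4, 8, 64, 64)
logits = WaveletConv3DDetector()(torch.from_numpy(bands.astype(np.float32)).unsqueeze(0))
print(logits.shape)  # torch.Size([1, 2])
```

Stacking the LL/LH/HL/HH sub-bands as input channels lets the Conv3d layers mix spatial-frequency and temporal cues in a single pass, which is one way a handcrafted frequency front-end can feed a compact learned classifier of the kind the abstract describes.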
IEEE Access · JCR Q2 · Computer Science, Information Systems · Engineering, Electrical & Electronic · Impact Factor: 3.6
CiteScore: 9.80
Self-citation rate: 7.70%
Annual article volume: 6673
Review turnaround: 6 weeks
Journal description:
IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest.
IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE's traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on:
Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE's traditional journals.
Practical articles discussing new experiments or measurement techniques, and interesting solutions to engineering problems.
Development of new or improved fabrication or manufacturing techniques.
Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.