Fazeel Zafar;Talha Ahmed Khan;Salas Akbar;Muhammad Talha Ubaid;Sameena Javaid;Kushsairy Abdul Kadir
{"title":"A Hybrid Deep Learning Framework for Deepfake Detection Using Temporal and Spatial Features","authors":"Fazeel Zafar;Talha Ahmed Khan;Salas Akbar;Muhammad Talha Ubaid;Sameena Javaid;Kushsairy Abdul Kadir","doi":"10.1109/ACCESS.2025.3566008","DOIUrl":null,"url":null,"abstract":"The rise of deep-fake technology has sparked concerns as it blurs the distinction between fake media by harnessing Generative Adversarial Networks (GANs). This has raised issues surrounding privacy and security in the realm. This has led to a decrease in trust during online interactions; thus, emphasizing the importance of creating reliable methods for detection purposes. Our research introduces a model for detecting deepfakes by utilizing an Enhanced EfficientNet B0 structure in conjunction with Temporal Convolutional Neural Networks (TempCNNs). This approach aims to tackle the challenges presented by the evolving sophistication of deep-fake techniques. The system dissects video inputs into frames to extract features comprehensively by using Multi Test Convolutional Networks (MTCNN). This method ensures face detection and alignment by focusing on facial regions. To enhance the model’s adaptability, to different scenarios and datasets we implement data augmentation techniques such as CutMix, MixUp and Random Erasing. These strategies help the model maintain its strength, against distortions found in deepfake content. The backbone of EfficientNet B0 utilizes Mobile Inverted Bottleneck Convolutions (MBConv) and Squeeze and Excitation (SE) blocks to enhance feature extraction by adjusting channels to highlight details effectively. A Feature Pyramid Network (FPN) facilitates the fusion of scale features capturing intricate details as well, as broader context. When tested on the FFIW 10 K dataset, which comprises 10,000 videos evenly split between manipulated content, the model attained a training accuracy of 91.5 % and a testing accuracy of 92.45 %, after 40 epochs. 
The findings showcase the model’s proficiency, in identifying videos with precision and tackling the issue of class imbalances found in datasets – a valuable contribution, to advancing dependable deepfake detection solutions. Furthermore, the model achieves an impressive balance between accuracy and computational efficiency, attaining 92.45% testing accuracy with a lightweight computational cost of 0.45 GFLOPs, making it a highly practical choice for real-world deployment.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"13 ","pages":"79560-79570"},"PeriodicalIF":3.4000,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10981422","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Access","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10981422/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0
Abstract
The rise of deepfake technology has sparked concern, as media synthesized with Generative Adversarial Networks (GANs) blurs the distinction between real and fake content. This has raised privacy and security issues and eroded trust in online interactions, underscoring the need for reliable detection methods. Our research introduces a deepfake detection model that couples an Enhanced EfficientNet-B0 backbone with Temporal Convolutional Neural Networks (TempCNNs), aiming to tackle the challenges posed by the evolving sophistication of deepfake techniques. The system splits video inputs into frames and uses Multi-task Cascaded Convolutional Networks (MTCNN) for face detection and alignment, focusing feature extraction on facial regions. To enhance the model's adaptability to different scenarios and datasets, we apply data augmentation techniques such as CutMix, MixUp, and Random Erasing; these strategies help the model remain robust to the distortions found in deepfake content. The EfficientNet-B0 backbone utilizes Mobile Inverted Bottleneck Convolutions (MBConv) and Squeeze-and-Excitation (SE) blocks, which reweight channels to highlight discriminative details. A Feature Pyramid Network (FPN) fuses multi-scale features, capturing fine details as well as broader context. When tested on the FFIW-10K dataset, which comprises 10,000 videos evenly split between real and manipulated content, the model attained a training accuracy of 91.5% and a testing accuracy of 92.45% after 40 epochs. The findings showcase the model's proficiency in identifying manipulated videos with precision and in tackling the class imbalance found in datasets, a valuable contribution to advancing dependable deepfake detection solutions.
Furthermore, the model achieves an impressive balance between accuracy and computational efficiency, attaining 92.45% testing accuracy with a lightweight computational cost of 0.45 GFLOPs, making it a highly practical choice for real-world deployment.
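To make one of the augmentations named in the abstract concrete, MixUp forms a convex combination of two training examples and their labels, with the mixing coefficient drawn from a Beta distribution. The sketch below is an illustrative simplification in plain Python (the function name, the flat pixel-list representation, and the `alpha=0.2` default are assumptions for illustration, not the authors' implementation):

```python
import random

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Blend two samples (flat pixel lists) and their one-hot labels.

    Returns the mixed sample, mixed label, and the mixing coefficient.
    """
    # Sample lambda ~ Beta(alpha, alpha); random.betavariate is the
    # stdlib equivalent of the Beta sampler used in the MixUp paper.
    lam = random.betavariate(alpha, alpha)
    # Convex combination of the inputs and of their labels.
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam
```

Because labels are mixed along with pixels, the model is trained on soft targets, which tends to smooth decision boundaries and, as the abstract notes, improves robustness to the distortions typical of deepfake content.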
IEEE Access · Computer Science, Information Systems · Engineering, Electrical & Electronic
CiteScore: 9.80
Self-citation rate: 7.70%
Articles published: 6673
Review time: 6 weeks

Journal description:
IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest.
IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE's traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on:
Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE's traditional journals.
Practical articles discussing new experiments or measurement techniques, and interesting solutions to engineering problems.
Development of new or improved fabrication or manufacturing techniques.
Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.