{"title":"Deepfakes Examiner: An End-to-End Deep Learning Model for Deepfakes Videos Detection","authors":"Hafsa Ilyas, Aun Irtaza, A. Javed, K. Malik","doi":"10.1109/ICOSST57195.2022.10016871","DOIUrl":null,"url":null,"abstract":"Deepfakes generation approaches have made it possible even for less technical users to generate fake videos using only the source and target images. Thus, the threats associated with deepfake video generation such as impersonating public figures, defamation, and spreading disinformation on media platforms have increased exponentially. The significant improvement in the deepfakes generation techniques necessitates the development of effective deepfakes detection methods to counter disinformation threats. Existing techniques do not provide reliable deepfakes detection particularly when the videos are generated using different deepfakes generation techniques and contain variations in illumination conditions and diverse ethnicities. Therefore, this paper proposes a novel hybrid deep learning framework, InceptionResNet-BiLSTM, that is robust to different ethnicities and varied illumination conditions, and able to detect deepfake videos generated using different techniques. The proposed InceptionResNet-BiLSTM consists of two components: customized InceptionResNetV2 and Bidirectional Long-Short Term Memory (BiLSTM). In our proposed framework, faces extracted from the videos are fed to our customized InceptionResNetV2 for extracting frame-level learnable features. The sequences of features are then used to train a temporally aware BiLSTM to classify between the real and fake video. We evaluated our proposed approach on the diverse, standard, and largescale FaceForensics++ (FF++) dataset containing videos manipulated using different techniques (i.e., DeepFakes, FaceSwap, Face2Face, FaceShifter, and NeuralTextures) and the FakeA VCeleb dataset. Our method achieved an accuracy greater than 90% on DeepFakes, FaceSwap, and Face2Face subsets. Performance and generalizability evaluation highlights the effectiveness of our method for detecting deepfake videos generated through different techniques on diverse FF++ and FakeA VCeleb datasets.","PeriodicalId":238082,"journal":{"name":"2022 16th International Conference on Open Source Systems and Technologies (ICOSST)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 16th International Conference on Open Source Systems and Technologies (ICOSST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICOSST57195.2022.10016871","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Deepfakes generation approaches have made it possible even for less technical users to generate fake videos using only the source and target images. Thus, the threats associated with deepfake video generation such as impersonating public figures, defamation, and spreading disinformation on media platforms have increased exponentially. The significant improvement in the deepfakes generation techniques necessitates the development of effective deepfakes detection methods to counter disinformation threats. Existing techniques do not provide reliable deepfakes detection particularly when the videos are generated using different deepfakes generation techniques and contain variations in illumination conditions and diverse ethnicities. Therefore, this paper proposes a novel hybrid deep learning framework, InceptionResNet-BiLSTM, that is robust to different ethnicities and varied illumination conditions, and able to detect deepfake videos generated using different techniques. The proposed InceptionResNet-BiLSTM consists of two components: customized InceptionResNetV2 and Bidirectional Long-Short Term Memory (BiLSTM). In our proposed framework, faces extracted from the videos are fed to our customized InceptionResNetV2 for extracting frame-level learnable features. The sequences of features are then used to train a temporally aware BiLSTM to classify between the real and fake video. We evaluated our proposed approach on the diverse, standard, and largescale FaceForensics++ (FF++) dataset containing videos manipulated using different techniques (i.e., DeepFakes, FaceSwap, Face2Face, FaceShifter, and NeuralTextures) and the FakeA VCeleb dataset. Our method achieved an accuracy greater than 90% on DeepFakes, FaceSwap, and Face2Face subsets. Performance and generalizability evaluation highlights the effectiveness of our method for detecting deepfake videos generated through different techniques on diverse FF++ and FakeA VCeleb datasets.