One Detector to Rule Them All: Towards a General Deepfake Attack Detection Framework

Shahroz Tariq, Sangyup Lee, Simon S. Woo
{"title":"One Detector to Rule Them All: Towards a General Deepfake Attack Detection Framework","authors":"Shahroz Tariq, Sangyup Lee, Simon S. Woo","doi":"10.1145/3442381.3449809","DOIUrl":null,"url":null,"abstract":"Deep learning-based video manipulation methods have become widely accessible to the masses. With little to no effort, people can quickly learn how to generate deepfake (DF) videos. While deep learning-based detection methods have been proposed to identify specific types of DFs, their performance suffers for other types of deepfake methods, including real-world deepfakes, on which they are not sufficiently trained. In other words, most of the proposed deep learning-based detection methods lack transferability and generalizability. Beyond detecting a single type of DF from benchmark deepfake datasets, we focus on developing a generalized approach to detect multiple types of DFs, including deepfakes from unknown generation methods such as DeepFake-in-the-Wild (DFW) videos. To better cope with unknown and unseen deepfakes, we introduce a Convolutional LSTM-based Residual Network (CLRNet), which adopts a unique model training strategy and explores spatial as well as the temporal information in a deepfakes. Through extensive experiments, we show that existing defense methods are not ready for real-world deployment. Whereas our defense method (CLRNet) achieves far better generalization when detecting various benchmark deepfake methods (97.57% on average). Furthermore, we evaluate our approach with a high-quality DeepFake-in-the-Wild dataset, collected from the Internet containing numerous videos and having more than 150,000 frames. Our CLRNet model demonstrated that it generalizes well against high-quality DFW videos by achieving 93.86% detection accuracy, outperforming existing state-of-the-art defense methods by a considerable margin.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"218 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"51","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Web Conference 2021","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3442381.3449809","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 51

Abstract

Deep learning-based video manipulation methods have become widely accessible to the masses. With little to no effort, people can quickly learn how to generate deepfake (DF) videos. While deep learning-based detection methods have been proposed to identify specific types of DFs, their performance suffers on other types of deepfake methods, including real-world deepfakes, on which they are not sufficiently trained. In other words, most of the proposed deep learning-based detection methods lack transferability and generalizability. Beyond detecting a single type of DF from benchmark deepfake datasets, we focus on developing a generalized approach to detect multiple types of DFs, including deepfakes from unknown generation methods such as DeepFake-in-the-Wild (DFW) videos. To better cope with unknown and unseen deepfakes, we introduce a Convolutional LSTM-based Residual Network (CLRNet), which adopts a unique model training strategy and exploits both the spatial and the temporal information in deepfakes. Through extensive experiments, we show that existing defense methods are not ready for real-world deployment, whereas our defense method (CLRNet) achieves far better generalization when detecting various benchmark deepfake methods (97.57% on average). Furthermore, we evaluate our approach on a high-quality DeepFake-in-the-Wild dataset collected from the Internet, containing numerous videos with more than 150,000 frames in total. Our CLRNet model generalizes well to high-quality DFW videos, achieving 93.86% detection accuracy and outperforming existing state-of-the-art defense methods by a considerable margin.
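
The abstract describes CLRNet only at a high level: a residual network built from Convolutional LSTM layers that models spatial and temporal artifacts jointly across consecutive frames. The sketch below illustrates that general idea using Keras' ConvLSTM2D layer; the layer widths, block count, input clip length, and the sigmoid classification head are illustrative assumptions, not the authors' exact architecture or training strategy.

```python
# Minimal sketch of a ConvLSTM-based residual classifier for short frame
# sequences, loosely following the CLRNet idea from the abstract.
# All hyperparameters here are assumptions for illustration only.
from tensorflow.keras import layers, Model

def convlstm_residual_block(x, filters):
    """Two stacked ConvLSTM2D layers with a skip (residual) connection."""
    shortcut = x
    y = layers.ConvLSTM2D(filters, kernel_size=3, padding="same",
                          return_sequences=True)(x)
    y = layers.BatchNormalization()(y)
    y = layers.ConvLSTM2D(filters, kernel_size=3, padding="same",
                          return_sequences=True)(y)
    # Project the shortcut with a 1x1 conv (applied per frame) if the
    # channel counts differ, so the two branches can be added.
    if shortcut.shape[-1] != filters:
        shortcut = layers.TimeDistributed(
            layers.Conv2D(filters, kernel_size=1, padding="same"))(shortcut)
    return layers.Activation("relu")(layers.Add()([y, shortcut]))

def build_clrnet_sketch(seq_len=5, size=64, channels=3):
    """Input: a short clip of consecutive face crops; output: P(fake)."""
    inp = layers.Input(shape=(seq_len, size, size, channels))
    x = convlstm_residual_block(inp, 32)
    x = layers.TimeDistributed(layers.MaxPooling2D())(x)
    x = convlstm_residual_block(x, 64)
    x = layers.GlobalAveragePooling3D()(x)   # pool over time and space
    out = layers.Dense(1, activation="sigmoid")(x)
    return Model(inp, out)

model = build_clrnet_sketch()
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

Feeding a short clip of consecutive frames, rather than single frames, is what allows the ConvLSTM layers to pick up temporal inconsistencies (e.g., frame-to-frame flicker) that purely frame-level CNN detectors tend to miss.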