{"title":"DFCP:通过对比预训练的少镜头深度假检测","authors":"Bojing Zou, Chao Yang, Jiazhi Guan, Chengbin Quan, Youjian Zhao","doi":"10.1109/ICME55011.2023.00393","DOIUrl":null,"url":null,"abstract":"Abuses of forgery techniques have created a considerable problem of misinformation on social media. Although scholars devote many efforts to face forgery detection (a.k.a DeepFake detection) and achieve some results, two issues still hinder the practical application. 1) Most detectors do not generalize well to unseen datasets. 2) In a supervised manner, most previous works require a considerable amount of manually labeled data. To address these problems, we propose a simple contrastive pertaining framework for DeepFake detection (DFCP), which works in a finetuning-after-pretraining manner, and requires only a few labels (5%). Specifically, we design a two-stream framework to simultaneously learn high-frequency texture features and high-level semantics information during pretraining. In addition, a video-based frame sampling strategy is proposed to mitigate potential noise data in the instance-discriminative contrastive learning to achieve better performance. 
Experimental results on several downstream datasets show the state-of-the-art performance of the proposed DFCP, which works at frame-level (w/o temporal reasoning) with high efficiency but outperforms video-level methods.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DFCP: Few-Shot DeepFake Detection via Contrastive Pretraining\",\"authors\":\"Bojing Zou, Chao Yang, Jiazhi Guan, Chengbin Quan, Youjian Zhao\",\"doi\":\"10.1109/ICME55011.2023.00393\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abuses of forgery techniques have created a considerable problem of misinformation on social media. Although scholars devote many efforts to face forgery detection (a.k.a DeepFake detection) and achieve some results, two issues still hinder the practical application. 1) Most detectors do not generalize well to unseen datasets. 2) In a supervised manner, most previous works require a considerable amount of manually labeled data. To address these problems, we propose a simple contrastive pertaining framework for DeepFake detection (DFCP), which works in a finetuning-after-pretraining manner, and requires only a few labels (5%). Specifically, we design a two-stream framework to simultaneously learn high-frequency texture features and high-level semantics information during pretraining. In addition, a video-based frame sampling strategy is proposed to mitigate potential noise data in the instance-discriminative contrastive learning to achieve better performance. 
Experimental results on several downstream datasets show the state-of-the-art performance of the proposed DFCP, which works at frame-level (w/o temporal reasoning) with high efficiency but outperforms video-level methods.\",\"PeriodicalId\":321830,\"journal\":{\"name\":\"2023 IEEE International Conference on Multimedia and Expo (ICME)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE International Conference on Multimedia and Expo (ICME)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICME55011.2023.00393\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International Conference on Multimedia and Expo (ICME)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICME55011.2023.00393","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
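The abstract's video-based frame sampling strategy is motivated by a known failure mode of instance discrimination: two frames of the same video are nearly identical, so treating them as negatives injects noise. One plausible reading, sketched here with a hypothetical `sample_batch` helper, is to draw at most one frame per video per batch so every negative pair truly comes from different videos.

```python
import random

def sample_batch(video_frames, batch_size, seed=None):
    """Sample one frame from each of `batch_size` distinct videos.

    video_frames: dict mapping video id -> list of frame identifiers.
    Returns a list of (video_id, frame) pairs with no repeated video,
    so instance-discriminative learning never pushes apart two frames
    of the same video as false negatives."""
    rng = random.Random(seed)
    videos = rng.sample(sorted(video_frames), min(batch_size, len(video_frames)))
    return [(vid, rng.choice(video_frames[vid])) for vid in videos]
```

Sampling videos first and frames second also balances the batch across videos regardless of how many frames each video contributes to the dataset.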
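The abstract names instance-discriminative contrastive learning but gives no loss formula. The standard objective for this setting is InfoNCE, shown below as a pure-Python toy for a single anchor; the temperature `tau=0.1` is an illustrative default, not a value taken from the paper.

```python
import math

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE loss for one anchor embedding:
    -log( exp(sim(a,p)/tau) / (exp(sim(a,p)/tau) + sum_n exp(sim(a,n)/tau)) )
    where sim is cosine similarity. Lower when the positive is close to
    the anchor and the negatives are far from it."""
    def cos(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        nu = math.sqrt(sum(x * x for x in u))
        nv = math.sqrt(sum(x * x for x in v))
        return dot / (nu * nv)

    logits = [cos(anchor, positive) / tau] + [cos(anchor, n) / tau for n in negatives]
    m = max(logits)  # log-sum-exp shift for numerical stability
    return m + math.log(sum(math.exp(l - m) for l in logits)) - logits[0]
```

In the DFCP setting each anchor/positive pair would be two augmented views of the same frame, while negatives are frames from other videos in the batch.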