{"title":"DFCP:通过对比预训练的少镜头深度假检测","authors":"Bojing Zou, Chao Yang, Jiazhi Guan, Chengbin Quan, Youjian Zhao","doi":"10.1109/ICME55011.2023.00393","DOIUrl":null,"url":null,"abstract":"Abuses of forgery techniques have created a considerable problem of misinformation on social media. Although scholars devote many efforts to face forgery detection (a.k.a DeepFake detection) and achieve some results, two issues still hinder the practical application. 1) Most detectors do not generalize well to unseen datasets. 2) In a supervised manner, most previous works require a considerable amount of manually labeled data. To address these problems, we propose a simple contrastive pertaining framework for DeepFake detection (DFCP), which works in a finetuning-after-pretraining manner, and requires only a few labels (5%). Specifically, we design a two-stream framework to simultaneously learn high-frequency texture features and high-level semantics information during pretraining. In addition, a video-based frame sampling strategy is proposed to mitigate potential noise data in the instance-discriminative contrastive learning to achieve better performance. 
Experimental results on several downstream datasets show the state-of-the-art performance of the proposed DFCP, which works at frame-level (w/o temporal reasoning) with high efficiency but outperforms video-level methods.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DFCP: Few-Shot DeepFake Detection via Contrastive Pretraining\",\"authors\":\"Bojing Zou, Chao Yang, Jiazhi Guan, Chengbin Quan, Youjian Zhao\",\"doi\":\"10.1109/ICME55011.2023.00393\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abuses of forgery techniques have created a considerable problem of misinformation on social media. Although scholars devote many efforts to face forgery detection (a.k.a DeepFake detection) and achieve some results, two issues still hinder the practical application. 1) Most detectors do not generalize well to unseen datasets. 2) In a supervised manner, most previous works require a considerable amount of manually labeled data. To address these problems, we propose a simple contrastive pertaining framework for DeepFake detection (DFCP), which works in a finetuning-after-pretraining manner, and requires only a few labels (5%). Specifically, we design a two-stream framework to simultaneously learn high-frequency texture features and high-level semantics information during pretraining. In addition, a video-based frame sampling strategy is proposed to mitigate potential noise data in the instance-discriminative contrastive learning to achieve better performance. 
Experimental results on several downstream datasets show the state-of-the-art performance of the proposed DFCP, which works at frame-level (w/o temporal reasoning) with high efficiency but outperforms video-level methods.\",\"PeriodicalId\":321830,\"journal\":{\"name\":\"2023 IEEE International Conference on Multimedia and Expo (ICME)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE International Conference on Multimedia and Expo (ICME)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICME55011.2023.00393\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International Conference on Multimedia and Expo (ICME)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICME55011.2023.00393","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
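The abstract's video-based frame sampling strategy is motivated by a known failure mode of instance discrimination: two frames of the same video are nearly identical, so treating them as negatives injects noise. One plausible reading, sketched here with a hypothetical `sample_batch` helper, is to draw at most one frame per video per batch so every negative pair truly comes from different videos.

```python
import random

def sample_batch(video_frames, batch_size, seed=None):
    """Sample one frame from each of `batch_size` distinct videos.

    video_frames: dict mapping video id -> list of frame identifiers.
    Returns a list of (video_id, frame) pairs with no repeated video,
    so instance-discriminative learning never pushes apart two frames
    of the same video as false negatives."""
    rng = random.Random(seed)
    videos = rng.sample(sorted(video_frames), min(batch_size, len(video_frames)))
    return [(vid, rng.choice(video_frames[vid])) for vid in videos]
```

Sampling videos first and frames second also balances the batch across videos regardless of how many frames each video contributes to the dataset.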
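The abstract names instance-discriminative contrastive learning but gives no loss formula. The standard objective for this setting is InfoNCE, shown below as a pure-Python toy for a single anchor; the temperature `tau=0.1` is an illustrative default, not a value taken from the paper.

```python
import math

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE loss for one anchor embedding:
    -log( exp(sim(a,p)/tau) / (exp(sim(a,p)/tau) + sum_n exp(sim(a,n)/tau)) )
    where sim is cosine similarity. Lower when the positive is close to
    the anchor and the negatives are far from it."""
    def cos(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        nu = math.sqrt(sum(x * x for x in u))
        nv = math.sqrt(sum(x * x for x in v))
        return dot / (nu * nv)

    logits = [cos(anchor, positive) / tau] + [cos(anchor, n) / tau for n in negatives]
    m = max(logits)  # log-sum-exp shift for numerical stability
    return m + math.log(sum(math.exp(l - m) for l in logits)) - logits[0]
```

In the DFCP setting each anchor/positive pair would be two augmented views of the same frame, while negatives are frames from other videos in the batch.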