Detecting Deepfake Video by Learning Two-Level Features with Two-Stream Convolutional Neural Network
Authors: Zheng Zhao, Penghui Wang, W. Lu
Published in: Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence, 2020-04-23
DOI: 10.1145/3404555.3404564 (https://doi.org/10.1145/3404555.3404564)
Deepfake techniques have made face swapping in video easy to perform, and the spread of Deepfake videos over networks is now a worldwide concern. This work proposes an approach for detecting them more accurately and robustly. Since the artifacts left by Deepfake tools fall largely into two classes at different levels, namely the semantic level and the noise level, we adopt a two-stream convolutional neural network (CNN) to capture the two-level features concurrently. An Xception network is trained solely as the first stream to detect semantic anomalies such as editing artifacts around the face contour, missing detail, and geometric inconsistency in the eyes. Meanwhile, the second stream, which contains a constrained convolution filter and a median filter, is designed to capture tampering traces in local noise. By concatenating the two-level features learned from both streams, our method obtains comprehensive knowledge about the presence of face swapping. Experimental results show its advantage over existing methods in both accuracy and robustness.
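The noise-level stream and the fusion step can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the constraint on the convolution filter is taken to be the form commonly used in image forensics (center weight fixed at -1, remaining weights normalized to sum to 1), the median filter is assumed to be 3x3 and used to form a residual (image minus its median-filtered version), and all function names are hypothetical. The Xception stream is not reproduced; its output is represented by a plain feature list.

```python
from statistics import median


def project_constrained(kernel):
    """Project a square kernel so that center = -1 and the other weights sum to 1.

    Assumed form of the "constrained convolution filter"; the abstract does
    not state the exact constraint.
    """
    k = len(kernel)
    c = k // 2
    # Sum of all off-center weights.
    off_sum = sum(kernel[i][j]
                  for i in range(k) for j in range(k) if (i, j) != (c, c))
    if off_sum == 0:
        raise ValueError("off-center weights must not sum to zero")
    out = [[w / off_sum for w in row] for row in kernel]
    out[c][c] = -1.0
    return out


def median_filter3(img):
    """3x3 median filter; border pixels use only the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(median(window))
        out.append(row)
    return out


def noise_residual(img):
    """Image minus its median-filtered version: a crude local-noise map (assumed usage)."""
    smooth = median_filter3(img)
    return [[p - s for p, s in zip(row, srow)] for row, srow in zip(img, smooth)]


def fuse(semantic_features, noise_features):
    """Concatenate the feature vectors of the two streams."""
    return list(semantic_features) + list(noise_features)
```

After projection the kernel weights sum to zero, so the filter suppresses smooth image content and responds mainly to local deviations, which is the kind of signal a noise-level stream needs. In a trained network this projection would typically be reapplied after each weight update; that training loop is omitted here.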