{"title":"Detecting Deep-Fake Videos from Appearance and Behavior","authors":"S. Agarwal, Tarek El-Gaaly, H. Farid, Ser-Nam Lim","doi":"10.1109/WIFS49906.2020.9360904","DOIUrl":null,"url":null,"abstract":"Synthetically-generated audios and videos - so-called deep fakes - continue to capture the imagination of the computer-graphics and computer-vision communities. At the same time, the democratization of access to technology that can create a sophisticated manipulated video of anybody saying anything continues to be of concern because of its power to disrupt democratic elections, commit small to large-scale fraud, fuel disinformation campaigns, and create non-consensual pornography. We describe a biometric-based forensic technique for detecting face-swap deep fakes. This technique combines a static biometric based on facial recognition with a temporal, behavioral biometric based on facial expressions and head movements, where the behavioral embedding is learned using a CNN with a metric-learning objective function. We show the efficacy of this approach across several large-scale video datasets, as well as in-the-wild deep fakes.","PeriodicalId":354881,"journal":{"name":"2020 IEEE International Workshop on Information Forensics and Security (WIFS)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"100","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE International Workshop on Information Forensics and Security (WIFS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WIFS49906.2020.9360904","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 100
Abstract
Synthetically-generated audios and videos - so-called deep fakes - continue to capture the imagination of the computer-graphics and computer-vision communities. At the same time, the democratization of access to technology that can create a sophisticated manipulated video of anybody saying anything continues to be of concern because of its power to disrupt democratic elections, commit small to large-scale fraud, fuel disinformation campaigns, and create non-consensual pornography. We describe a biometric-based forensic technique for detecting face-swap deep fakes. This technique combines a static biometric based on facial recognition with a temporal, behavioral biometric based on facial expressions and head movements, where the behavioral embedding is learned using a CNN with a metric-learning objective function. We show the efficacy of this approach across several large-scale video datasets, as well as in-the-wild deep fakes.