{"title":"Autoencoder-based Data Augmentation for Deepfake Detection","authors":"Dan-Cristian Stanciu, B. Ionescu","doi":"10.1145/3592572.3592840","DOIUrl":null,"url":null,"abstract":"Image generation has seen huge leaps in the last few years. Less than 10 years ago we could not generate accurate images using deep learning at all, and now it is almost impossible for the average person to distinguish a real image from a generated one. In spite of the fact that image generation has some amazing use cases, it can also be used with ill intent. As an example, deepfakes have become more and more indistinguishable from real pictures and that poses a real threat to society. It is important for us to be vigilant and active against deepfakes, to ensure that the false information spread is kept under control. In this context, the need for good deepfake detectors feels more and more urgent. There is a constant battle between deepfake generators and deepfake detection algorithms, each one evolving at a rapid pace. But, there is a big problem with deepfake detectors: they can only be trained on so many data points and images generated by specific architectures. Therefore, while we can detect deepfakes on certain datasets with near 100% accuracy, it is sometimes very hard to generalize and catch all real-world instances. Our proposed solution is a way to augment deepfake detection datasets using deep learning architectures, such as Autoencoders or U-Net. We show that augmenting deepfake detection datasets using deep learning improves generalization to other datasets. We test our algorithm using multiple architectures, with experimental validation being carried out on state-of-the-art datasets like CelebDF and DFDC Preview. The framework we propose can give flexibility to any model, helping to generalize to unseen datasets and manipulations.","PeriodicalId":239252,"journal":{"name":"Proceedings of the 2nd ACM International Workshop on Multimedia AI against Disinformation","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2nd ACM International Workshop on Multimedia AI against Disinformation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3592572.3592840","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Image generation has seen huge leaps in the last few years. Less than 10 years ago we could not generate accurate images using deep learning at all, and now it is almost impossible for the average person to distinguish a real image from a generated one. In spite of the fact that image generation has some amazing use cases, it can also be used with ill intent. As an example, deepfakes have become more and more indistinguishable from real pictures and that poses a real threat to society. It is important for us to be vigilant and active against deepfakes, to ensure that the false information spread is kept under control. In this context, the need for good deepfake detectors feels more and more urgent. There is a constant battle between deepfake generators and deepfake detection algorithms, each one evolving at a rapid pace. But, there is a big problem with deepfake detectors: they can only be trained on so many data points and images generated by specific architectures. Therefore, while we can detect deepfakes on certain datasets with near 100% accuracy, it is sometimes very hard to generalize and catch all real-world instances. Our proposed solution is a way to augment deepfake detection datasets using deep learning architectures, such as Autoencoders or U-Net. We show that augmenting deepfake detection datasets using deep learning improves generalization to other datasets. We test our algorithm using multiple architectures, with experimental validation being carried out on state-of-the-art datasets like CelebDF and DFDC Preview. The framework we propose can give flexibility to any model, helping to generalize to unseen datasets and manipulations.