{"title":"SWYNT: Swin Y-Net Transformers for Deepfake Detection","authors":"Fatima Khalid, Muhammad Haseed Akbar, Shehla Gul","doi":"10.1109/ICRAI57502.2023.10089585","DOIUrl":null,"url":null,"abstract":"Nowadays, less technical individuals can create false videos by only source and target images, using deepfakes generation tools and methodologies. Distributing false information on social media and other concerns related to the deepfakes have thus significantly increased. To deal with the challenges posed by incorrect details, efficient Deepfakes detection algorithms must be developed considering the tremendous advancement in deepfakes generating techniques. Existing techniques are not reliable enough to find deepfakes, especially when the videos are made with various deepfakes generation methods. The Swin Y-Net Transformers (SWYNT) architecture we created in this paper can visually discriminate between natural and artificial faces. The architecture uses a Swin transformer, encoder, and decoder in a U -Net architecture with a classification branch to build a model that can classify and segment deepfakes. The segmentation process creates segmentation masks and helps train the classifier. We have evaluated our suggested method using the extensive, standard, and diverse FaceForensics++ (FF++) and the Celeb-DF dataset. The generalizability evaluation of our process, which is part of the performance evaluation, reveals the model's promising performance in identifying deepfakes videos generated using various methodologies on both large-scale datasets.","PeriodicalId":447565,"journal":{"name":"2023 International Conference on Robotics and Automation in Industry (ICRAI)","volume":"78 7","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 International Conference on Robotics and Automation in Industry (ICRAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICRAI57502.2023.10089585","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
Nowadays, individuals with little technical expertise can create fake videos from only a source and a target image using deepfake generation tools. The spread of false information on social media and other concerns related to deepfakes have therefore grown significantly. Given the rapid advancement of deepfake generation techniques, efficient deepfake detection algorithms must be developed to counter such falsified content. Existing techniques are not reliable enough at finding deepfakes, especially when the videos are produced with a variety of generation methods. The Swin Y-Net Transformers (SWYNT) architecture we present in this paper visually discriminates between natural and artificial faces. It combines a Swin transformer encoder and a decoder in a U-Net architecture with a classification branch, yielding a model that can both classify and segment deepfakes. The segmentation branch produces segmentation masks and helps train the classifier. We evaluate the proposed method on the extensive, standard, and diverse FaceForensics++ (FF++) and Celeb-DF datasets. The generalizability study, conducted as part of the performance evaluation, shows the model's promising performance in identifying deepfake videos generated by various methods on both large-scale datasets.
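The abstract describes a U-Net-shaped encoder-decoder whose bottleneck also feeds a classification branch, so that segmentation supervision shapes the features used for real/fake classification. The sketch below illustrates that joint classify-and-segment layout in PyTorch. It is a minimal illustration, not the authors' implementation: the class names (SwinYNetSketch, ConvBlock), channel widths, and depth are assumptions, and plain convolutional blocks stand in for the paper's Swin transformer stages.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Two 3x3 convolutions with ReLU; a stand-in for a Swin stage here."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class SwinYNetSketch(nn.Module):
    """U-Net-shaped encoder-decoder with a classification branch off the
    bottleneck, mirroring the classify-and-segment design the abstract
    describes. All hyperparameters are illustrative guesses."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.enc1 = ConvBlock(3, 64)
        self.enc2 = ConvBlock(64, 128)
        self.enc3 = ConvBlock(128, 256)
        self.pool = nn.MaxPool2d(2)
        # Classification branch: global pooling + linear head on the bottleneck.
        self.cls_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(256, num_classes)
        )
        # Decoder with skip connections back to the encoder stages.
        self.up2 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.dec2 = ConvBlock(256, 128)
        self.up1 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec1 = ConvBlock(128, 64)
        self.seg_head = nn.Conv2d(64, 1, 1)  # per-pixel manipulation mask

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))
        logits = self.cls_head(e3)                        # real/fake prediction
        d2 = self.dec2(torch.cat([self.up2(e3), e2], 1))  # skip from enc2
        d1 = self.dec1(torch.cat([self.up1(d2), e1], 1))  # skip from enc1
        mask = self.seg_head(d1)                          # segmentation mask
        return logits, mask

model = SwinYNetSketch()
x = torch.randn(2, 3, 224, 224)
logits, mask = model(x)
print(logits.shape, mask.shape)  # torch.Size([2, 2]) torch.Size([2, 1, 224, 224])
```

Under this layout, training would combine a cross-entropy loss on the class logits with a per-pixel loss (e.g. BCE) on the predicted mask, so the segmentation supervision also regularizes the shared encoder that the classifier depends on.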