AIM-Bone: Texture Discrepancy Generation and Localization for Generalized Deepfake Detection
Boyuan Liu; Xin Zhang; Hefei Ling; Zongyi Li; Runsheng Wang; Hanyuan Zhang; Ping Li
IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 7, no. 3, pp. 422-431, 2025. DOI: 10.1109/TBIOM.2025.3526655. Published online January 6, 2025. https://ieeexplore.ieee.org/document/10829837/
Deeply synthesized multimedia content, especially manipulated human faces, poses a risk of visual and auditory confusion, underscoring the need for generalized face forgery detection methods. In this paper, we propose a novel method for fake sample synthesis, along with a dual auto-encoder network for generalized deepfake detection. First, we examine the texture discrepancy between tampered and unperturbed regions within forged images and force models to learn such features by adopting Augmentation Inside Masks (AIM). AIM disrupts the texture consistency within a single real image and generates textures that are commonly seen in fake images. It does so by introducing forgery clues such as discrepancies in noise patterns, colors, and resolutions, and especially GAN (Generative Adversarial Network) features, including GAN textures, deconvolution traces, and GAN distributions. To the best of our knowledge, this work is the first to incorporate GAN features into fake sample synthesis. Second, we design a Bone-shaped dual auto-encoder with a powerful image texture filter bridged in between to support forgery detection and localization in two streams. Reconstruction learning in the color stream avoids over-fitting to specific textures and encourages the learning of color-related features. Moreover, the GAN fingerprints harbored in the output image can further support AIM by producing texture-discrepant samples for additional training. The noise stream takes input processed by the proposed texture filter to focus on the noise perspective and predict forgery region localization, subject to the constraint of the mask labels produced by AIM. We conduct extensive experiments on multiple benchmark datasets; the superior performance demonstrates the effectiveness of AIM-Bone and its advantage over current state-of-the-art methods. Our source code is available at https://github.com/heart74/AIM-Bone.git.
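
To make the AIM idea concrete, the sketch below shows one minimal, hypothetical way to synthesize a texture-discrepant pseudo-fake sample and its mask label from a single real image. It is not the authors' released implementation (see the repository above for that); all function and parameter names are illustrative, and the simple degradations used here (resampling, color shift, additive noise) stand in for the richer set of forgery clues described in the abstract, including GAN-feature injection.

import numpy as np
import cv2


def augment_inside_mask(image, rng=None):
    """Illustrative AIM-style pseudo-fake synthesis (hypothetical, not the paper's code).

    Given a real H x W x 3 uint8 face image, build a soft random mask,
    degrade the texture only inside the mask, and blend it back. Returns
    the pseudo-fake image and the mask usable as a localization label.
    """
    if rng is None:
        rng = np.random.default_rng()
    h, w = image.shape[:2]

    # Soft elliptical mask placed roughly inside the face region.
    mask = np.zeros((h, w), dtype=np.float32)
    center = (int(rng.uniform(0.3, 0.7) * w), int(rng.uniform(0.3, 0.7) * h))
    axes = (int(rng.uniform(0.15, 0.35) * w), int(rng.uniform(0.2, 0.4) * h))
    cv2.ellipse(mask, center, axes, rng.uniform(0, 180), 0, 360, 1.0, -1)
    mask = cv2.GaussianBlur(mask, (31, 31), 0)[..., None]

    # Texture-discrepant view of the same image: down/up-sampling removes
    # high-frequency detail, a channel-wise gain shifts color statistics,
    # and additive Gaussian noise mimics a mismatched noise pattern.
    scale = rng.uniform(0.25, 0.75)
    small = cv2.resize(image, (max(1, int(w * scale)), max(1, int(h * scale))))
    degraded = cv2.resize(small, (w, h)).astype(np.float32)
    degraded = degraded * rng.uniform(0.9, 1.1, size=(1, 1, 3))
    degraded = degraded + rng.normal(0.0, rng.uniform(1.0, 8.0), size=degraded.shape)
    degraded = np.clip(degraded, 0, 255)

    # Blend the degraded texture back only inside the mask.
    fake = image.astype(np.float32) * (1.0 - mask) + degraded * mask
    return fake.astype(np.uint8), mask.squeeze(-1)


if __name__ == "__main__":
    real = cv2.imread("real_face.png")  # any aligned face crop (path is a placeholder)
    fake, mask_label = augment_inside_mask(real)
    cv2.imwrite("pseudo_fake.png", fake)
    cv2.imwrite("mask_label.png", (mask_label * 255).astype(np.uint8))

In a training pipeline of this kind, the returned mask would serve as the localization target for the noise stream, while the pseudo-fake image joins the real samples for the binary forgery classifier.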