AIM-Bone: Texture Discrepancy Generation and Localization for Generalized Deepfake Detection
Boyuan Liu; Xin Zhang; Hefei Ling; Zongyi Li; Runsheng Wang; Hanyuan Zhang; Ping Li
IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 7, no. 3, pp. 422-431, 2025. DOI: 10.1109/TBIOM.2025.3526655. Published online January 6, 2025. https://ieeexplore.ieee.org/document/10829837/
Deeply synthesized multimedia content, especially manipulated human faces, poses a risk of visual and auditory confusion, underscoring the need for generalized face forgery detection methods. In this paper, we propose a novel method for fake sample synthesis, along with a dual auto-encoder network for generalized deepfake detection. First, we examine the texture discrepancy between tampered and unperturbed regions within forged images and force models to learn such features by adopting Augmentation Inside Masks (AIM). AIM disrupts the texture consistency within a single real image and generates textures that are commonly seen in fake images. It does so by introducing forgery clues such as discrepancies in noise patterns, colors, and resolutions, and especially GAN (Generative Adversarial Network) features, including GAN textures, deconvolution traces, and GAN distributions. To the best of our knowledge, this work is the first to incorporate GAN features into fake sample synthesis. Second, we design a Bone-shaped dual auto-encoder with a powerful image texture filter bridged in between to support forgery detection and localization in two streams. Reconstruction learning in the color stream avoids over-fitting to specific textures and encourages the learning of color-related features. Moreover, the GAN fingerprints harbored in the output image can further support AIM by producing texture-discrepant samples for additional training. The noise stream takes input processed by the proposed texture filter to focus on the noise perspective and predict forgery region localization, subject to the constraint of the mask labels produced by AIM. We conduct extensive experiments on multiple benchmark datasets; the superior performance demonstrates the effectiveness of AIM-Bone and its advantage over current state-of-the-art methods. Our source code is available at https://github.com/heart74/AIM-Bone.git.
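
To make the AIM idea concrete, the sketch below shows one minimal, hypothetical way to synthesize a texture-discrepant pseudo-fake sample and its mask label from a single real image. It is not the authors' released implementation (see the repository above for that); all function and parameter names are illustrative, and the simple degradations used here (resampling, color shift, additive noise) stand in for the richer set of forgery clues described in the abstract, including GAN-feature injection.

import numpy as np
import cv2


def augment_inside_mask(image, rng=None):
    """Illustrative AIM-style pseudo-fake synthesis (hypothetical, not the paper's code).

    Given a real H x W x 3 uint8 face image, build a soft random mask,
    degrade the texture only inside the mask, and blend it back. Returns
    the pseudo-fake image and the mask usable as a localization label.
    """
    if rng is None:
        rng = np.random.default_rng()
    h, w = image.shape[:2]

    # Soft elliptical mask placed roughly inside the face region.
    mask = np.zeros((h, w), dtype=np.float32)
    center = (int(rng.uniform(0.3, 0.7) * w), int(rng.uniform(0.3, 0.7) * h))
    axes = (int(rng.uniform(0.15, 0.35) * w), int(rng.uniform(0.2, 0.4) * h))
    cv2.ellipse(mask, center, axes, rng.uniform(0, 180), 0, 360, 1.0, -1)
    mask = cv2.GaussianBlur(mask, (31, 31), 0)[..., None]

    # Texture-discrepant view of the same image: down/up-sampling removes
    # high-frequency detail, a channel-wise gain shifts color statistics,
    # and additive Gaussian noise mimics a mismatched noise pattern.
    scale = rng.uniform(0.25, 0.75)
    small = cv2.resize(image, (max(1, int(w * scale)), max(1, int(h * scale))))
    degraded = cv2.resize(small, (w, h)).astype(np.float32)
    degraded = degraded * rng.uniform(0.9, 1.1, size=(1, 1, 3))
    degraded = degraded + rng.normal(0.0, rng.uniform(1.0, 8.0), size=degraded.shape)
    degraded = np.clip(degraded, 0, 255)

    # Blend the degraded texture back only inside the mask.
    fake = image.astype(np.float32) * (1.0 - mask) + degraded * mask
    return fake.astype(np.uint8), mask.squeeze(-1)


if __name__ == "__main__":
    real = cv2.imread("real_face.png")  # any aligned face crop (path is a placeholder)
    fake, mask_label = augment_inside_mask(real)
    cv2.imwrite("pseudo_fake.png", fake)
    cv2.imwrite("mask_label.png", (mask_label * 255).astype(np.uint8))

In a training pipeline of this kind, the returned mask would serve as the localization target for the noise stream, while the pseudo-fake image joins the real samples for the binary forgery classifier.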