Hao Cheng, Weiye Pang, Kun Li, Yongzhuang Wei, Yuhang Song, Ji Chen
Journal of Imaging, vol. 11, no. 9. Published 2025-09-12.
DOI: 10.3390/jimaging11090312 (https://doi.org/10.3390/jimaging11090312)
Impact factor: 2.7. JCR: Q3 (Imaging Science & Photographic Technology).
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12471254/pdf/
Citations: 0
Abstract
EFIMD-Net: Enhanced Feature Interaction and Multi-Domain Fusion Deep Forgery Detection Network.
Deepfake detection has garnered widespread attention as a key defense against the misuse of deepfake technology. However, existing deepfake detection networks still suffer from insufficient robustness, limited generalization, and reliance on a single feature-extraction domain (e.g., using only spatial-domain features) when confronted with evolving algorithms or diverse datasets, which severely limits their applicability. To address these issues, this study proposes a deepfake detection network named EFIMD-Net, which improves performance by strengthening feature interaction and integrating spatial- and frequency-domain features. The network integrates a Cross-feature Interaction Enhancement (CFIE) module based on cosine similarity, which achieves adaptive interaction between spatial-domain features (RGB stream) and frequency-domain features (SRM, Spatial Rich Model, stream) through a channel attention mechanism, fusing macro-semantic information with high-frequency artifact information. Additionally, an Enhanced Multi-scale Feature Fusion (EMFF) module integrates multi-scale feature information from various layers of the network through adaptive feature enhancement and reorganization. Experimental results show that, compared to the baseline network Xception, EFIMD-Net achieves comparable or better Area Under the Curve (AUC) on multiple datasets. Ablation experiments also validate the effectiveness of the proposed modules. Furthermore, compared to the traditional two-stream baseline Locate and Verify, EFIMD-Net significantly improves forgery detection performance, with a 9-percentage-point increase in AUC on the CelebDF-v1 dataset and a 7-percentage-point increase on the CelebDF-v2 dataset. These results demonstrate the effectiveness and generalization of EFIMD-Net in forgery detection. Potential limitations regarding real-time processing efficiency are acknowledged.
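
The abstract describes the CFIE module only at a high level (cosine similarity between the RGB and SRM streams, applied through channel attention) and does not give its internals. The sketch below is therefore an illustration of the general idea, not the authors' implementation. The function names, the sigmoid gate, and the residual sum are all assumptions made for this example, and each channel is flattened to a plain list so the code needs only the standard library.

```python
import math

def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)

def sigmoid(x):
    """Logistic function, used here as a hypothetical attention gate."""
    return 1.0 / (1.0 + math.exp(-x))

def cross_stream_interaction(rgb, srm):
    """Exchange information between two feature streams, channel by channel.

    rgb, srm: lists of channels; each channel is a flat list of activations.
    Assumption (not from the paper): the per-channel weight is
    sigmoid(cosine similarity), and each stream receives the other
    stream's channel scaled by that weight as a residual.
    """
    if len(rgb) != len(srm):
        raise ValueError("both streams need the same number of channels")
    rgb_out, srm_out = [], []
    for r, s in zip(rgb, srm):
        w = sigmoid(cosine_similarity(r, s))  # channel attention weight
        rgb_out.append([a + w * b for a, b in zip(r, s)])
        srm_out.append([b + w * a for a, b in zip(r, s)])
    return rgb_out, srm_out

# Toy input: channel 0 is aligned across streams, channel 1 is orthogonal.
rgb_feats = [[1.0, 2.0], [1.0, 0.0]]
srm_feats = [[2.0, 4.0], [0.0, 1.0]]
rgb_fused, srm_fused = cross_stream_interaction(rgb_feats, srm_feats)
```

In this toy input the aligned channel gets a larger gate than the orthogonal one, so more of the other stream's signal is mixed in where the two streams agree. A real module would operate on tensors and would typically learn additional projection weights.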