A contrastive learning-based heterogeneous dual-branch network for source camera identification

IF 5.5 | CAS Zone 2, Computer Science | JCR Q1, Computer Science, Artificial Intelligence

Neurocomputing | Pub Date: 2025-05-15 | DOI: 10.1016/j.neucom.2025.130406

Zijuan Han, Yang Yang, Jinkai Zhang, Yang Li, Yunxia Liu, Ngai-Fong Bonnie Law

Volume 645, Article 130406. Not open access. Full text: https://www.sciencedirect.com/science/article/pii/S0925231225010781

Citations: 0

Abstract

Source camera identification has been a significant focus in image forensics over the past decades. However, because the forensic features tied to the camera model and to the individual device are weak compared with image content, identification performance remains far from satisfactory for practical applications. This paper introduces a novel contrastive learning strategy that enhances the learning of camera fingerprints by leveraging the similarity between the two branches of a heterogeneous dual-branch network. First, a heterogeneous dual-branch feature extraction module is designed, employing two distinct strategies, noise residual estimation and progressive direct estimation, to extract forensic information independently. Contrastive learning is then used to strengthen the camera-model-related forensic features shared by the two branches while filtering out irrelevant content residuals. During training, in addition to a supervised classification loss, spatial and frequency losses are applied to enforce feature consistency between the two branches, so that the features they learn become more similar in both the spatial and frequency domains. Drawing inspiration from the peak correlation energy metric commonly used in traditional methods, a frequency-domain correlation loss is proposed. Extensive experimental results on the Dresden and Vision datasets demonstrate that the proposed method outperforms state-of-the-art approaches. Furthermore, it shows improved robustness against common preprocessing attacks such as JPEG recompression and image resizing, making it more suitable for real-world applications.
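The abstract does not give the form of the frequency-domain correlation loss, so the following is only a minimal sketch of the classical metric it cites, usually written peak-to-correlation energy (PCE), computed with an FFT-based cross-correlation. The function names, the `exclude` window size, and the synthetic "fingerprint" data are assumptions made for illustration; this is not the authors' loss or network.

```python
import numpy as np


def cross_correlation(a, b):
    """Circular cross-correlation of two equally sized 2-D arrays via the FFT."""
    a = a - a.mean()
    b = b - b.mean()
    spectrum = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    return np.real(np.fft.ifft2(spectrum))


def pce(a, b, exclude=5):
    """Peak-to-correlation energy: squared peak divided by the mean
    squared correlation outside a small window around the peak.
    `exclude` (half-width of that window) is an illustrative choice."""
    xc = cross_correlation(a, b)
    h, w = xc.shape
    peak_idx = np.unravel_index(np.argmax(np.abs(xc)), xc.shape)
    peak = xc[peak_idx]
    # Mask out the neighbourhood of the peak, wrapping around the borders.
    mask = np.ones_like(xc, dtype=bool)
    rows = np.arange(peak_idx[0] - exclude, peak_idx[0] + exclude + 1) % h
    cols = np.arange(peak_idx[1] - exclude, peak_idx[1] + exclude + 1) % w
    mask[np.ix_(rows, cols)] = False
    energy = np.mean(xc[mask] ** 2)
    return float(peak ** 2 / energy)


# Synthetic stand-ins for the two branch outputs: both contain the same
# hypothetical fingerprint plus independent "content residual" noise.
rng = np.random.default_rng(0)
fingerprint = rng.standard_normal((64, 64))
branch_a = fingerprint + rng.standard_normal((64, 64))
branch_b = fingerprint + rng.standard_normal((64, 64))
unrelated = rng.standard_normal((64, 64))

print("PCE, shared fingerprint:", pce(branch_a, branch_b))
print("PCE, unrelated signal:  ", pce(branch_a, unrelated))
```

A loss built on this idea would presumably reward a high PCE between the two branches' features for the same image, but how the paper normalizes or combines it with the spatial and classification losses is not stated in the abstract.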


Source journal

Neurocomputing (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 13.10
Self-citation rate: 10.00%
Articles published: 1382
Review time: 70 days

Journal description: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.