Attention learning with counterfactual intervention based on feature fusion for fine-grained feature learning

IF 2.9; Zone 3, Engineering & Technology; JCR Q2, Engineering, Electrical & Electronic

Digital Signal Processing. Pub Date: 2025-04-07. DOI: 10.1016/j.dsp.2025.105215

Ning Yu, Long Chen, Xiaoyin Yi, Jiacheng Huang

Digital Signal Processing, vol. 163, Article 105215. Journal article, not open access. https://www.sciencedirect.com/science/article/pii/S1051200425002374

Citations: 0

Abstract

Deep learning models can learn features from large amounts of data and, in visual recognition tasks, usually localize the overall region of the target object accurately. However, in fine-grained scenarios with high inter-class similarity, such as vehicle brand recognition and subspecies recognition in organisms, a model must capture the crucial distinguishing features and provide reliable explanations when its decision behavior is traced. This paper therefore builds on the idea of counterfactual intervention from causal reasoning and proposes attention learning with counterfactual intervention to learn the feature information that matters most in fine-grained recognition tasks. First, we use an iterative feature fusion attention module that learns features at different levels and fuses them, capturing the crucial features of the target object while suppressing attention to unimportant ones. Second, we perform a counterfactual intervention on the feature-fusion-based attention map. The changes produced by the intervened variables serve as supervisory signals for attention learning, strengthening the learning of features that contribute positively to the predicted result. In addition, we use a contrastive learning function as a constraint to avoid focusing solely on salient features, enabling the network to learn richer discriminative features. Finally, we use Grad-CAM visualization to explain the decision-making process. Experimental results show that the proposed method learns important distinguishing features of the target object, weakens attention to non-critical regions, and offers reliable traceability analysis when tracing back decision-making behavior.
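The abstract does not state how the counterfactual intervention is computed. One common way to realize the idea, sketched here purely as an assumption, is to compare the prediction obtained with the learned attention map against the prediction obtained when that map is replaced by a random one, and to train on the difference. All names below (`attention_pool`, `classify`, `counterfactual_effect`) are hypothetical, the linear classifier and uniform random attention are illustrative choices, and none of this is taken from the paper itself.

```python
import numpy as np

def attention_pool(features, attention):
    """Weight a (C, H, W) feature map by an (H, W) attention map, then average over space."""
    return (features * attention[None, :, :]).mean(axis=(1, 2))

def classify(pooled, weights):
    """Toy linear classifier: one logit per class (assumed, not the paper's head)."""
    return weights @ pooled

def counterfactual_effect(features, attention, weights, rng):
    """Logits under the learned attention minus logits under a random attention map."""
    factual = classify(attention_pool(features, attention), weights)
    # Intervention: replace the learned attention with a random map (assumed choice).
    random_attention = rng.uniform(0.0, 1.0, size=attention.shape)
    counterfactual = classify(attention_pool(features, random_attention), weights)
    return factual - counterfactual

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single example."""
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 4))    # C=8 channels on a 4x4 grid
attention = rng.uniform(size=(4, 4))     # stand-in for a learned attention map
weights = rng.normal(size=(3, 8))        # 3 classes

effect = counterfactual_effect(features, attention, weights, rng)
loss = cross_entropy(effect, label=1)    # supervising the effect rewards useful attention
```

Under this sketch, minimizing `loss` pushes the learned attention to yield a better prediction than an arbitrary attention map would, which is one reading of "changes produced by the intervened variables serve as supervisory signals".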


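Source journal

Digital Signal Processing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 5.30

Self-citation rate: 17.20%

Articles published: 435

Review time: 66 days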

Journal description: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy,