Attention learning with counterfactual intervention based on feature fusion for fine-grained feature learning

Impact factor 2.9 · Zone 3 (Engineering Technology) · JCR Q2 (ENGINEERING, ELECTRICAL & ELECTRONIC)
Ning Yu, Long Chen, Xiaoyin Yi, Jiacheng Huang
Digital Signal Processing, vol. 163, Article 105215. Journal Article, published 2025-04-07.
DOI: 10.1016/j.dsp.2025.105215
URL: https://www.sciencedirect.com/science/article/pii/S1051200425002374
Citations: 0

Abstract

Deep learning models learn features from large amounts of data and, in visual recognition tasks, usually localize the overall region of the target object accurately. However, in fine-grained scenarios with high inter-class similarity, such as vehicle brand recognition and subspecies recognition in organisms, a model must capture the crucial distinguishing features and provide reliable explanations when its decisions are traced. This paper therefore builds on the idea of counterfactual intervention from causal reasoning and proposes attention learning with counterfactual intervention to learn the feature information that matters most in fine-grained recognition tasks. First, an iterative feature fusion attention module learns features at different levels and fuses them, capturing the crucial features of the target object and suppressing attention to unimportant ones. Second, we apply a counterfactual intervention to the feature-fusion-based attention map. The changes produced by the intervention serve as monitoring signals for attention learning, strengthening the learning of features that contribute positively to the predicted result. In addition, we use a contrastive learning function as a constraint so that the network does not focus solely on the most salient features and can learn richer distinguishing features. Finally, we use GradCAM visualization to explain the decision-making process. The experimental results show that the proposed method learns important distinguishing features of the target object, weakens attention to non-critical regions, and offers reliable traceability analysis when tracing back decision-making behaviors.
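
The abstract does not give the formula for the counterfactual intervention, so the following is only a minimal sketch of one common way such an intervention is set up, not the authors' implementation. It assumes the effect of attention is measured as the prediction under the learned attention map minus the prediction under a random (counterfactual) attention map, and that this difference is then trained with cross-entropy. All function names, the linear classifier, and the toy values are hypothetical and chosen for illustration.

```python
import math
import random


def attention_pool(features, attention):
    """Sum per-location feature vectors, each scaled by its attention weight."""
    channels = len(features[0])
    pooled = [0.0] * channels
    for feature, weight in zip(features, attention):
        for c in range(channels):
            pooled[c] += weight * feature[c]
    return pooled


def classify(pooled, class_weights):
    """Hypothetical linear classifier: one weight vector per class, returns logits."""
    return [sum(w * x for w, x in zip(row, pooled)) for row in class_weights]


def random_attention(size, rng):
    """Counterfactual attention (assumed): random weights that ignore the image."""
    return [rng.random() for _ in range(size)]


def counterfactual_effect(features, attention, cf_attention, class_weights):
    """Logits under the learned attention minus logits under the counterfactual one."""
    factual = classify(attention_pool(features, attention), class_weights)
    counterfactual = classify(attention_pool(features, cf_attention), class_weights)
    return [f - c for f, c in zip(factual, counterfactual)]


def effect_loss(effect, label):
    """Cross-entropy of the softmax over the effect, used as the training signal."""
    peak = max(effect)
    log_total = peak + math.log(sum(math.exp(v - peak) for v in effect))
    return log_total - effect[label]


# Toy example: two spatial locations, two channels, two classes (values made up).
features = [[1.0, 0.0], [0.0, 1.0]]
class_weights = [[1.0, 0.0], [0.0, 1.0]]
learned = [1.0, 0.0]  # attention focused on the first location
effect = counterfactual_effect(
    features, learned, random_attention(len(learned), random.Random(0)), class_weights
)
loss = effect_loss(effect, label=0)
```

Under this assumed setup, an attention map that is no better than random yields an effect close to zero and therefore a high loss, which pushes the network toward attention that actually changes the prediction in favor of the correct class.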


Source journal
Digital Signal Processing (Engineering Technology: Engineering, Electrical & Electronic)
CiteScore: 5.30
Self-citation rate: 17.20%
Articles published: 435
Review time: 66 days
Journal description: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy