Fine-grained Image Recognition via Attention Interaction and Counterfactual Attention Network

IF 8, Zone 2 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS

Engineering Applications of Artificial Intelligence Pub Date : 2023-10-01 DOI:10.1016/j.engappai.2023.106735

Lei Huang, Chen An, Xiaodong Wang, Leon Bevan Bullock, Zhiqiang Wei

Volume 125, Article 106735. https://www.sciencedirect.com/science/article/pii/S0952197623009193

Citations: 0

Abstract

Learning subtle, discriminative regions plays an important role in fine-grained image recognition, and attention mechanisms have shown great potential for such tasks. Recent research mainly uses attention to locate key discriminative regions and learn salient features, while ignoring hard-to-perceive complementary features and the causal relationship between attention and the prediction. To address these issues, we propose an Attention Interaction and Counterfactual Attention Network (AICA-Net). Specifically, an Attention Interaction Fusion Module (AIFM) models the negative correlation between attention map channels to locate complementary features, then fuses these complementary features with the key discriminative features to produce richer fine-grained features. In parallel, an Enhanced Counterfactual Attention Module (ECAM) generates a counterfactual attention map. Comparing the effect of the learned attention map with that of the counterfactual attention map on the final prediction quantifies the quality of the attention, and this signal drives the network to learn more effective attention. Extensive experiments on the CUB-200-2011, FGVC-Aircraft, and Stanford Cars datasets show that AICA-Net achieves strong results. In particular, it reaches 90.83% and 95.87% accuracy on the two open, competitive benchmarks CUB-200-2011 and Stanford Cars, respectively. The experiments demonstrate that the method outperforms state-of-the-art solutions.
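The abstract names two mechanisms without giving formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes (1) that "negative correlation between attention map channels" can be approximated by a softmax over negated channel similarities, and (2) that the counterfactual comparison is the difference between the logits obtained with the learned attention and those obtained with a substitute (here random) attention. All function names, the linear classifier, and the random counterfactual are hypothetical choices made for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def complementary_channels(maps):
    """Assumed stand-in for attention interaction.

    maps: C attention channels, each flattened to a list of N values.
    Each output channel is a mixture of all channels, weighted by a softmax
    over NEGATED dot-product similarities, so dissimilar channels dominate.
    """
    sims = [[sum(a * b for a, b in zip(mi, mj)) for mj in maps] for mi in maps]
    n = len(maps[0])
    out = []
    for row in sims:
        w = softmax([-s for s in row])
        out.append([sum(w[j] * maps[j][k] for j in range(len(maps))) for k in range(n)])
    return out


def classify(features, attention, weights):
    """Attention-weighted pooling followed by a linear classifier (assumed).

    features: N region feature vectors of length D
    attention: N attention weights
    weights: one length-D weight vector per class
    """
    dim = len(features[0])
    pooled = [sum(a * f[d] for a, f in zip(attention, features)) for d in range(dim)]
    return [sum(w[d] * pooled[d] for d in range(dim)) for w in weights]


def random_attention(attention, rng):
    """Hypothetical counterfactual: random weights with the same total mass."""
    fake = [rng.random() for _ in attention]
    scale = sum(attention) / sum(fake)
    return [x * scale for x in fake]


def attention_effect(features, attention, counterfactual, weights):
    """Logits with learned attention minus logits with counterfactual attention."""
    factual = classify(features, attention, weights)
    counter = classify(features, counterfactual, weights)
    return [f - c for f, c in zip(factual, counter)]


if __name__ == "__main__":
    rng = random.Random(0)
    feats = [[1.0, 0.0], [0.0, 1.0]]
    cls_w = [[1.0, 0.0], [0.0, 1.0]]
    learned = [1.0, 0.0]
    print(attention_effect(feats, learned, random_attention(learned, rng), cls_w))
    print(complementary_channels([[1.0, 0.0], [0.0, 1.0]]))
```

In a training loop, one plausible use of `attention_effect` would be to apply a classification loss to the effect logits as well as to the ordinary logits, so that attention which predicts no better than the counterfactual is penalised. Whether AICA-Net does exactly this is not stated in the abstract.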


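Source journal

Engineering Applications of Artificial Intelligence (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 9.60

Self-citation rate: 10.00%

Articles published: 505

Review time: 68 days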

About the journal: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.