Wenyi Zhang, Haoran Zhang, Xisheng Zhang, Xiaohua Shen, Lejun Zou
{"title":"IP-CAM: Class activation mapping based on importance weights and principal-component weights for better and simpler visual explanations","authors":"Wenyi Zhang, Haoran Zhang, Xisheng Zhang, Xiaohua Shen, Lejun Zou","doi":"10.1016/j.cviu.2025.104523","DOIUrl":null,"url":null,"abstract":"<div><div>Visual explanations of deep neural networks (DNNs) have gained considerable importance in deep learning due to the lack of interpretability, which constrains human trust in DNNs. This paper proposes a new gradient-free class activation map (CAM) architecture called importance principal-component CAM (IP-CAM). The architecture not only improves the prediction accuracy of networks but also provides simpler and more reliable visual explanations. It adds importance weight layers before the classifier and assigns an importance weight to each activation map. After fine-tuning, it selects images with the highest prediction score for each class, performs principal component analysis (PCA) on activation maps of all channels, and regards the eigenvector of the first principal component as principal-component weights for that class. The final saliency map is obtained by linearly combining the activation maps, importance weights and principal-component weights. IP-CAM is evaluated on the ILSVRC 2012 dataset and RSD46-WHU dataset, whose results show that IP-CAM performs better than most previous CAM variants in recognition and localization tasks. 
Finally, the method is applied as a tool for interpretability, and the results illustrate that IP-CAM effectively unveils the decision-making process of DNNs through saliency maps.</div></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":"261 ","pages":"Article 104523"},"PeriodicalIF":3.5000,"publicationDate":"2025-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Vision and Image Understanding","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1077314225002462","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citation count: 0
Abstract
Visual explanations of deep neural networks (DNNs) have gained considerable importance in deep learning because DNNs' lack of interpretability constrains human trust in them. This paper proposes a new gradient-free class activation map (CAM) architecture called importance principal-component CAM (IP-CAM). The architecture not only improves the prediction accuracy of networks but also provides simpler and more reliable visual explanations. It adds importance weight layers before the classifier, assigning an importance weight to each activation map. After fine-tuning, it selects the images with the highest prediction score for each class, performs principal component analysis (PCA) on the activation maps of all channels, and takes the eigenvector of the first principal component as the principal-component weights for that class. The final saliency map is obtained by linearly combining the activation maps, the importance weights, and the principal-component weights. IP-CAM is evaluated on the ILSVRC 2012 and RSD46-WHU datasets; the results show that it outperforms most previous CAM variants on recognition and localization tasks. Finally, the method is applied as an interpretability tool, and the results illustrate that IP-CAM effectively unveils the decision-making process of DNNs through its saliency maps.
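The weighting pipeline described in the abstract (PCA over per-channel activation maps, with the first principal component's eigenvector serving as per-channel weights) can be sketched as follows. This is an illustrative reconstruction from the abstract alone, not the paper's reference implementation; the function name, the sign convention for the eigenvector, and the final ReLU are assumptions.

```python
import numpy as np

def ip_cam_saliency(activations, importance_weights):
    """Illustrative sketch of an IP-CAM-style saliency map.

    activations: (C, H, W) activation maps from the last conv layer.
    importance_weights: (C,) learned importance weight per channel
    (the abstract's "importance weight layers" before the classifier).
    """
    C, H, W = activations.shape
    # Treat spatial positions as samples and channels as features:
    # each channel's map becomes a row vector of length H*W.
    X = activations.reshape(C, -1)
    # Centre each channel over its spatial positions, then form the
    # (C, C) channel covariance matrix for PCA.
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / (Xc.shape[1] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # ascending eigenvalues
    # Eigenvector of the largest eigenvalue = principal-component
    # weights, one weight per channel.
    pc_weights = eigvecs[:, -1]
    # Sign of an eigenvector is arbitrary; fix it so the weights are
    # predominantly positive (an assumed convention).
    if pc_weights.sum() < 0:
        pc_weights = -pc_weights
    # Linear combination of activation maps, importance weights and
    # principal-component weights, as the abstract describes.
    combined = np.einsum('c,c,chw->hw',
                         importance_weights, pc_weights, activations)
    # ReLU, as is conventional for CAM-style saliency maps (assumed).
    return np.maximum(combined, 0.0)
```

The choice of spatial positions as PCA samples follows from the abstract: the first principal component's eigenvector must have one entry per channel to act as a channel weight.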
About the journal:
The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views.
Research Areas Include:
• Theory
• Early vision
• Data structures and representations
• Shape
• Range
• Motion
• Matching and recognition
• Architecture and languages
• Vision systems