Context-CAM: Context-Level Weight-Based CAM With Sequential Denoising to Generate High-Quality Class Activation Maps
Jie Du; Wenbing Chen; Chi-Man Vong; Peng Liu; Tianfu Wang
IEEE Transactions on Image Processing, vol. 34, pp. 3431-3446, 2025. DOI: 10.1109/TIP.2025.3573509
https://ieeexplore.ieee.org/document/11021331/
Abstract
Class activation mapping (CAM) methods have garnered considerable research attention because they can be used to interpret the decision-making of deep convolutional neural network (CNN) models and to provide initial masks for weakly supervised semantic segmentation (WSSS) tasks. However, the class activation maps generated by most CAM methods have two limitations: 1) they fail to cover the whole object when low-level features are used; and 2) they introduce background noise. To mitigate these issues, an innovative Context-level weight-based CAM (Context-CAM) method is proposed, which ensures that: 1) non-discriminative regions that are similar in appearance to, and located close to, the discriminative regions are also highlighted by the newly designed Region-Enhanced Mapping (REM) module using context-level weights; and 2) background noise is gradually eliminated by a newly proposed Semantic-guided Reverse Sequence Fusion (SRSF) strategy that sequentially denoises and fuses the region-enhanced maps from the last layer to the first layer. Extensive experimental results show that our Context-CAM generates higher-quality class activation maps than classic and state-of-the-art (SOTA) CAM methods in terms of the Energy-Based Pointing Game (EBPG) score, with improvements of up to 35.49% over the second-best method. Moreover, for WSSS tasks, our Context-CAM can directly replace the CAM method used in existing WSSS methods, without any architectural modification, to further improve segmentation performance. Our code is available at https://github.com/cwb0611/Context-CAM.
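The abstract describes SRSF as fusing region-enhanced maps layer by layer in reverse order, using the deeper (more semantic) layers to suppress background noise in the shallower ones. The snippet below is only a minimal NumPy sketch of that reverse-order "denoise then fuse" idea; the min-max normalization, the thresholding step, and the `guide_threshold` parameter are illustrative assumptions and do not reproduce the authors' REM or SRSF implementation (see the linked repository for that).

```python
import numpy as np

def normalize(m):
    # Min-max normalize a 2D map to [0, 1].
    m = m - m.min()
    return m / (m.max() + 1e-8)

def reverse_sequence_fusion(layer_maps, guide_threshold=0.3):
    """Fuse per-layer activation maps from the last (deepest) layer to the first.

    `layer_maps` is an ordered list of 2D arrays (first layer ... last layer),
    all resized to the same spatial resolution. The fused map so far acts as a
    semantic guide: at each step, pixels falling outside the guide are
    suppressed before fusing, which gradually removes background noise while
    finer detail from shallower layers is accumulated. `guide_threshold` is a
    hypothetical parameter chosen for this illustration only.
    """
    fused = normalize(layer_maps[-1])        # start from the most semantic layer
    for m in reversed(layer_maps[:-1]):      # walk back toward the first layer
        m = normalize(m)
        m = m * (fused > guide_threshold)    # crude semantic-guided denoising
        fused = normalize(fused + m)         # fuse the denoised shallower map
    return fused

# Toy usage: three random 7x7 maps standing in for resized per-layer maps.
maps = [np.random.rand(7, 7) for _ in range(3)]
print(reverse_sequence_fusion(maps).shape)   # (7, 7)
```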