Mining Larger Class Activation Map with Common Attribute Labels

Runtong Zhang, Fanman Meng, Hongliang Li, Q. Wu, K. Ngan
{"title":"Mining Larger Class Activation Map with Common Attribute Labels","authors":"Runtong Zhang, Fanman Meng, Hongliang Li, Q. Wu, K. Ngan","doi":"10.1109/VCIP49819.2020.9301872","DOIUrl":null,"url":null,"abstract":"Class Activation Map (CAM) is the visualization of target regions generated from classification networks. However, classification network trained by class-level labels only has high responses to a few features of objects and thus the network cannot discriminate the whole target. We think that original labels used in classification tasks are not enough to describe all features of the objects. If we annotate more detailed labels like class-agnostic attribute labels for each image, the network may be able to mine larger CAM. Motivated by this idea, we propose and design common attribute labels, which are lower-level labels summarized from original image-level categories to describe more details of the target. Moreover, it should be emphasized that our proposed labels have good generalization on unknown categories since attributes (such as head, body, etc.) in some categories (such as dog, cat, etc.) are common and class-agnostic. That is why we call our proposed labels as common attribute labels, which are lower-level and more general compared with traditional labels. We finish the annotation work based on the PASCAL VOC2012 dataset and design a new architecture to successfully classify these common attribute labels. Then after fusing features of attribute labels into original categories, our network can mine larger CAMs of objects. 
Our method achieves better CAM results in visual and higher evaluation scores compared with traditional methods.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/VCIP49819.2020.9301872","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

A Class Activation Map (CAM) is a visualization of the target regions learned by a classification network. However, a network trained only with class-level labels responds strongly to just a few discriminative features of an object, so it cannot delineate the whole target. We argue that the original labels used in classification tasks are insufficient to describe all features of the objects. If more detailed labels, such as class-agnostic attribute labels, are annotated for each image, the network may be able to mine larger CAMs. Motivated by this idea, we propose and design common attribute labels: lower-level labels summarized from the original image-level categories to describe more details of the target. Moreover, the proposed labels generalize well to unknown categories, since attributes (such as head and body) are shared across categories (such as dog and cat) and are class-agnostic. This is why we call them common attribute labels: they are lower-level and more general than traditional labels. We complete the annotation work on the PASCAL VOC2012 dataset and design a new architecture that successfully classifies these common attribute labels. After fusing the attribute-label features into the original categories, our network mines larger CAMs of objects. Our method achieves visually better CAM results and higher evaluation scores than traditional methods.
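The paper builds on the standard CAM formulation (Zhou et al., CVPR 2016): the activation map for a class is a weighted sum of the final convolutional feature maps, using that class's weights in the final linear layer. The sketch below is a minimal NumPy illustration of this baseline computation, not the authors' attribute-fusion architecture; the array shapes and function name are illustrative assumptions.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """Standard CAM: weight the last conv layer's feature maps by the
    target class's classifier weights, then sum over channels.

    features:   (C, H, W) activations from the last conv layer
    fc_weights: (num_classes, C) weights of the final linear layer
    class_idx:  index of the target class
    """
    w = fc_weights[class_idx]                         # (C,)
    cam = np.tensordot(w, features, axes=([0], [0]))  # (H, W)
    cam = np.maximum(cam, 0)                          # keep positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                         # normalize to [0, 1]
    return cam

# Toy example: 4 channels, a 3x3 spatial map, 2 classes
feats = np.random.rand(4, 3, 3)
W = np.random.rand(2, 4)
cam = class_activation_map(feats, W, class_idx=0)
```

The paper's observation is that a CAM computed this way from class-level labels alone highlights only a few discriminative parts; its proposed fix is to fuse features learned from the additional common attribute labels before forming the map.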