Mining Larger Class Activation Map with Common Attribute Labels
Runtong Zhang, Fanman Meng, Hongliang Li, Q. Wu, K. Ngan
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP), December 2020. DOI: 10.1109/VCIP49819.2020.9301872
A Class Activation Map (CAM) visualizes the target regions attended to by a classification network. However, a network trained only with class-level labels responds strongly to just a few discriminative features of an object, so it cannot delineate the whole target. We argue that the original labels used in classification tasks are not sufficient to describe all features of the objects. If each image is annotated with more detailed labels, such as class-agnostic attribute labels, the network may be able to mine larger CAMs. Motivated by this idea, we propose and design common attribute labels: lower-level labels summarized from the original image-level categories that describe more details of the target. These labels also generalize well to unknown categories, because attributes such as head and body are shared across categories such as dog and cat and are class-agnostic; this is why we call them common attribute labels, which are lower-level and more general than traditional labels. We annotate the PASCAL VOC2012 dataset with these labels and design a new architecture that classifies them successfully. After fusing the features of the attribute labels into the original categories, our network can mine larger CAMs of objects. Compared with traditional methods, our approach produces visually better CAMs and achieves higher evaluation scores.
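As background for the abstract's starting point, the sketch below shows how a standard CAM is read off a classification network: the last convolutional feature maps are weighted by the classifier weights of the target class (Zhou et al., 2016). The ResNet-18 backbone and the `compute_cam` helper are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Minimal sketch of the standard CAM computation (Zhou et al., 2016) using an
# off-the-shelf ResNet-18; the backbone choice and this helper are illustrative
# assumptions, not details taken from the paper.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
trunk = torch.nn.Sequential(*list(model.children())[:-2])  # drop avgpool + fc

def compute_cam(image: torch.Tensor, class_idx: int) -> torch.Tensor:
    """Return the CAM for `class_idx`, upsampled to the input resolution.

    image: (1, 3, H, W) ImageNet-normalized tensor.
    """
    with torch.no_grad():
        feats = trunk(image)                               # (1, C, h, w) final conv maps
        # CAM_c(x, y) = sum_k w_k^c * f_k(x, y): weight each feature map by the
        # classifier weight connecting it to the target class.
        weights = model.fc.weight[class_idx]               # (C,)
        cam = torch.einsum("c,nchw->nhw", weights, feats)  # (1, h, w)
        # Upsample to the input size and normalize to [0, 1] for visualization.
        cam = F.interpolate(cam.unsqueeze(1), size=image.shape[-2:],
                            mode="bilinear", align_corners=False).squeeze(1)
        cam = torch.relu(cam)
        cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return cam
```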
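The abstract only outlines the attribute-classification and fusion idea, so the following is a sketch of one plausible realization rather than the authors' reported architecture: a shared backbone, a class-agnostic attribute head, and a category head that consumes the backbone features concatenated with the attribute score maps. The class name `AttributeFusionNet`, the 1x1-convolution heads, the concatenation-based fusion, and `NUM_ATTRIBUTES = 8` are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Illustrative sketch only: the abstract describes classifying class-agnostic
# common attribute labels (e.g. head, body) alongside the original categories
# and fusing the attribute features back into the category branch to enlarge
# the CAM. The head design, fusion-by-concatenation choice, and label counts
# below are assumptions, not the authors' reported design.
NUM_CLASSES = 20      # PASCAL VOC2012 object categories
NUM_ATTRIBUTES = 8    # hypothetical number of common attribute labels

class AttributeFusionNet(nn.Module):
    def __init__(self):
        super().__init__()
        trunk = models.resnet18(weights=None)
        self.backbone = nn.Sequential(*list(trunk.children())[:-2])  # (N, 512, h, w)
        # Two 1x1-conv heads produce per-location attribute and class scores,
        # so CAMs can be read off directly from the score maps.
        self.attr_head = nn.Conv2d(512, NUM_ATTRIBUTES, kernel_size=1)
        self.cls_head = nn.Conv2d(512 + NUM_ATTRIBUTES, NUM_CLASSES, kernel_size=1)
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):
        feats = self.backbone(x)                       # shared features
        attr_maps = self.attr_head(feats)              # attribute score maps (CAM-like)
        fused = torch.cat([feats, attr_maps], dim=1)   # fuse attribute evidence
        cls_maps = self.cls_head(fused)                # category CAMs after fusion
        attr_logits = self.pool(attr_maps).flatten(1)  # multi-label attribute logits
        cls_logits = self.pool(cls_maps).flatten(1)    # multi-label category logits
        return cls_logits, attr_logits, cls_maps
```

Training such a model would combine a multi-label loss on the category logits with a second multi-label loss on the attribute logits (e.g. binary cross-entropy on each); the enlarged category CAMs are then read from `cls_maps`, which already incorporate the attribute evidence.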