Cross-Layer Feature based Multi-Granularity Visual Classification

2022 IEEE International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2022-12-13. DOI: 10.1109/VCIP56404.2022.10008879

Junhan Chen, Dongliang Chang, Jiyang Xie, Ruoyi Du, Zhanyu Ma


Citations: 1

Abstract

In contrast to traditional fine-grained visual classification, multi-granularity visual classification is no longer limited to identifying the different sub-classes belonging to the same super-class (e.g., bird species, cars, and aircraft models). Instead, it gives a sequence of labels from coarse to fine (e.g., Passeriformes → Corvidae → Fish Crow), which is more convenient in practice. The key to solving this task is how to use the relationships between the different levels of labels to learn feature representations that capture different levels of granularity. Interestingly, the feature pyramid structure naturally implies different granularities of feature representation, with the shallow layers representing coarse-grained features and the deep layers representing fine-grained features. Therefore, in this paper, we exploit this property of the feature pyramid structure to decouple features and obtain feature representations corresponding to different granularities. Specifically, we use shallow features for coarse-grained classification and deep features for fine-grained classification. In addition, to enable fine-grained features to enhance the coarse-grained classification, we propose a feature reinforcement module based on the feature pyramid structure, where deep features are first upsampled and then combined with shallow features to make decisions. Experimental results on three widely used fine-grained image classification datasets (CUB-200-2011, Stanford Cars, and FGVC-Aircraft) validate the method's effectiveness. Code is available at https://github.com/PRIS-CV/CGVC.
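The data flow described in the abstract (deep features upsampled, combined with shallow features for the coarse decision, and used directly for the fine decision) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the nearest-neighbour upsampling, the element-wise addition used as the "combine" step, the equal channel counts, and the hand-picked weights are all assumptions made purely for illustration.

```python
# Toy sketch of a pyramid-style feature reinforcement step.
# Everything here (names, additive fusion, shapes, weights) is hypothetical
# and is NOT taken from the paper or its repository.

def upsample_nearest(channel, factor):
    """Nearest-neighbour upsampling of one 2D channel (a list of rows)."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def fuse(shallow, deep):
    """Upsample each deep channel to the shallow resolution, then add.

    Assumes square maps, an integer scale factor, and matching channel counts.
    """
    factor = len(shallow[0]) // len(deep[0])
    fused = []
    for s_ch, d_ch in zip(shallow, deep):
        up = upsample_nearest(d_ch, factor)
        fused.append([[a + b for a, b in zip(s_row, u_row)]
                      for s_row, u_row in zip(s_ch, up)])
    return fused

def global_avg_pool(fmaps):
    """Average each channel down to a single value."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmaps]

def linear(vec, weights):
    """Bias-free linear classifier: one weight row per class."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def argmax(scores):
    return max(range(len(scores)), key=scores.__getitem__)

# Two-channel toy features: a 4x4 shallow map and a 2x2 deep map.
shallow = [[[1.0] * 4 for _ in range(4)], [[0.0] * 4 for _ in range(4)]]
deep = [[[0.0, 0.0], [0.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]

# Coarse decision: shallow features reinforced by the upsampled deep features.
coarse_weights = [[1.0, 0.0], [0.0, 1.0]]              # 2 coarse classes
coarse_pred = argmax(linear(global_avg_pool(fuse(shallow, deep)), coarse_weights))

# Fine decision: deep features only.
fine_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 2.0]]    # 3 fine classes
fine_pred = argmax(linear(global_avg_pool(deep), fine_weights))

print(coarse_pred, fine_pred)
```

In a real network the shallow and deep maps would come from different backbone stages with different channel counts, so a learned projection would typically be needed before they can be combined; the sketch sidesteps this by giving both maps two channels.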

