Cross-Layer Feature based Multi-Granularity Visual Classification

Junhan Chen, Dongliang Chang, Jiyang Xie, Ruoyi Du, Zhanyu Ma
{"title":"Cross-Layer Feature based Multi-Granularity Visual Classification","authors":"Junhan Chen, Dongliang Chang, Jiyang Xie, Ruoyi Du, Zhanyu Ma","doi":"10.1109/VCIP56404.2022.10008879","DOIUrl":null,"url":null,"abstract":"In contrast to traditional fine-grained visual clas-sification, multi-granularity visual classification is no longer limited to identifying the different sub-classes belonging to the same super-class (e.g., bird species, cars, and aircraft models). Instead, it gives a sequence of labels from coarse to fine (e.g., Passeriformes → Corvidae → Fish Crow), which is more convenient in practice. The key to solving this task is how to use the relationships between the different levels of labels to learn feature representations that contain different levels of granularity. Interestingly, the feature pyramid structure naturally implies different granularity of feature representation, with the shallow layers representing coarse-grained features and the deep layers representing fine-grained features. Therefore, in this paper, we exploit this property of the feature pyramid structure to decouple features and obtain feature representations corre-sponding to different granularities. Specifically, we use shallow features for coarse-grained classification and deep features for fine-grained classification. In addition, to enable fine-grained features to enhance the coarse-grained classification, we propose a feature reinforcement module based on the feature pyramid structure, where deep features are first upsampled and then combined with shallow features to make decisions. Experimental results on three widely used fine-grained image classification datasets such as CUB-200-2011, Stanford Cars, and FGVC-Aircraft validate the method's effectiveness. Code available at https://github.com/PRIS-CV/CGVC.","PeriodicalId":269379,"journal":{"name":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/VCIP56404.2022.10008879","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

In contrast to traditional fine-grained visual clas-sification, multi-granularity visual classification is no longer limited to identifying the different sub-classes belonging to the same super-class (e.g., bird species, cars, and aircraft models). Instead, it gives a sequence of labels from coarse to fine (e.g., Passeriformes → Corvidae → Fish Crow), which is more convenient in practice. The key to solving this task is how to use the relationships between the different levels of labels to learn feature representations that contain different levels of granularity. Interestingly, the feature pyramid structure naturally implies different granularity of feature representation, with the shallow layers representing coarse-grained features and the deep layers representing fine-grained features. Therefore, in this paper, we exploit this property of the feature pyramid structure to decouple features and obtain feature representations corre-sponding to different granularities. Specifically, we use shallow features for coarse-grained classification and deep features for fine-grained classification. In addition, to enable fine-grained features to enhance the coarse-grained classification, we propose a feature reinforcement module based on the feature pyramid structure, where deep features are first upsampled and then combined with shallow features to make decisions. Experimental results on three widely used fine-grained image classification datasets such as CUB-200-2011, Stanford Cars, and FGVC-Aircraft validate the method's effectiveness. Code available at https://github.com/PRIS-CV/CGVC.
基于跨层特征的多粒度视觉分类
与传统的细粒度视觉分类相比,多粒度视觉分类不再局限于识别属于同一超类的不同子类(例如鸟类、汽车和飞机模型)。相反,它给出了一个从粗到细的标签序列(例如,passerformes→Corvidae→Fish Crow),这在实践中更方便。解决这个问题的关键是如何利用不同级别标签之间的关系来学习包含不同粒度级别的特征表示。有趣的是,特征金字塔结构自然意味着不同粒度的特征表示,浅层表示粗粒度特征,深层表示细粒度特征。因此,本文利用特征金字塔结构的这一特性对特征进行解耦,得到不同粒度对应的特征表示。具体来说,我们使用浅特征进行粗粒度分类,使用深特征进行细粒度分类。此外,为了使细粒度特征能够增强粗粒度分类,我们提出了一种基于特征金字塔结构的特征增强模块,首先对深层特征进行上采样,然后结合浅层特征进行决策。在CUB-200-2011、Stanford Cars和FGVC-Aircraft三种广泛使用的细粒度图像分类数据集上的实验结果验证了该方法的有效性。代码可从https://github.com/PRIS-CV/CGVC获得。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信