Cross-Granularity Fusion Network for Fine-Grained Image Classification
Authors: Wenjin Pang, Wei Song
Venue: 2023 4th International Seminar on Artificial Intelligence, Networking and Information Technology (AINIT)
Publication date: 2023-06-16
DOI: 10.1109/AINIT59027.2023.10212436
URL: https://doi.org/10.1109/AINIT59027.2023.10212436
Citations: 0
Abstract
Fine-grained image classification (FGIC) aims to identify subtle visual differences among subcategories, which is challenging because inter-class variance is small. Existing methods recognize subcategories mainly by locating discriminative parts, which lie in the regions with high responses in deep feature maps. However, these high-response regions correspond to large receptive fields in the input image, so subtle visual differences among subcategories cannot be captured precisely. In this paper, we propose a novel Cross-Granularity Fusion Network (CGFN), which mines subtle yet discriminative granularity features within each part and captures potential interactions among these features to build powerful part feature representations. The CGFN consists of two modules. First, the Multi-Granularity Proposal (MGP) module locates diverse and discriminative parts and focuses on context-complementary granularities across different hierarchies within each part. Second, the Cross-Granularity Fusion (CGF) module fuses the granularity features to obtain robust part features for the final classification. We conduct a series of experiments on three publicly available datasets, namely CUB-200-2011, Stanford Cars, and FGVC-Aircraft, and the experimental results demonstrate that the CGFN achieves state-of-the-art performance.
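The receptive-field argument in the abstract can be made concrete with a short calculation. The sketch below is not taken from the paper: it assumes a VGG-16-style stack of 3x3 convolutions and 2x2 poolings (the abstract does not name a backbone), and the names `VGG16_LIKE` and `receptive_field` are hypothetical. It applies the standard recurrence in which each layer grows the receptive field by `(kernel - 1) * jump` and multiplies the jump by its stride.

```python
from typing import List, Tuple

# (kernel_size, stride) per layer. ASSUMPTION: a VGG-16-style stack, chosen
# only for illustration; the abstract does not specify the backbone.
VGG16_LIKE: List[Tuple[int, int]] = (
    [(3, 1)] * 2 + [(2, 2)]    # block 1: two convs, one pool
    + [(3, 1)] * 2 + [(2, 2)]  # block 2
    + [(3, 1)] * 3 + [(2, 2)]  # block 3
    + [(3, 1)] * 3 + [(2, 2)]  # block 4
    + [(3, 1)] * 3 + [(2, 2)]  # block 5
)


def receptive_field(layers: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Return (receptive field size, jump) in input pixels after `layers`.

    `jump` is the distance in input pixels between the centers of two
    adjacent cells of the output feature map.
    """
    size, jump = 1, 1
    for kernel, stride in layers:
        size += (kernel - 1) * jump  # each layer widens the window
        jump *= stride               # striding spreads cells further apart
    return size, jump


if __name__ == "__main__":
    # Report how the window grows with depth (end of blocks 1, 3, and 5).
    for depth in (3, 10, len(VGG16_LIKE)):
        size, jump = receptive_field(VGG16_LIKE[:depth])
        print(f"after {depth:2d} layers: receptive field {size}px, jump {jump}px")
```

Under these assumptions, a single cell of the final feature map covers a 212 x 212 pixel window of the input, which is why a high response at that depth localizes a part only coarsely.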