{"title":"Dual adaptive local semantic alignment for few-shot fine-grained classification","authors":"Wei Song, Kaili Yang","doi":"10.1007/s00371-024-03576-z","DOIUrl":null,"url":null,"abstract":"<p>Few-shot fine-grained classification (FS-FGC) aims to learn discriminative semantic details (e.g., beaks and wings) with few labeled samples to precisely recognize novel classes. However, existing feature alignment methods mainly use a support set to align the query sample, which may lead to incorrect alignment of local semantic due to interference from background and non-target objects. In addition, these methods do not take into account the discrepancy of semantic information among channels. To address the above issues, we propose an effective dual adaptive local semantic alignment approach, which is composed of the channel semantic alignment module (CSAM) and the spatial semantic alignment module (SSAM). Specifically, CSAM adaptively generates channel weights to highlight discriminative information based on two sub-modules, namely the class-aware attention module and the target-aware attention module. CAM emphasizes the discriminative semantic details of each category in the support set and TAM enhances the target object region of the query image. On the basis of this, SSAM promotes effective alignment of semantically relevant local regions through a spatial bidirectional alignment strategy. Combining two adaptive modules to better capture fine-grained semantic contextual information along two dimensions, channel and spatial improves the accuracy and robustness of FS-FGC. Experimental results on three widely used fine-grained classification datasets demonstrate excellent performance that has significant competitive advantages over current mainstream methods. 
Codes are available at: https://github.com/kellyagya/DALSA.</p>","PeriodicalId":501186,"journal":{"name":"The Visual Computer","volume":"24 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Visual Computer","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s00371-024-03576-z","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Few-shot fine-grained classification (FS-FGC) aims to learn discriminative semantic details (e.g., beaks and wings) from few labeled samples in order to precisely recognize novel classes. However, existing feature alignment methods mainly use the support set to align the query sample, which may lead to incorrect alignment of local semantics due to interference from background and non-target objects. In addition, these methods do not account for the discrepancy of semantic information among channels. To address these issues, we propose an effective dual adaptive local semantic alignment approach composed of a channel semantic alignment module (CSAM) and a spatial semantic alignment module (SSAM). Specifically, CSAM adaptively generates channel weights to highlight discriminative information through two sub-modules, namely the class-aware attention module (CAM) and the target-aware attention module (TAM). CAM emphasizes the discriminative semantic details of each category in the support set, while TAM enhances the target object region of the query image. Building on this, SSAM promotes effective alignment of semantically relevant local regions through a spatial bidirectional alignment strategy. Combining the two adaptive modules to capture fine-grained semantic context along both the channel and spatial dimensions improves the accuracy and robustness of FS-FGC. Experimental results on three widely used fine-grained classification datasets demonstrate that the proposed approach achieves competitive performance compared with current mainstream methods. Code is available at: https://github.com/kellyagya/DALSA.
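The two-stage pipeline the abstract describes (channel reweighting, then bidirectional spatial matching of local regions) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: DALSA's CAM/TAM weights are produced by learned attention networks, whereas the mean-activation softmax weights and cosine matching below are simplifying assumptions used only to show the data flow.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_reweight(feat):
    """Adaptively reweight channels (stand-in for the learned class-/target-aware
    attention in CSAM). feat: (C, H, W) feature map."""
    c = feat.shape[0]
    # use each channel's mean spatial activation as its importance score
    weights = softmax(feat.reshape(c, -1).mean(axis=1))  # (C,)
    return feat * weights[:, None, None]

def bidirectional_similarity(support, query):
    """Spatial bidirectional alignment score: each query location is matched to
    its most similar support location and vice versa, and the two directions
    are averaged. support, query: (C, H, W) feature maps."""
    s = support.reshape(support.shape[0], -1).T  # (Hs*Ws, C) local descriptors
    q = query.reshape(query.shape[0], -1).T      # (Hq*Wq, C)
    s = s / (np.linalg.norm(s, axis=1, keepdims=True) + 1e-8)
    q = q / (np.linalg.norm(q, axis=1, keepdims=True) + 1e-8)
    sim = q @ s.T                                # pairwise cosine similarities
    q2s = sim.max(axis=1).mean()                 # query -> support direction
    s2q = sim.max(axis=0).mean()                 # support -> query direction
    return 0.5 * (q2s + s2q)

# toy usage: identical feature maps align perfectly (score ~1.0)
rng = np.random.default_rng(0)
f = rng.normal(size=(8, 5, 5))
score = bidirectional_similarity(channel_reweight(f), channel_reweight(f))
```

Averaging the two matching directions is what makes the strategy "bidirectional": a query region dominated by background can score highly against some support region in one direction, but the reverse direction penalizes it, which is the intuition behind aligning semantically relevant local regions.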