Multi-branch selection fusion fine-grained classification algorithm based on coordinate attention localization
Authors: Feng Zhang, Gaocai Wang, Man Wu, Shuqiang Huang
Journal: AI Communications (JCR category: Computer Science, Artificial Intelligence; Q4; impact factor 1.4)
Publication date: 2023-08-21
Publication type: Journal Article
DOI: 10.3233/aic-220187 (https://doi.org/10.3233/aic-220187)
Citation count: 0
Abstract
Object localization has been the focus of research in Fine-Grained Visual Categorization (FGVC). To improve the accuracy and precision of object localization in multi-branch networks, as well as the robustness and generality of object localization methods, our study mainly focuses on how to combine coordinate attention with feature activation maps for target localization. The model in this paper has three branches: a raw branch, an object branch, and a part branch. Images are fed directly into the raw branch. The Coordinate Attention Object Localization Module (CAOLM) localizes and crops the object in the image to generate the input for the object branch. The Attention Partial Proposal Module (APPM) proposes part regions at different scales. The three kinds of input images undergo end-to-end weakly supervised learning through the different branches of the network. The model expands the receptive field to capture multi-scale features with the Selective Branch Atrous Spatial Pooling Pyramid (SB-ASPP). It fuses the feature maps obtained from the raw branch and the object branch with the Selective Branch Block (SBBlock), so that the complete features of the raw branch supplement the information missing from the object branch. Extensive experiments on the CUB-200-2011, FGVC-Aircraft, and Stanford Cars datasets show that our method achieves the best classification performance on FGVC-Aircraft and competitive performance on the other datasets. The model also has few parameters and fast inference speed.
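The abstract does not spell out how CAOLM turns an activation map into a crop, so the following is only a minimal sketch of the general idea of localizing an object from a feature activation map. It assumes a mean-value threshold and a single bounding box around all above-threshold positions; the function name `localize_object`, the threshold rule, and the array shapes are illustrative assumptions, and the coordinate attention step is not modeled here.

```python
import numpy as np


def localize_object(feature_map, image_size):
    """Return (top, left, bottom, right) of a crop box in image pixels.

    feature_map: array of shape (C, H, W), e.g. the last convolutional output.
    image_size:  (height, width) of the input image.
    Hypothetical helper: the thresholding rule is an assumption, not the
    paper's stated procedure.
    """
    # Sum over channels to get one activation value per spatial position.
    activation = feature_map.sum(axis=0)

    # Assumed rule: positions above the mean activation belong to the object.
    mask = activation > activation.mean()

    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        # Flat activation map: fall back to the whole image.
        return 0, 0, image_size[0], image_size[1]

    # Map feature-map coordinates back to image coordinates.
    height, width = activation.shape
    scale_y = image_size[0] / height
    scale_x = image_size[1] / width
    top = int(rows[0] * scale_y)
    bottom = int((rows[-1] + 1) * scale_y)
    left = int(cols[0] * scale_x)
    right = int((cols[-1] + 1) * scale_x)
    return top, left, bottom, right
```

Under these assumptions, the input to the object branch would be the slice `image[top:bottom, left:right]`, resized to the network's input resolution.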
About the journal:
AI Communications is a journal on artificial intelligence (AI) which has a close relationship to EurAI (European Association for Artificial Intelligence, formerly ECCAI). It covers the whole AI community: Scientific institutions as well as commercial and industrial companies.
AI Communications aims to enhance contacts and information exchange between AI researchers and developers, and to provide supranational information to those concerned with AI and advanced information processing. AI Communications publishes refereed articles concerning scientific and technical AI procedures, provided they are of sufficient interest to a large readership of both scientific and practical background. In addition, it contains high-level background material, both at the technical level and at the level of opinions, policies and news.