A Cross-layer Self-attention Learning Network for Fine-grained Classification

2023 3rd International Conference on Consumer Electronics and Computer Engineering (ICCECE). Pub Date: 2023-01-06. DOI: 10.1109/ICCECE58074.2023.10135230

Jianhua Chen, Songsen Yu, Junle Liang


Citations: 0

Abstract


A Cross-layer Self-attention Learning Network for Fine-grained Classification

Fine-grained image classification divides an already established basic category into finer sub-categories. It is a challenging research task because the data exhibit small inter-class differences and large intra-class differences. This paper proposes a cross-layer self-attention (CS) network that learns refined, discriminative image features across layers. The network consists of a backbone and a cross-layer self-attention module with three submodules: cross-layer channel attention, cross-layer spatial attention, and feature fusion. The cross-layer channel attention submodule derives a channel self-attention from the interaction between low-level and high-level layers of the convolutional network, then applies it to the low-level layer to obtain finer low-level features. The cross-layer spatial attention submodule works analogously and refines the low-level features in the spatial dimension. The feature fusion submodule combines the channel-refined and spatially refined features into the low-level features and fuses them with the high-level features. Experiments on three benchmark datasets show that the network, with a ResNet101 backbone, outperforms most mainstream models in classification accuracy.
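The abstract does not give the equations of the three submodules, so the following is only a minimal sketch of one plausible reading of the data flow, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names, global average pooling as the channel descriptor, sigmoid gating, nearest-neighbour upsampling of the high-level map, element-wise summation of the two refined low-level features, and the random stand-in for learned weights. Feature maps are plain nested lists shaped [channels][height][width].

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def global_avg_pool(feat):
    """[C][H][W] -> list of C per-channel means."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]


def channel_mean(feat):
    """[C][H][W] -> [H][W] map averaged over channels."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [[sum(feat[k][i][j] for k in range(c)) / c for j in range(w)]
            for i in range(h)]


def upsample_nearest(plane, out_h, out_w):
    """[h][w] -> [out_h][out_w] by nearest-neighbour lookup."""
    h, w = len(plane), len(plane[0])
    return [[plane[i * h // out_h][j * w // out_w] for j in range(out_w)]
            for i in range(out_h)]


def cross_layer_channel_attention(low, high, weights):
    """Gate each low-level channel with a descriptor built from both layers.

    weights is a hypothetical [C_low][C_low + C_high] projection matrix.
    """
    desc = global_avg_pool(low) + global_avg_pool(high)  # concatenation
    gates = [sigmoid(sum(w * d for w, d in zip(row, desc))) for row in weights]
    return [[[g * v for v in row] for row in ch] for g, ch in zip(gates, low)]


def cross_layer_spatial_attention(low, high):
    """Gate each low-level position with a mask built from both layers."""
    h, w = len(low[0]), len(low[0][0])
    low_map = channel_mean(low)
    high_map = upsample_nearest(channel_mean(high), h, w)
    mask = [[sigmoid(low_map[i][j] + high_map[i][j]) for j in range(w)]
            for i in range(h)]
    return [[[mask[i][j] * ch[i][j] for j in range(w)] for i in range(h)]
            for ch in low]


def fuse(low_channel, low_spatial, high):
    """Sum the two refined low-level features, then concatenate the pooled
    low-level vector with the pooled high-level vector."""
    low = [[[a + b for a, b in zip(ra, rb)] for ra, rb in zip(ca, cb)]
           for ca, cb in zip(low_channel, low_spatial)]
    return global_avg_pool(low) + global_avg_pool(high)


def make_feature(rng, c, h, w):
    """Random stand-in for a backbone feature map."""
    return [[[rng.uniform(-1.0, 1.0) for _ in range(w)] for _ in range(h)]
            for _ in range(c)]


rng = random.Random(0)
low = make_feature(rng, 2, 4, 4)    # shallow layer: few channels, high resolution
high = make_feature(rng, 3, 2, 2)   # deep layer: more channels, low resolution
weights = [[rng.uniform(-1.0, 1.0) for _ in range(2 + 3)] for _ in range(2)]

low_c = cross_layer_channel_attention(low, high, weights)
low_s = cross_layer_spatial_attention(low, high)
fused = fuse(low_c, low_s, high)    # vector a classifier head could consume
```

In this sketch both attention branches only rescale the low-level feature map, so its shape is preserved, and the fused vector has one entry per low-level channel plus one per high-level channel. A real model would replace the random weights with learned layers on top of a backbone such as ResNet101.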

