A Cross-layer Self-attention Learning Network for Fine-grained Classification

Jianhua Chen, Songsen Yu, Junle Liang
Published in: 2023 3rd International Conference on Consumer Electronics and Computer Engineering (ICCECE)
Publication date: 2023-01-06
DOI: 10.1109/ICCECE58074.2023.10135230 (https://doi.org/10.1109/ICCECE58074.2023.10135230)
Citations: 0

Abstract

Fine-grained image classification divides already established basic categories into finer sub-categories. It is a challenging research task because the data exhibit small inter-class differences and large intra-class differences. This paper proposes a cross-layer self-attention (CS) network that learns refined, discriminative image features across layers. The network consists of a backbone and a cross-layer self-attention module with three submodules: cross-layer channel attention, cross-layer spatial attention, and feature fusion. The cross-layer channel attention module derives a channel self-attention from the interaction between low-layer and high-layer information in the convolutional network, then applies that attention to the low-level layer to obtain finer low-level features. The cross-layer spatial attention module works similarly and yields finer low-level features in the spatial dimension. The feature fusion module fuses low-level features with high-level features, where the low-level features are obtained by combining the channel and spatial features. Experiments on three benchmark datasets show that the network, built on a ResNet101 backbone, outperforms most mainstream models in classification accuracy.
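The abstract gives no equations, so the following is only a rough NumPy sketch of how the three submodules could fit together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the scaled dot-product form of both attentions, the residual additions, nearest-neighbour upsampling of the high-level map, the hypothetical projection `w_high` standing in for a learned 1x1 convolution, and fusion by channel concatenation.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def upsample_nearest(x, factor):
    # (C, H, W) -> (C, H*factor, W*factor); assumed way to align resolutions.
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def cross_layer_channel_attention(low, high):
    # low: (C_l, H, W), high: (C_h, H, W) with the same spatial size.
    c_l, h, w = low.shape
    q = low.reshape(c_l, -1)                # (C_l, HW): low-level channels
    k = high.reshape(high.shape[0], -1)     # (C_h, HW): high-level channels
    # Affinity of every low-level channel to every high-level channel.
    attn = softmax(q @ k.T / np.sqrt(h * w), axis=1)   # (C_l, C_h)
    # Assumed residual form: add the attended high-level content to low.
    return low + (attn @ k).reshape(c_l, h, w)

def cross_layer_spatial_attention(low, high, w_high):
    # w_high: (C_l, C_h), hypothetical stand-in for a learned 1x1 convolution.
    c_l, h, w = low.shape
    q = low.reshape(c_l, -1)                        # (C_l, HW)
    k = w_high @ high.reshape(high.shape[0], -1)    # (C_l, HW)
    # Affinity of every low-level position to every high-level position.
    attn = softmax(q.T @ k / np.sqrt(c_l), axis=1)  # (HW, HW)
    # Re-weight low-level features over positions, then add residually.
    return low + (q @ attn.T).reshape(c_l, h, w)

def cs_module(low, high, w_high, factor):
    # Combine channel and spatial branches, then fuse with high-level features.
    high_up = upsample_nearest(high, factor)
    low_refined = (cross_layer_channel_attention(low, high_up)
                   + cross_layer_spatial_attention(low, high_up, w_high))
    return np.concatenate([low_refined, high_up], axis=0)

rng = np.random.default_rng(0)
low = rng.standard_normal((4, 8, 8))     # toy low-level feature map
high = rng.standard_normal((6, 4, 4))    # toy high-level feature map
w_high = rng.standard_normal((4, 6))     # toy projection weights
fused = cs_module(low, high, w_high, factor=2)
print(fused.shape)  # (10, 8, 8)
```

In a real network the fused map would feed a pooling layer and a classifier head on top of the backbone, and every projection would be learned rather than random as it is in this toy example.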