Image Multi-Label Classification Based on Pyramid Convolution and Split-Attention Mechanism

Yang Xianhua, Yang Yi, Yang Juan, Yao Han, Wang Zheng, Long Shuquan
{"title":"Image Multi-Label Classification Based on Pyramid Convolution and Split-Attention Mechanism","authors":"Yang Xianhua, Yang Yi, Yang Juan, Yao Han, Wang Zheng, Long Shuquan","doi":"10.1109/ICCWAMTIP53232.2021.9674123","DOIUrl":null,"url":null,"abstract":"Image multi-label classification is a critical task in the field of computer vision. The primary difficulty is that multi-label classification relies on the complex information in the image to differentiate different labels, significantly increasing the classification difficulty. We proposed a method for modifying previous models. First, we use TResNet as the benchmark model, replacing ordinary convolution with pyramid convolution in the original model and the attention mechanism in the model with the split-attention method. Then the model was trained on the VOC2007 and MS-COCO data sets. The process of selecting the model's parameters and determining the optimal modification method was demonstrated through comparative experiments. Finally, by comparing the performance of the modified model with the performance of the unmodified model, it is proved that our two modification methods can effectively improve the performance of the model. 
On the VOC data set, the modified model by the two methods increased by 1% and 1.6%, respectively.","PeriodicalId":358772,"journal":{"name":"2021 18th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP)","volume":"74 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 18th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCWAMTIP53232.2021.9674123","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Image multi-label classification is a critical task in computer vision. Its primary difficulty is that the classifier must exploit complex information in the image to distinguish between different labels, which significantly increases the difficulty of the task. We propose a method for modifying existing models. First, we take TResNet as the baseline model, replacing the ordinary convolutions in the original model with pyramid convolutions, and the model's attention mechanism with the split-attention method. The model was then trained on the VOC2007 and MS-COCO data sets. Comparative experiments demonstrate how the model's parameters were selected and how the optimal modification was determined. Finally, comparing the performance of the modified model against that of the unmodified model shows that both modifications effectively improve the model's performance. On the VOC data set, the two modifications improved performance by 1% and 1.6%, respectively.
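To make the two modifications concrete, below is a minimal NumPy sketch of the underlying ideas, not the paper's TResNet implementation: pyramid convolution applies filters of several kernel sizes (3, 5, 7 here, an illustrative assumption) to the same input and stacks the results, while split-attention fuses R parallel feature splits with per-channel softmax weights. The random filter and projection weights stand in for learned parameters.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 2D convolution of a single-channel map x with odd kernel k, 'same' zero padding."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    H, W = x.shape
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def pyramid_conv(x, kernel_sizes=(3, 5, 7)):
    """One output level per kernel size, stacked along a new channel axis.
    Random kernels stand in for learned filters (illustrative assumption)."""
    rng = np.random.default_rng(1)
    levels = [conv2d_same(x, rng.standard_normal((k, k)) / k) for k in kernel_sizes]
    return np.stack(levels)  # (len(kernel_sizes), H, W)

def split_attention(splits):
    """Fuse R feature splits, each of shape (C, H, W), with channel-wise
    softmax attention over the split axis (the core of a split-attention block)."""
    R = len(splits)
    C = splits[0].shape[0]
    fused = np.sum(splits, axis=0)            # element-wise sum of splits: (C, H, W)
    s = fused.mean(axis=(1, 2))               # global average pooling: (C,)
    rng = np.random.default_rng(0)
    W = rng.standard_normal((R, C, C)) * 0.1  # stand-in for the learned FC layers
    logits = np.stack([Wr @ s for Wr in W])   # per-split, per-channel logits: (R, C)
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    attn = e / e.sum(axis=0, keepdims=True)   # softmax over splits; columns sum to 1
    out = sum(a[:, None, None] * x for a, x in zip(attn, splits))
    return out, attn
```

In the paper's setting these blocks replace, respectively, the ordinary convolutions and the original attention mechanism inside TResNet's residual stages; the sketch only shows the tensor-level mechanics of each operation.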