Semantic guided level-category hybrid prediction network for hierarchical image classification

IF 0.9 · Zone 4 (Computer Science) · Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING
Peng Wang, Jingzhou Chen, Yuntao Qian
Journal: International Journal of Wavelets Multiresolution and Information Processing
Published: 2023-05-20
DOI: 10.1142/s0219691323500236 (https://doi.org/10.1142/s0219691323500236)
Citations: 0

Abstract

Hierarchical classification (HC) assigns each object multiple labels organized into a hierarchical structure. Existing deep learning-based HC methods usually predict an instance's labels starting from the root node and continuing until a leaf node is reached. However, in the real world, images impaired by noise, occlusion, blur, or low resolution may not provide sufficient information for classification at subordinate levels. To address this issue, we propose a novel Semantic Guided level-category Hybrid Prediction Network (SGHPN) that jointly performs level and category prediction in an end-to-end manner. SGHPN comprises two modules: a visual transformer that extracts feature vectors from the input images, and a semantic guided cross-attention module that uses category word embeddings as queries to guide the learning of category-specific representations. To evaluate the proposed method, we construct two new datasets in which images span a broad range of quality and are therefore labeled to different levels (depths) in the hierarchy according to their individual quality. Experimental results demonstrate the effectiveness of our proposed HC method.
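The two ideas named in the abstract, word embeddings acting as attention queries and a level prediction that decides how deep to label, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the single-head unprojected attention, the linear heads `w_cat` and `w_level`, and the per-level `level_slices` grouping are all assumptions made for illustration, and the sketch does not enforce that the chosen categories form a valid parent-child path.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def semantic_cross_attention(word_emb, patch_feats):
    """Hypothetical single-head cross-attention.

    word_emb:    (C, d) category word embeddings, used as queries.
    patch_feats: (N, d) image features, used as keys and values.
    Returns (C, d) category-specific representations and (C, N) weights.
    """
    d = word_emb.shape[-1]
    scores = word_emb @ patch_feats.T / np.sqrt(d)  # (C, N)
    attn = softmax(scores, axis=-1)                 # each row sums to 1
    return attn @ patch_feats, attn

def hybrid_predict(cat_repr, w_cat, level_feat, w_level, level_slices):
    """Hypothetical joint level and category prediction.

    cat_repr:     (C, d) category-specific representations.
    w_cat:        (d,)   assumed shared linear scoring vector.
    level_feat:   (d,)   assumed global image feature.
    w_level:      (d, L) assumed linear level head.
    level_slices: list of L slices grouping category indices by level.
    Returns the predicted depth (1-based) and one category per level
    down to that depth.
    """
    cat_logits = cat_repr @ w_cat                        # (C,)
    level = int(np.argmax(level_feat @ w_level)) + 1     # predicted depth
    path = [s.start + int(np.argmax(cat_logits[s]))
            for s in level_slices[:level]]               # stop at that depth
    return level, path

# Toy usage with random data: 5 categories (2 at level 1, 3 at level 2).
rng = np.random.default_rng(0)
C, N, d, L = 5, 16, 8, 2
word_emb = rng.normal(size=(C, d))
patch_feats = rng.normal(size=(N, d))
cat_repr, attn = semantic_cross_attention(word_emb, patch_feats)
level, path = hybrid_predict(
    cat_repr,
    rng.normal(size=d),
    patch_feats.mean(axis=0),
    rng.normal(size=(d, L)),
    [slice(0, 2), slice(2, 5)],
)
```

The design point the sketch tries to convey is that a low-quality image can receive a shallow predicted level, so its label path stops at a coarse category instead of being forced down to a leaf.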
Source journal metrics
CiteScore: 2.60
Self-citation rate: 7.10%
Articles published: 52
Review time: 2.7 months
About the journal: International Journal of Wavelets, Multiresolution and Information Processing (hereafter referred to as IJWMIP) is a bi-monthly publication for theoretical and applied papers on current state-of-the-art results in wavelet analysis, multiresolution and information processing. Papers related to the IJWMIP theme are especially solicited, including theories, methodologies, algorithms and emerging applications. Topics of interest of the IJWMIP include, but are not limited to:
1. Wavelets: Wavelets and operator theory; Frame and applications; Time-frequency analysis and applications; Sparse representation and approximation; Sampling theory and compressive sensing; Wavelet based algorithms and applications.
2. Multiresolution: Multiresolution analysis; Multiscale approximation; Multiresolution image processing and signal processing; Multiresolution representations; Deep learning and neural networks; Machine learning theory, algorithms and applications; High dimensional data analysis.
3. Information Processing: Data sciences; Big data and applications; Information theory; Information systems and technology; Information security; Information learning and processing; Artificial intelligence and pattern recognition; Image/signal processing.