Qin Xu, Sitong Li, Jiahui Wang, Bo Jiang, Bin Luo, Jinhui Tang
Pattern Recognition, Volume 170, Article 112033. Published 2025-07-01. DOI: 10.1016/j.patcog.2025.112033
https://www.sciencedirect.com/science/article/pii/S0031320325006934
Journal impact factor: 7.6 (JCR Q1, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Context-Semantic Quality Awareness Network for fine-grained visual categorization
Exploring and mining subtle yet distinctive features between sub-categories with similar appearances is crucial for fine-grained visual categorization (FGVC). However, the existing FGVC methods cannot mine discriminative features from low-quality samples, leading to a significant decline in performance. To address this issue, we propose a weakly supervised Context-Semantic Quality Awareness Network (CSQA-Net) for FGVC. Specifically, to assess and enhance the quality of multi-granularity visual representations, we propose the Multi-level Semantic Quality Evaluation (MSQE) module, composed of the Quality Probing (QP) classifier. To alleviate the scale confusion problems and accurately identify the local distinctive regions, the part navigator is developed. Moreover, the Multi-part and Multi-scale Cross-Attention (MMCA) module is designed to model the spatial contextual relationship between rich part descriptors and global semantics, thus capturing more discriminative details within the object. Finally, the context-aware features from MMCA and semantically enhanced features from MSQE are fed into the corresponding QP classifiers to evaluate the quality in real time, further boosting the discriminability. Comprehensive experiments on four popular and highly competitive datasets demonstrate the superiority of the proposed CSQA-Net in comparison with the state-of-the-art methods. Code is available at https://github.com/zmisiter/CSQA-Net.
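The abstract describes MMCA as cross-attention between part descriptors and global semantics. As a rough illustration of that general idea only, the sketch below implements plain single-head scaled dot-product cross-attention in NumPy. It is not the authors' implementation (see the linked repository for that). The function name `cross_attention`, the choice of part descriptors as queries and global tokens as keys and values, the random projection matrices, and all shapes are assumptions made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(parts, global_tokens, w_q, w_k, w_v):
    """Generic single-head cross-attention (hypothetical helper, not from the paper).

    parts:         (P, c) part descriptors, used here as queries (an assumption)
    global_tokens: (N, c) global feature tokens, used as keys and values
    w_q, w_k, w_v: (c, d) projection matrices
    """
    q = parts @ w_q                           # (P, d)
    k = global_tokens @ w_k                   # (N, d)
    v = global_tokens @ w_v                   # (N, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (P, N) scaled similarities
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights               # (P, d) context-aware part features

# Illustrative shapes only: 4 parts, 49 global tokens (e.g. a 7x7 feature map).
rng = np.random.default_rng(0)
P, N, c, d = 4, 49, 32, 16
parts = rng.standard_normal((P, c))
global_tokens = rng.standard_normal((N, c))
w_q, w_k, w_v = (rng.standard_normal((c, d)) / np.sqrt(c) for _ in range(3))

context, weights = cross_attention(parts, global_tokens, w_q, w_k, w_v)
```

Each row of `weights` shows how strongly one part descriptor attends to each global token, and `context` holds the resulting context-aware part features. A multi-scale variant would repeat this over feature maps of several resolutions; how the paper combines them is not stated in the abstract.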
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.