Vision transformer-based generalized zero-shot learning with data criticizing
Quan Zhou, Yucuan Liang, Zhenqi Zhang, Wenming Cao
Applied Intelligence, vol. 55, no. 6, published 2025-02-07 (impact factor 3.5; JCR Q2, Computer Science, Artificial Intelligence)
DOI: 10.1007/s10489-025-06271-1
https://link.springer.com/article/10.1007/s10489-025-06271-1
Citations: 0
Abstract
Generalized Zero-Shot Learning (GZSL) aims to recognize unseen classes accurately at test time by training on data from seen classes and leveraging attribute knowledge. However, a model trained solely on seen-class data tends to be biased towards the visual features of seen classes, which degrades recognition performance on unseen classes. To address this issue, we propose Vision Transformer-Based Generalized Zero-Shot Learning with Data Criticizing (ViT-DaCr). To obtain better visual features, we thoroughly examine the features extracted by a Vision Transformer (ViT) with a new design. We also observe that not all training data align with the model during training; such data push the model further towards the visual features of seen classes and directly impair visual feature recognition. We therefore propose a data critic mechanism that uses the Adjusted Boxplot to filter out such data automatically during training. Extensive experiments on three challenging and popular datasets demonstrate the advanced performance of our model.
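As a rough illustration of the adjusted boxplot rule named in the abstract, the sketch below flags outliers in a one-dimensional array of per-sample scores. It is not the authors' implementation: the abstract does not say which quantity the data critic inspects, so the use of per-sample scores (for example, training losses) is an assumption, and the function names `medcouple_naive`, `adjusted_boxplot_fences`, and `keep_mask` are hypothetical. The fences follow the standard skewness-adjusted boxplot, which scales the usual 1.5 × IQR whiskers by exponentials of the medcouple (MC), a robust skewness measure. The medcouple here is a naive O(n²) version with simplified handling of ties at the median.

```python
import numpy as np


def medcouple_naive(x):
    """Naive O(n^2) medcouple, a robust skewness measure in [-1, 1].

    Simplification (assumption): pairs where both values equal the median
    are dropped rather than handled with the special tie kernel.
    """
    x = np.asarray(x, dtype=float)
    m = np.median(x)
    lo = x[x <= m]  # values at or below the median
    hi = x[x >= m]  # values at or above the median
    # Kernel h(xi, xj) = ((xj - m) - (m - xi)) / (xj - xi) for xi <= m <= xj
    diff = hi[:, None] - lo[None, :]
    num = (hi[:, None] - m) - (m - lo[None, :])
    valid = diff > 0
    if not valid.any():
        return 0.0
    return float(np.median(num[valid] / diff[valid]))


def adjusted_boxplot_fences(x):
    """Lower and upper fences of the skewness-adjusted boxplot."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    mc = medcouple_naive(x)
    if mc >= 0:
        lower = q1 - 1.5 * np.exp(-4.0 * mc) * iqr
        upper = q3 + 1.5 * np.exp(3.0 * mc) * iqr
    else:
        lower = q1 - 1.5 * np.exp(-3.0 * mc) * iqr
        upper = q3 + 1.5 * np.exp(4.0 * mc) * iqr
    return lower, upper


def keep_mask(scores):
    """Boolean mask: True for samples inside the fences (kept for training)."""
    scores = np.asarray(scores, dtype=float)
    lower, upper = adjusted_boxplot_fences(scores)
    return (scores >= lower) & (scores <= upper)


# Hypothetical per-sample scores; the last value is an obvious outlier.
scores = np.array([0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 50.0])
mask = keep_mask(scores)
filtered = scores[mask]
```

In a training loop, a mask like this could be recomputed each epoch so that filtering tracks the current model, though whether ViT-DaCr does so is not stated in the abstract.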
About the journal:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.