Enhancing Environmental Robustness in Few-Shot Learning via Conditional Representation Learning

Qianyu Guo; Jingrong Wu; Tianxing Wu; Haofen Wang; Weifeng Ge; Wenqiang Zhang

IEEE Transactions on Image Processing, vol. 34, pp. 3489-3502 (Impact Factor: 13.7)
DOI: 10.1109/TIP.2025.3572762
Published: 2025-06-02
Available: https://ieeexplore.ieee.org/document/11018198/
Citations: 0

Abstract

Few-shot learning (FSL) has recently been extensively utilized to overcome the scarcity of training data in domain-specific visual recognition. In real-world scenarios, environmental factors such as complex backgrounds, varying lighting conditions, long-distance shooting, and moving targets often cause test images to exhibit numerous incomplete targets or noise disruptions. However, current research on evaluation datasets and methodologies has largely ignored the concept of “environmental robustness”, which refers to maintaining consistent performance in complex and diverse physical environments. This neglect has led to a notable decline in the performance of FSL models during practical testing compared to their training performance. To bridge this gap, we introduce a new real-world multi-domain few-shot learning (RD-FSL) benchmark, which includes four domains and six evaluation datasets. The test images in this benchmark feature various challenging elements, such as camouflaged objects, small targets, and blurriness. Our evaluation experiments reveal that existing methods struggle to utilize training images effectively to generate accurate feature representations for challenging test images. To address this problem, we propose a novel conditional representation learning network (CRLNet) that integrates the interactions between training and testing images as conditional information in their respective representation processes. The main goal is to reduce intra-class variance or enhance inter-class variance at the feature representation level. Finally, comparative experiments reveal that CRLNet surpasses the current state-of-the-art methods, achieving performance improvements ranging from 6.83% to 16.98% across diverse settings and backbones. The source code and dataset are available at https://github.com/guoqianyu-alberta/Conditional-Representation-Learning
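The abstract describes the core mechanism only at a high level: query (test) features are re-encoded using the support (training) images as conditional information, so that same-class features move closer together and different-class features move apart. A minimal way to sketch that idea is cross-attention from query features onto support features with a residual fusion. This is an illustrative sketch only, not the authors' CRLNet implementation: the function names, the scaled dot-product attention form, the temperature `tau`, and the residual add are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def condition_on_support(query_feats, support_feats, tau=1.0):
    """Re-encode query features conditioned on the support set.

    Scaled dot-product cross-attention routes support information to
    each query vector; a residual add fuses it back in. Shapes:
    query_feats (Q, d), support_feats (S, d) -> output (Q, d).
    """
    d = query_feats.shape[-1]
    attn = softmax(query_feats @ support_feats.T / (tau * np.sqrt(d)), axis=-1)
    context = attn @ support_feats   # support-conditioned context per query
    return query_feats + context     # residual fusion keeps the original signal

# Toy episode: 5-way 1-shot support set, 3 query images, 64-d features.
rng = np.random.default_rng(0)
support = rng.standard_normal((5, 64))
queries = rng.standard_normal((3, 64))
conditioned = condition_on_support(queries, support)
print(conditioned.shape)
```

Under this reading, a noisy or partially occluded query is no longer embedded in isolation: its representation is pulled toward whatever structure the clean support images share, which is one plausible way to realize the intra-class variance reduction the abstract claims.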