Enhancing Environmental Robustness in Few-Shot Learning via Conditional Representation Learning

Qianyu Guo; Jingrong Wu; Tianxing Wu; Haofen Wang; Weifeng Ge; Wenqiang Zhang

IEEE Transactions on Image Processing, vol. 34, pp. 3489-3502 (Impact Factor: 13.7)
DOI: 10.1109/TIP.2025.3572762
Published: 2025-06-02
Available: https://ieeexplore.ieee.org/document/11018198/
Citations: 0

Abstract

Few-shot learning (FSL) has recently been extensively utilized to overcome the scarcity of training data in domain-specific visual recognition. In real-world scenarios, environmental factors such as complex backgrounds, varying lighting conditions, long-distance shooting, and moving targets often cause test images to exhibit numerous incomplete targets or noise disruptions. However, current research on evaluation datasets and methodologies has largely ignored the concept of “environmental robustness”, which refers to maintaining consistent performance in complex and diverse physical environments. This neglect has led to a notable decline in the performance of FSL models during practical testing compared to their training performance. To bridge this gap, we introduce a new real-world multi-domain few-shot learning (RD-FSL) benchmark, which includes four domains and six evaluation datasets. The test images in this benchmark feature various challenging elements, such as camouflaged objects, small targets, and blurriness. Our evaluation experiments reveal that existing methods struggle to utilize training images effectively to generate accurate feature representations for challenging test images. To address this problem, we propose a novel conditional representation learning network (CRLNet) that integrates the interactions between training and testing images as conditional information in their respective representation processes. The main goal is to reduce intra-class variance or enhance inter-class variance at the feature representation level. Finally, comparative experiments reveal that CRLNet surpasses the current state-of-the-art methods, achieving performance improvements ranging from 6.83% to 16.98% across diverse settings and backbones. The source code and dataset are available at https://github.com/guoqianyu-alberta/Conditional-Representation-Learning
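The abstract describes the core mechanism only at a high level: query (test) features are re-encoded using the support (training) images as conditional information, so that same-class features move closer together and different-class features move apart. A minimal way to sketch that idea is cross-attention from query features onto support features with a residual fusion. This is an illustrative sketch only, not the authors' CRLNet implementation: the function names, the scaled dot-product attention form, the temperature `tau`, and the residual add are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def condition_on_support(query_feats, support_feats, tau=1.0):
    """Re-encode query features conditioned on the support set.

    Scaled dot-product cross-attention routes support information to
    each query vector; a residual add fuses it back in. Shapes:
    query_feats (Q, d), support_feats (S, d) -> output (Q, d).
    """
    d = query_feats.shape[-1]
    attn = softmax(query_feats @ support_feats.T / (tau * np.sqrt(d)), axis=-1)
    context = attn @ support_feats   # support-conditioned context per query
    return query_feats + context     # residual fusion keeps the original signal

# Toy episode: 5-way 1-shot support set, 3 query images, 64-d features.
rng = np.random.default_rng(0)
support = rng.standard_normal((5, 64))
queries = rng.standard_normal((3, 64))
conditioned = condition_on_support(queries, support)
print(conditioned.shape)
```

Under this reading, a noisy or partially occluded query is no longer embedded in isolation: its representation is pulled toward whatever structure the clean support images share, which is one plausible way to realize the intra-class variance reduction the abstract claims.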