Gradient-guided channel masking for cross-domain few-shot learning

IF 7.2 | Zone 1, Computer Science | JCR Q1, Computer Science, Artificial Intelligence
Siqi Hui, Sanping Zhou, Ye Deng, Yang Wu, Jinjun Wang
Knowledge-Based Systems, published 2024-10-09. DOI: 10.1016/j.knosys.2024.112548. URL: https://www.sciencedirect.com/science/article/pii/S0950705124011821
Citations: 0

Abstract

Cross-Domain Few-Shot Learning (CD-FSL) addresses few-shot learning under a domain gap between source and target domains, transferring knowledge from a source domain to a target domain that has only limited labeled samples. Current approaches often incorporate an auxiliary target dataset containing a few labeled samples to improve generalization on specific target domains. However, we observe that many models retain a substantial number of channels that learn source-specific knowledge: they extract features that perform adequately on the source domain but generalize poorly to the target domain, which often compromises performance. To address this challenge, we introduce Gradient-Guided Channel Masking (GGCM), a framework for CD-FSL that keeps model channels from acquiring too much source-specific knowledge. GGCM quantifies each channel's contribution to solving target tasks using gradients of the target loss and identifies channels with smaller gradients as source-specific. These channels are masked during the forward propagation of source features to limit the learning of source-specific knowledge. Conversely, GGCM masks the non-source-specific channels during the forward propagation of target features, forcing the model to depend on the source-specific channels and thereby improving their generalizability. Moreover, we propose a consistency loss that aligns the predictions made by the source-specific channels with those made by the entire model, which further improves the generalizability of these channels by letting them learn from the generalizable knowledge in the non-source-specific channels. Validated on multiple CD-FSL benchmark datasets, our framework achieves state-of-the-art performance and effectively suppresses the learning of source-specific knowledge.
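The masking scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a linear classifier head over channel features, a softmax cross-entropy target loss, the mean absolute gradient as the channel score, a fixed masking ratio, and a KL divergence as the consistency loss. The abstract specifies none of these details, and all function names are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_from(features, weights):
    # Assumed linear head: weights has one row of channel weights per class.
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def channel_scores(target_feats, target_labels, weights):
    """Mean |d loss / d feature_c| over target samples (cross-entropy loss)."""
    num_channels = len(weights[0])
    scores = [0.0] * num_channels
    for feats, label in zip(target_feats, target_labels):
        probs = softmax(logits_from(feats, weights))
        # Gradient of cross-entropy w.r.t. each logit: p_k - 1[k == label].
        dlogits = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
        for c in range(num_channels):
            grad_c = sum(d * row[c] for d, row in zip(dlogits, weights))
            scores[c] += abs(grad_c) / len(target_feats)
    return scores

def source_specific_mask(scores, ratio):
    """Return 1 for the channels with the smallest target gradients, else 0."""
    k = int(len(scores) * ratio)
    lowest = sorted(range(len(scores)), key=lambda c: scores[c])[:k]
    return [1 if c in lowest else 0 for c in range(len(scores))]

def apply_mask(features, keep):
    # keep[c] == 1 passes channel c through; keep[c] == 0 zeroes it.
    return [f * k for f, k in zip(features, keep)]

def consistency_loss(feats, mask, weights):
    """KL(full-model prediction || prediction from source-specific channels only)."""
    full = softmax(logits_from(feats, weights))
    part = softmax(logits_from(apply_mask(feats, mask), weights))
    return sum(p * math.log(p / q) for p, q in zip(full, part) if p > 0)

# Toy data: 2 classes, 4 channels; channels 1 and 3 barely affect the target loss.
weights = [[2.0, 0.0, 1.0, 0.1],
           [-2.0, 0.0, -1.0, 0.1]]
target_feats = [[1.0, 3.0, 0.5, 2.0], [-1.0, 2.0, -0.5, 1.0]]
target_labels = [0, 1]

scores = channel_scores(target_feats, target_labels, weights)
mask = source_specific_mask(scores, ratio=0.5)

# Source forward pass: zero the source-specific channels.
source_out = apply_mask([1.0, 2.0, 3.0, 4.0], [1 - m for m in mask])
# Target forward pass: zero the non-source-specific channels instead.
target_out = apply_mask(target_feats[0], mask)
loss = consistency_loss(target_feats[0], mask, weights)
```

In this toy setup the two channels whose weights do not separate the classes receive the smallest target gradients and are flagged as source-specific. A real model would obtain the gradients by automatic differentiation on intermediate feature maps rather than from a closed-form expression.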
Source journal
Knowledge-Based Systems (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 14.80
Self-citation rate: 12.50%
Articles published: 1245
Review time: 7.8 months
Journal description: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.