Gradient-guided channel masking for cross-domain few-shot learning
Siqi Hui, Sanping Zhou, Ye Deng, Yang Wu, Jinjun Wang
Knowledge-Based Systems, published 2024-10-09. DOI: 10.1016/j.knosys.2024.112548
https://www.sciencedirect.com/science/article/pii/S0950705124011821
Citations: 0
Abstract
Cross-Domain Few-Shot Learning (CD-FSL) addresses few-shot learning under a domain gap between source and target domains, enabling the transfer of knowledge from a source domain to a target domain with only limited labeled samples. Current approaches often incorporate an auxiliary target dataset containing a few labeled samples to enhance model generalization on specific target domains. However, we observe that many models retain a substantial number of channels that learn source-specific knowledge, extracting features that perform adequately on the source domain but generalize poorly to the target domain. This source-specific knowledge often compromises target-domain performance. To address this challenge, we introduce a novel framework, Gradient-Guided Channel Masking (GGCM), designed for CD-FSL to prevent model channels from acquiring too much source-specific knowledge. GGCM quantifies each channel's contribution to solving target tasks using gradients of the target loss and identifies channels with smaller gradients as source-specific. These channels are then masked during the forward propagation of source features to limit the learning of source-specific knowledge. Conversely, GGCM mutes the non-source-specific channels during the forward propagation of target features, forcing the model to depend on the source-specific channels and thereby enhancing their generalizability. Moreover, we propose a consistency loss that aligns the predictions made by the source-specific channels with those made by the entire model; this lets those channels learn from the generalizable knowledge contained in the non-source-specific channels and further improves their generalizability. Validated across multiple CD-FSL benchmark datasets, our framework achieves state-of-the-art performance and effectively suppresses the learning of source-specific knowledge.
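The masking step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes per-channel gradient magnitudes of the target loss are already available and that a fixed fraction of channels is treated as source-specific (the function name, the `source_specific_ratio` parameter, and the simple bottom-k rule are assumptions for illustration).

```python
import numpy as np

def ggcm_masks(channel_grads, source_specific_ratio=0.5):
    """Split channels into source-specific (small target-loss gradients)
    and generalizable (large gradients), returning complementary 0/1 masks.

    Hypothetical helper sketching the GGCM masking idea; the name, the
    ratio parameter, and the bottom-k selection rule are assumptions.
    """
    channel_grads = np.asarray(channel_grads, dtype=float)
    k = int(len(channel_grads) * source_specific_ratio)
    # Channels with the smallest gradient magnitudes contribute least to
    # the target task and are treated as source-specific.
    order = np.argsort(np.abs(channel_grads))
    source_specific = np.zeros(len(channel_grads), dtype=bool)
    source_specific[order[:k]] = True
    # Mask source-specific channels when forwarding source features ...
    source_mask = (~source_specific).astype(float)
    # ... and mask the remaining channels when forwarding target features,
    # forcing the source-specific channels to learn target-relevant cues.
    target_mask = source_specific.astype(float)
    return source_mask, target_mask

grads = [0.05, 0.9, 0.1, 1.2]          # per-channel |d target-loss / d channel|
src_mask, tgt_mask = ggcm_masks(grads)
# src_mask → [0., 1., 0., 1.]; tgt_mask → [1., 0., 1., 0.]
```

In a real model the masks would multiply feature maps channel-wise during the respective forward passes, and the gradient statistics would come from backpropagating the target-task loss.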
Journal introduction:
Knowledge-Based Systems is an international and interdisciplinary journal in artificial intelligence that publishes original, innovative, and creative research results in the field, focusing on systems built with knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computational techniques, to provide balanced coverage of theory and practical study, and to encourage the development and implementation of knowledge-based intelligent models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.