{"title":"Semi-supervised Meta-learning for Cross-domain Few-shot Intent Classification","authors":"Judith Yue Li, Jiong Zhang","doi":"10.18653/v1/2021.metanlp-1.8","DOIUrl":null,"url":null,"abstract":"Meta learning aims to optimize the model’s capability to generalize to new tasks and domains. Lacking a data-efficient way to create meta training tasks has prevented the application of meta-learning to the real-world few shot learning scenarios. Recent studies have proposed unsupervised approaches to create meta-training tasks from unlabeled data for free, e.g., the SMLMT method (Bansal et al., 2020a) constructs unsupervised multi-class classification tasks from the unlabeled text by randomly masking words in the sentence and let the meta learner choose which word to fill in the blank. This study proposes a semi-supervised meta-learning approach that incorporates both the representation power of large pre-trained language models and the generalization capability of prototypical networks enhanced by SMLMT. The semi-supervised meta training approach avoids overfitting prototypical networks on a small number of labeled training examples and quickly learns cross-domain task-specific representation only from a few supporting examples. By incorporating SMLMT with prototypical networks, the meta learner generalizes better to unseen domains and gains higher accuracy on out-of-scope examples without the heavy lifting of pre-training. We observe significant improvement in few-shot generalization after training only a few epochs on the intent classification tasks evaluated in a multi-domain setting.","PeriodicalId":171906,"journal":{"name":"Proceedings of the 1st Workshop on Meta Learning and Its Applications to Natural Language Processing","volume":"2 11","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 1st Workshop on Meta Learning and Its Applications to Natural Language Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/2021.metanlp-1.8","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Cited by: 9
Abstract
Meta-learning aims to optimize a model's capability to generalize to new tasks and domains. The lack of a data-efficient way to create meta-training tasks has prevented the application of meta-learning to real-world few-shot learning scenarios. Recent studies have proposed unsupervised approaches that create meta-training tasks from unlabeled data at no labeling cost; e.g., the SMLMT method (Bansal et al., 2020a) constructs unsupervised multi-class classification tasks from unlabeled text by randomly masking words in a sentence and letting the meta learner choose which word fills the blank. This study proposes a semi-supervised meta-learning approach that combines the representation power of large pre-trained language models with the generalization capability of prototypical networks enhanced by SMLMT. The semi-supervised meta-training approach avoids overfitting prototypical networks to a small number of labeled training examples and quickly learns cross-domain, task-specific representations from only a few support examples. By combining SMLMT with prototypical networks, the meta learner generalizes better to unseen domains and achieves higher accuracy on out-of-scope examples without the heavy lifting of pre-training. We observe significant improvement in few-shot generalization after training for only a few epochs on intent classification tasks evaluated in a multi-domain setting.
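To make the two ingredients described above concrete, the following Python sketch illustrates (a) SMLMT-style task construction, where N masked words define an N-way classification task over unlabeled sentences, and (b) prototypical classification, where each class is represented by the mean embedding of its support examples and a query is assigned to the nearest prototype. This is a minimal illustration, not the authors' implementation: the function names (build_smlmt_task, prototypical_predict) are hypothetical, and the toy hashed bag-of-words encoder stands in for the large pre-trained language model used in the paper.

```python
import random
import zlib
from collections import defaultdict

import numpy as np


def build_smlmt_task(sentences, n_way=3, k_shot=1, seed=0):
    """Build one N-way, K-shot SMLMT-style task: class i means 'the masked word was word_i'."""
    rng = random.Random(seed)
    by_word = defaultdict(list)
    for sent in sentences:
        for tok in set(sent.lower().split()):
            by_word[tok].append(sent.lower())
    # Only words occurring in enough sentences can supply both support and query examples.
    candidates = [w for w, sents in by_word.items() if len(sents) >= k_shot + 1]
    words = rng.sample(candidates, n_way)
    support, query = [], []
    for label, w in enumerate(words):
        picked = rng.sample(by_word[w], k_shot + 1)
        masked = [" ".join("[MASK]" if t == w else t for t in s.split()) for s in picked]
        support += [(m, label) for m in masked[:k_shot]]
        query += [(m, label) for m in masked[k_shot:]]
    return support, query


def encode(text, dim=64):
    """Toy sentence encoder (hashed bag-of-words); a pre-trained LM would supply embeddings in practice."""
    v = np.zeros(dim)
    for tok in text.split():
        v[zlib.crc32(tok.encode()) % dim] += 1.0
    return v / (np.linalg.norm(v) + 1e-8)


def prototypical_predict(support, query_text):
    """Classify a query by Euclidean distance to class prototypes (mean support embeddings)."""
    grouped = defaultdict(list)
    for text, label in support:
        grouped[label].append(encode(text))
    prototypes = {lbl: np.mean(vs, axis=0) for lbl, vs in grouped.items()}
    q = encode(query_text)
    return min(prototypes, key=lambda lbl: np.linalg.norm(q - prototypes[lbl]))


if __name__ == "__main__":
    # Unlabeled utterances stand in for the raw text used to generate meta-training tasks.
    corpus = [
        "please book a flight to boston tomorrow",
        "can you book a table for dinner",
        "play some jazz music in the kitchen",
        "play the latest album by that band",
        "what is the weather like in seattle",
        "tell me the weather forecast for today",
    ]
    support, query = build_smlmt_task(corpus, n_way=3, k_shot=1)
    for text, true_label in query:
        print(true_label, prototypical_predict(support, text), text)
```

In the paper's setting, tasks built this way would be interleaved with a small number of labeled intent-classification tasks, so the prototypical network meta-learns from both without overfitting to the scarce labels.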