On Measuring the Intrinsic Few-Shot Hardness of Datasets

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Pub Date: 2022-11-16. DOI: 10.48550/arXiv.2211.09113

Xinran Zhao, Shikhar Murty, Christopher D. Manning

Pages: 3955-3963

Citations: 3

Abstract

While advances in pre-training have led to dramatic improvements in few-shot learning of NLP tasks, there is limited understanding of what drives successful few-shot adaptation in datasets. In particular, given a new dataset and a pre-trained model, what properties of the dataset make it few-shot learnable, and are these properties independent of the specific adaptation techniques used? We consider an extensive set of recent few-shot learning methods and show that their performance across a large number of datasets is highly correlated, suggesting that few-shot hardness may be intrinsic to datasets for a given pre-trained model. To estimate intrinsic few-shot hardness, we then propose a simple and lightweight metric called Spread, which captures the intuition that few-shot learning is made possible by exploiting feature-space invariances between training and test samples. Our metric accounts for few-shot hardness better than existing notions of hardness and is roughly 8-100x faster to compute.
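The claim that different few-shot methods perform in a correlated way across datasets can be illustrated with a rank correlation over per-dataset scores. The sketch below is not the authors' analysis: the abstract does not say which correlation statistic is used, so Spearman's rank correlation is an assumption here, and the method names and accuracy values are made-up placeholders rather than results from the paper.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical placeholder accuracies, one entry per dataset.
# These numbers are invented for illustration only.
scores = {
    "method_a": [0.91, 0.62, 0.75, 0.55, 0.83],
    "method_b": [0.88, 0.60, 0.79, 0.51, 0.80],
    "method_c": [0.93, 0.70, 0.58, 0.57, 0.85],
}

# Correlation for every pair of methods.
pairwise = {
    (a, b): spearman(scores[a], scores[b])
    for a, b in combinations(scores, 2)
}

for (a, b), rho in pairwise.items():
    print(f"{a} vs {b}: rho = {rho:.2f}")
```

If most pairs of methods rank the datasets similarly, the coefficients sit close to 1, which is the pattern the abstract describes as evidence that hardness is a property of the dataset rather than of the adaptation method.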

