Understanding Episode Hardness in Few-Shot Learning

Yurong Guo;Ruoyi Du;Aneeshan Sain;Kongming Liang;Yuan Dong;Yi-Zhe Song;Zhanyu Ma
DOI: 10.1109/TPAMI.2024.3476075
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 1, pp. 616-633
Publication date: 2024-10-08 (Journal Article)
URL: https://ieeexplore.ieee.org/document/10707331/
Citation count: 0

Abstract

Achieving generalization for deep learning models usually suffers from the bottleneck of annotated-sample scarcity. As a common way of tackling this issue, few-shot learning focuses on "episodes", i.e., sampled tasks that help the model acquire knowledge that generalizes to unseen categories: the better the episodes, the higher a model's generalizability. Despite extensive research, the characteristics of episodes and their potential effects remain relatively unexplored. A recent paper observed that different episodes exhibit different prediction difficulties and coined a new metric, "hardness", to quantify episodes; its range, however, varies too widely across arbitrary datasets, so it remains impractical for realistic applications. In this paper we therefore conduct, for the first time, an algebraic analysis of the critical factors influencing episode hardness, supported by experimental demonstrations, which reveals that episode hardness largely depends on the classes within an episode. Importantly, we propose an efficient pre-sampling hardness assessment technique named Inverse-Fisher Discriminant Ratio (IFDR). This enables sampling hard episodes at the class level via a class-level (CL) sampling scheme that drastically decreases quantification cost. Delving deeper, we also develop a variant called class-pair-level (CPL) sampling, which further reduces the sampling cost while guaranteeing the sampled distribution. Finally, comprehensive experiments conducted on benchmark datasets verify the efficacy of our proposed method.
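The abstract does not give the exact definition of the paper's Inverse-Fisher Discriminant Ratio, but the general idea of an inverse Fisher criterion can be illustrated: for a pair of classes, the classical Fisher discriminant ratio compares between-class separation to within-class scatter, and inverting it yields a score that grows as classes overlap, i.e., as an episode gets harder. The sketch below is an assumption-laden illustration (the pairwise averaging, the pooled-variance form, and the function name are the author of this note's choices, not the paper's):

```python
import numpy as np


def inverse_fisher_discriminant_ratio(class_features):
    """Illustrative episode-hardness score via an inverse Fisher ratio.

    class_features: list of (n_i, d) arrays, one per class in the episode.
    For each class pair (a, b) we compute
        (s_a^2 + s_b^2) / ||mu_a - mu_b||^2,
    i.e., total within-class variance over squared mean separation,
    then average over all pairs. Larger values indicate more class
    overlap, suggesting a harder episode. This is a sketch of the
    general idea, not the paper's exact IFDR definition.
    """
    means = [f.mean(axis=0) for f in class_features]
    scatters = [f.var(axis=0).sum() for f in class_features]  # total within-class variance
    scores = []
    for a in range(len(class_features)):
        for b in range(a + 1, len(class_features)):
            separation = np.sum((means[a] - means[b]) ** 2)
            scores.append((scatters[a] + scatters[b]) / separation)
    return float(np.mean(scores))
```

Because the score depends only on per-class feature statistics, it can be evaluated before sampling episodes, which is the kind of pre-sampling assessment the abstract describes.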