越少越好:恢复意图特征子空间以鲁棒化NLU模型

Ting Wu, Tao Gui
{"title":"越少越好:恢复意图特征子空间以鲁棒化NLU模型","authors":"Ting Wu, Tao Gui","doi":"10.48550/arXiv.2209.07879","DOIUrl":null,"url":null,"abstract":"Datasets with significant proportions of bias present threats for training a trustworthy model on NLU tasks. Despite yielding great progress, current debiasing methods impose excessive reliance on the knowledge of bias attributes. Definition of the attributes, however, is elusive and varies across different datasets. In addition, leveraging these attributes at input level to bias mitigation may leave a gap between intrinsic properties and the underlying decision rule. To narrow down this gap and liberate the supervision on bias, we suggest extending bias mitigation into feature space. Therefore, a novel model, Recovering Intended-Feature Subspace with Knowledge-Free (RISK) is developed. Assuming that shortcut features caused by various biases are unintended for prediction, RISK views them as redundant features. When delving into a lower manifold to remove redundancies, RISK reveals that an extremely low-dimensional subspace with intended features can robustly represent the highly biased dataset. Empirical results demonstrate our model can consistently improve model generalization to out-of-distribution set, and achieves a new state-of-the-art performance.","PeriodicalId":91381,"journal":{"name":"Proceedings of COLING. International Conference on Computational Linguistics","volume":"27 1","pages":"1666-1676"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Less Is Better: Recovering Intended-Feature Subspace to Robustify NLU Models\",\"authors\":\"Ting Wu, Tao Gui\",\"doi\":\"10.48550/arXiv.2209.07879\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Datasets with significant proportions of bias present threats for training a trustworthy model on NLU tasks. Despite yielding great progress, current debiasing methods impose excessive reliance on the knowledge of bias attributes. Definition of the attributes, however, is elusive and varies across different datasets. In addition, leveraging these attributes at input level to bias mitigation may leave a gap between intrinsic properties and the underlying decision rule. To narrow down this gap and liberate the supervision on bias, we suggest extending bias mitigation into feature space. Therefore, a novel model, Recovering Intended-Feature Subspace with Knowledge-Free (RISK) is developed. Assuming that shortcut features caused by various biases are unintended for prediction, RISK views them as redundant features. When delving into a lower manifold to remove redundancies, RISK reveals that an extremely low-dimensional subspace with intended features can robustly represent the highly biased dataset. Empirical results demonstrate our model can consistently improve model generalization to out-of-distribution set, and achieves a new state-of-the-art performance.\",\"PeriodicalId\":91381,\"journal\":{\"name\":\"Proceedings of COLING. International Conference on Computational Linguistics\",\"volume\":\"27 1\",\"pages\":\"1666-1676\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-09-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of COLING. International Conference on Computational Linguistics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2209.07879\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of COLING. International Conference on Computational Linguistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2209.07879","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

摘要

具有显著偏差比例的数据集对在NLU任务上训练可信模型存在威胁。尽管取得了很大的进步,但目前的去偏方法过于依赖于偏差属性的知识。然而,属性的定义是难以捉摸的,并且在不同的数据集上有所不同。此外,在输入级别利用这些属性来减少偏差可能会在内在属性和基本决策规则之间留下差距。为了缩小这一差距并解放对偏差的监督,我们建议将偏差缓解扩展到特征空间。在此基础上,提出了一种新的模型——基于无知识的预期特征子空间恢复模型(RISK)。假设由各种偏差引起的快捷功能对预测来说是无意的,RISK将它们视为冗余功能。当深入到一个较低的流形以消除冗余时,RISK揭示了一个具有预期特征的极低维子空间可以鲁棒地表示高度偏差的数据集。实证结果表明,该模型能够持续提高模型对分布外集的泛化能力,达到了新的水平。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Less Is Better: Recovering Intended-Feature Subspace to Robustify NLU Models
Datasets with significant proportions of bias present threats for training a trustworthy model on NLU tasks. Despite yielding great progress, current debiasing methods impose excessive reliance on the knowledge of bias attributes. Definition of the attributes, however, is elusive and varies across different datasets. In addition, leveraging these attributes at input level to bias mitigation may leave a gap between intrinsic properties and the underlying decision rule. To narrow down this gap and liberate the supervision on bias, we suggest extending bias mitigation into feature space. Therefore, a novel model, Recovering Intended-Feature Subspace with Knowledge-Free (RISK) is developed. Assuming that shortcut features caused by various biases are unintended for prediction, RISK views them as redundant features. When delving into a lower manifold to remove redundancies, RISK reveals that an extremely low-dimensional subspace with intended features can robustly represent the highly biased dataset. Empirical results demonstrate our model can consistently improve model generalization to out-of-distribution set, and achieves a new state-of-the-art performance.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信