Bayes at FigLang 2022 Euphemism Detection shared task: Cost-Sensitive Bayesian Fine-tuning and Venn-Abers Predictors for Robust Training under Class Skewed Distributions
{"title":"Bayes at FigLang 2022 Euphemism Detection shared task: Cost-Sensitive Bayesian Fine-tuning and Venn-Abers Predictors for Robust Training under Class Skewed Distributions","authors":"Paul Trust, Kadusabe Provia, Kizito Omala","doi":"10.18653/v1/2022.flp-1.13","DOIUrl":null,"url":null,"abstract":"Transformers have achieved a state of the art performance across most natural language processing tasks. However the performance of these models degrade when being trained on skewed class distributions (class imbalance) because training tends to be biased towards head classes with most of the data points . Classical methods that have been proposed to handle this problem (re-sampling and re-weighting) often suffer from unstable performance, poor applicability and poor calibration. In this paper, we propose to use Bayesian methods and Venn-Abers predictors for well calibrated and robust training against class imbalance. Our proposed approach improves f1-score of the baseline RoBERTa (A Robustly Optimized Bidirectional Embedding from Transformers Pretraining Approach) model by about 6 points (79.0% against 72.6%) when training with class imbalanced data.","PeriodicalId":332745,"journal":{"name":"Proceedings of the 3rd Workshop on Figurative Language Processing (FLP)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 3rd Workshop on Figurative Language Processing (FLP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/2022.flp-1.13","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
Transformers have achieved state-of-the-art performance across most natural language processing tasks. However, the performance of these models degrades when they are trained on skewed class distributions (class imbalance), because training tends to be biased towards the head classes that contain most of the data points. Classical methods proposed to handle this problem (re-sampling and re-weighting) often suffer from unstable performance, poor applicability, and poor calibration. In this paper, we propose to use Bayesian methods and Venn-Abers predictors for well-calibrated and robust training under class imbalance. Our proposed approach improves the F1 score of the baseline RoBERTa (Robustly Optimized BERT Pretraining Approach) model by about 6 points (79.0% against 72.6%) when training with class-imbalanced data.
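To make the cost-sensitive side of the abstract concrete, the snippet below is a minimal sketch of one common recipe: inverse-frequency class weighting plugged into a weighted cross-entropy loss in PyTorch. The class counts are hypothetical and the paper's exact weighting scheme is not given in the abstract, so treat this as an illustration of the general technique rather than the authors' implementation.

    import torch
    import torch.nn as nn

    # Hypothetical class counts for a skewed binary task (illustrative only)
    class_counts = torch.tensor([900.0, 100.0])

    # Inverse-frequency weights: the minority class gets a larger loss contribution
    weights = class_counts.sum() / (len(class_counts) * class_counts)

    # Weighted cross-entropy as the fine-tuning objective
    criterion = nn.CrossEntropyLoss(weight=weights)

    logits = torch.randn(4, 2)           # stand-in for classifier outputs
    labels = torch.tensor([0, 1, 1, 0])  # gold labels
    loss = criterion(logits, labels)

Under these assumed counts, the minority class is weighted roughly 9x the majority class, which counteracts the bias towards head classes that the abstract describes.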
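The Venn-Abers calibration step can likewise be sketched via the standard inductive Venn-Abers procedure: for each test score, fit two isotonic regressions on the calibration set augmented with the test point labelled 0 and then labelled 1, yielding a probability interval (p0, p1). The sketch below uses scikit-learn's IsotonicRegression and illustrates the general technique under stated assumptions, not the authors' code.

    import numpy as np
    from sklearn.isotonic import IsotonicRegression

    def venn_abers_interval(calib_scores, calib_labels, test_score):
        """Inductive Venn-Abers predictor: probability interval (p0, p1) for one test score."""
        probs = []
        for assumed_label in (0, 1):
            # Augment the calibration set with the test point under each assumed label
            scores = np.append(calib_scores, test_score)
            labels = np.append(calib_labels, assumed_label)
            iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
            iso.fit(scores, labels)
            probs.append(float(iso.predict([test_score])[0]))
        return probs[0], probs[1]

    # Toy calibration data: raw model scores and gold binary labels (illustrative only)
    cal_s = np.array([0.1, 0.3, 0.4, 0.7, 0.8, 0.9])
    cal_y = np.array([0, 0, 1, 0, 1, 1])
    p0, p1 = venn_abers_interval(cal_s, cal_y, 0.6)

    # A common single-point merge of the interval into one calibrated probability
    p = p1 / (1.0 - p0 + p1)

The width of the interval (p1 - p0) signals how reliable the calibrated probability is at that score, which is what makes Venn-Abers predictors attractive for the well-calibrated training the abstract targets.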