T5 for Hate Speech, Augmented Data, and Ensemble

Decis. Sci. | Pub Date: 2023-09-22 | DOI: 10.3390/sci5040037
Tosin Adewumi, Sana Sabah Sabry, Nosheen Abid, Foteini Liwicki, Marcus Liwicki
Citations: 0

Abstract

We conduct relatively extensive investigations of automatic hate speech (HS) detection using different State-of-The-Art (SoTA) baselines across 11 subtasks spanning six different datasets. Our motivation is to determine which of the recent SoTA models is best for automatic hate speech detection and what advantage, if any, methods such as data augmentation and ensembling may have on the best model. We carry out six cross-task investigations. We achieve new SoTA results on two subtasks: macro F1 scores of 91.73% and 53.21% for subtasks A and B of the HASOC 2020 dataset, surpassing previous SoTA scores of 51.52% and 26.52%, respectively. We achieve near-SoTA results on two others: macro F1 scores of 81.66% for subtask A of OLID 2019 and 82.54% for subtask A of HASOC 2021, in comparison to SoTA results of 82.9% and 83.05%, respectively. We perform error analysis and use two eXplainable Artificial Intelligence (XAI) algorithms, Integrated Gradients (IG) and SHapley Additive exPlanations (SHAP), to reveal, through examples, how two of the models (the Bi-Directional Long Short-Term Memory Network (Bi-LSTM) and the Text-to-Text Transfer Transformer (T5)) make the predictions they do. Other contributions of this work are: (1) the introduction of a simple, novel mechanism for correcting Out-of-Class (OoC) predictions in T5, (2) a detailed description of the data augmentation methods, and (3) the revelation of the poor data annotations in the HASOC 2021 dataset by using several examples and XAI (buttressing the need for better quality control). We publicly release our model checkpoints and code to foster transparency.
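All of the headline scores above are macro-averaged F1, i.e. the unweighted mean of the per-class F1 scores, which keeps a minority class (such as the hateful class in skewed HS datasets) from being drowned out by the majority class. As a minimal illustration (equivalent to scikit-learn's `f1_score(..., average="macro")`; the `HOF`/`NOT` label names follow the HASOC convention):

```python
def macro_f1(y_true, y_pred):
    """Macro F1: the unweighted mean of per-class F1 scores.

    Each class contributes equally regardless of its frequency,
    which matters on class-imbalanced hate speech datasets.
    """
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)
```

For example, on `y_true = ["HOF", "NOT", "HOF", "NOT"]` and `y_pred = ["HOF", "HOF", "HOF", "NOT"]`, the HOF class scores F1 = 0.8 and the NOT class F1 ≈ 0.667, giving a macro F1 of about 0.733 even though accuracy is 0.75.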
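The ensembling the abstract refers to can take several forms; the simplest is hard voting, where each model casts one vote per example and the most common label wins. The paper's exact ensembling scheme is described in the full text; the sketch below shows only generic majority voting, not necessarily the authors' configuration:

```python
from collections import Counter

def majority_vote(model_predictions):
    """Hard-voting ensemble.

    model_predictions: a list of per-model prediction lists, all of
    the same length. For each example, the label predicted by the
    most models wins (ties break toward the first-seen label).
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*model_predictions)]
```

With an odd number of models (three here, e.g. a Bi-LSTM, T5, and one more baseline) there are no two-way ties on binary subtasks, which is one reason odd-sized ensembles are a common choice.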
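The OoC correction mechanism exists because T5 treats classification as text generation: the decoded string can fall outside the fixed label set (e.g. a truncated or misspelled label). The paper describes its own simple correction mechanism; the snippet below is only an illustrative sketch under our assumption of snapping any out-of-class generation to the nearest valid label by character edit distance, which is not necessarily the authors' rule:

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (iterative, O(len(a)*len(b)))."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def correct_ooc(generated, labels=("HOF", "NOT")):
    """Map a T5 text generation onto the closest in-class label.

    If the (normalized) generation is already a valid label, keep it;
    otherwise return the label with the smallest edit distance.
    """
    text = generated.strip().upper()
    if text in labels:
        return text
    return min(labels, key=lambda lab: edit_distance(text, lab))
```

So a generation like `"NOTT"` or `"none"` would be corrected to `NOT` rather than being dropped or counted as an automatic error, which is the practical point of any OoC correction step.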