Generalizing Hate Speech Detection Using Multi-Task Learning: A Case Study of Political Public Figures

IF 3.4 | Zone 3, Computer Science | JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computer Speech and Language Pub Date : 2024-07-17 DOI:10.1016/j.csl.2024.101690

Lanqin Yuan, Marian-Andrei Rizoiu

Computer Speech and Language, Volume 89, Article 101690. Publication type: Journal Article.

URL: https://www.sciencedirect.com/science/article/pii/S0885230824000731

Citations: 0

Abstract

Automatic identification of hateful and abusive content is vital in combating the spread of harmful online content and its damaging effects. Most existing work evaluates models by examining the generalization error on train–test splits of hate speech datasets. These datasets often differ in their definitions and labeling criteria, leading to poor generalization performance when predicting across new domains and datasets. This work proposes a new Multi-task Learning (MTL) pipeline that trains simultaneously across multiple hate speech datasets to construct a more encompassing classification model. Using a dataset-level leave-one-out evaluation (designating one dataset for testing and jointly training on all others), we test the MTL detection on new, previously unseen datasets. Our results consistently outperform a large sample of existing work. We show strong results when examining the generalization error in train–test splits and substantial improvements when predicting on previously unseen datasets. Furthermore, we assemble a novel dataset, dubbed PubFigs, focusing on the problematic speech of American Public Political Figures. We crowdsource labels for more than 20,000 tweets using Amazon MTurk and machine-label problematic speech in all 305,235 tweets in PubFigs. We find that abusive and hateful tweeting mainly originates from right-leaning figures and relates to six topics, including Islam, women, ethnicity, and immigrants. We show that MTL builds embeddings that can simultaneously separate abusive from hate speech and identify its topics.
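The dataset-level leave-one-out evaluation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `leave_one_dataset_out`, the toy corpora `A`, `B`, `C`, and the choice to tag each training example with its source dataset (so that a multi-task model could route it to a dataset-specific head) are all assumptions made for illustration.

```python
from typing import Dict, Iterator, List, Tuple

Example = Tuple[str, int]               # (text, label)
TaggedExample = Tuple[str, str, int]    # (source dataset, text, label)


def leave_one_dataset_out(
    datasets: Dict[str, List[Example]],
) -> Iterator[Tuple[str, List[TaggedExample], List[Example]]]:
    """Yield (held_out_name, joint_training_pool, test_set) once per dataset."""
    for held_out in datasets:
        # Pool every example from all datasets except the held-out one,
        # keeping the source name attached to each example.
        train = [
            (name, text, label)
            for name, examples in datasets.items()
            if name != held_out
            for text, label in examples
        ]
        yield held_out, train, datasets[held_out]


# Hypothetical toy corpora; names, texts, and labels are placeholders.
toy = {
    "A": [("text a1", 0), ("text a2", 1)],
    "B": [("text b1", 1), ("text b2", 0)],
    "C": [("text c1", 0)],
}

splits = list(leave_one_dataset_out(toy))
for held_out, train, test in splits:
    print(f"test on {held_out}: {len(train)} training examples, {len(test)} test examples")
```

Each iteration leaves one whole dataset out of training, so the score on it reflects performance on data whose definitions and labeling criteria the model never saw, as opposed to a train–test split within a single dataset.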


Source journal

Computer Speech and Language (Engineering and Technology: Computer Science, Artificial Intelligence)

CiteScore

11.30

Self-citation rate

4.70%

Articles published

Review time

22.9 weeks

Journal description: Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language. The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of, and experimentation with, complex models of speech and language processing have become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.