Improving Hate Speech Classification Through Ensemble Learning and Explainable AI Techniques

IF 2.9, Zone 4 multidisciplinary journal, Q1 Multidisciplinary

Arabian Journal for Science and Engineering Pub Date : 2024-09-16 DOI:10.1007/s13369-024-09540-2

Priya Garg, M. K. Sharma, Parteek Kumar


Citations: 0

Abstract

Identifying offensive and discriminatory content, commonly referred to as hate speech, in textual data is a critical task. This study focuses on the challenge of selecting optimal word embedding methods and classifiers for that task. Using the Google Jigsaw dataset, the research applies explainable artificial intelligence (XAI) to hate speech detection. Preprocessing converts text to lowercase and removes punctuation, extra whitespace, numbers, and non-ASCII characters; a subsequent analysis identifies high-frequency words. The research compares three word embedding techniques: CountVectorizer, GloVe, and bidirectional encoder representations from transformers (BERT). These are combined with two machine learning models (support vector classifier and logistic regression) and four deep learning models [artificial neural network (ANN), recurrent neural network (RNN), bidirectional gated recurrent unit (Bi-GRU), and bidirectional long short-term memory (Bi-LSTM)]. Combining BERT with a Bi-GRU achieved an accuracy of 92%, and an ensemble of the top-performing models improved accuracy further by nearly 2%. To make the results more interpretable, the study applies XAI techniques such as local interpretable model-agnostic explanations (LIME) and Shapley additive explanations (SHAP) to the top-performing ensemble model to provide insight into its predictions. The paper concludes by suggesting future research directions: exploring additional embedding techniques and models, addressing dataset generalizability, improving interpretability methods, and accounting for computational resource constraints.
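The preprocessing steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `preprocess` and `top_words`, the order of the cleaning steps, and the whitespace tokenization are all assumptions made for illustration, and only the standard library is used.

```python
import re
import string
from collections import Counter

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text: str) -> str:
    """Clean one comment using the steps listed in the abstract.

    The order of the steps is an assumption; the abstract only names them.
    """
    text = text.lower()                                            # lowercase
    text = text.encode("ascii", errors="ignore").decode("ascii")   # drop non-ASCII
    text = text.translate(_PUNCT_TABLE)                            # drop punctuation
    text = re.sub(r"\d+", "", text)                                # drop numbers
    text = re.sub(r"\s+", " ", text).strip()                       # collapse whitespace
    return text


def top_words(texts, n=10):
    """Return the n most frequent whitespace-separated tokens after cleaning."""
    counts = Counter()
    for t in texts:
        counts.update(preprocess(t).split())
    return counts.most_common(n)


if __name__ == "__main__":
    print(preprocess("  Hello, WORLD!! 123 café  "))
    print(top_words(["You are great", "you ARE 2 kind!"], n=2))
```

Dropping non-ASCII characters this way also strips accented letters rather than transliterating them (for example, "café" becomes "caf"), which may or may not match what the paper did.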




Source journal

Arabian Journal for Science and Engineering, Multidisciplinary

CiteScore

5.20

Self-citation rate

3.40%

Articles published

Review time

4.3 months

About the journal: King Fahd University of Petroleum & Minerals (KFUPM) partnered with Springer to publish the Arabian Journal for Science and Engineering (AJSE). AJSE, which has been published by KFUPM since 1975, is a recognized national, regional and international journal that provides a great opportunity for the dissemination of research advances from the Kingdom of Saudi Arabia, MENA and the world.