A Deep Decision Forests Model for Hate Speech Detection

IF 1.2 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

Jordanian Journal of Computers and Information Technology, Pub Date: 2023-01-01, DOI: 10.5455/jjcit.71-1667394363

M. Ndenga

Citations: 0

Abstract

Detecting and controlling the propagation of hate speech on social media platforms is a challenge. The problem is exacerbated by the extremely fast flow, readily available audience, and relative permanence of information on social media. The objective of this research is to propose a model for detecting political hate speech propagated through social media platforms in Kenya. Using Twitter textual data and Keras TensorFlow Decision Forests (TF-DF), three models were developed: Gradient Boosted Trees with Universal Sentence Embeddings (USE), Gradient Boosted Trees, and Random Forest. The Gradient Boosted Trees with USE model performed best, with an accuracy of 98.86%, recall of 0.9587, precision of 0.9831, and AUC of 0.9984. This model can therefore be used to detect hate speech on social media platforms.
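The four figures reported above (accuracy, recall, precision, AUC) follow standard definitions for binary classifiers. The sketch below shows how they are commonly computed, assuming labels of 1 for hate speech and 0 otherwise and a decision threshold of 0.5. The function names and the toy data are hypothetical, invented for illustration only; they are not taken from the paper, and the paper's own evaluation code may differ.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration (not results from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, and recall depend on the chosen threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason the four are often reported together.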




Source journal

Jordanian Journal of Computers and Information Technology (Computer Science: Computer Science (all))

CiteScore

3.10

Self-citation rate

25.00%

Articles published