Hate Speech Identification in Text Written in Indonesian with Recurrent Neural Network

2019 International Conference on Advanced Computer Science and Information Systems (ICACSIS). Pub Date: 2019-10-01. DOI: 10.1109/ICACSIS47736.2019.8979959

Erryan Sazany, I. Budi


Citations: 10

Abstract

Several studies have succeeded in identifying hate speech in text automatically using machine learning and deep learning approaches. However, it remains unclear how well a deep learning-based model adapts when it is tested on text data from a different domain. To address this issue, this research proposes deep learning-based methods that use variants of the Recurrent Neural Network to identify hate speech in texts sourced from Twitter; the trained models are then used to predict labels for another set of text data sourced from Facebook and Twitter. The experiment measures the difference in model performance between the training phase and the testing phase. The results show that the proposed methods outperform the machine learning-based methods in both phases: in the training phase, the GRU model achieves an F1-score of 85.37%, and in the testing phase, the LSTM model achieves an F1-score of 76.30%. In terms of adaptability of model performance, the proposed methods give results comparable to the baseline method.
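The evaluation idea in the abstract (score a model with F1 in the training phase, score it again on data from another source, and compare the two) can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not say whether F1 is computed per class or averaged, nor how adaptability is quantified, so the binary F1 on the positive class and the simple difference `performance_gap` are assumptions, and all labels below are made-up toy values.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class (assumed to mean 'hate speech')."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def performance_gap(train_f1, test_f1):
    """Hypothetical adaptability measure: drop in F1 from training to testing."""
    return train_f1 - test_f1


# Toy labels only (1 = hate speech, 0 = not); these are not data from the paper.
train_true = [1, 1, 0, 0, 1, 0]
train_pred = [1, 1, 0, 0, 1, 1]
test_true = [1, 0, 1, 0, 1, 0]
test_pred = [1, 0, 0, 1, 1, 0]

train_f1 = f1_score(train_true, train_pred)
test_f1 = f1_score(test_true, test_pred)
gap = performance_gap(train_f1, test_f1)

print(f"train F1 = {train_f1:.4f}, test F1 = {test_f1:.4f}, gap = {gap:.4f}")
```

Under this assumed measure, a smaller gap would indicate a model whose performance transfers better to the second data source; comparing gaps across models is one plausible way to read "comparable adaptability".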
