How do Machine Learning Algorithms Effectively Classify Toxic Comments? An Empirical Analysis

Q3 Computer Science

International Journal of Intelligent Systems and Applications in Engineering. Pub Date: 2023-08-08. DOI: 10.5815/ijisa.2023.04.01

Md. Abdur Rahman, A. Nayem, Mahfida Amjad, Md. Saeed Siddik

Citation count: 0

Abstract

Toxic comments on social media platforms, news portals, and online forums are impolite, insulting, or unreasonable remarks that often drive other users out of a conversation. Because of the sheer volume of comments, moderating them manually is impractical, so online service providers rely on Machine Learning (ML) algorithms to detect toxicity automatically. However, a model's toxicity identification performance depends on the combination of classifier and feature extraction technique. In this empirical study, we set up a comparison environment for toxic comment classification using 15 frequently used supervised ML classifiers with the four most prominent feature extraction schemes. We used the publicly available Jigsaw dataset of toxic comments written by human users. We tested, analyzed, and compared every pair of investigated classifiers and report our conclusions. We used accuracy and the area under the ROC curve as the evaluation metrics. We found that Logistic Regression and AdaBoost are the best toxic comment classifiers. Their average accuracies are 0.895 and 0.893, respectively, and both achieved the same area under the ROC curve (0.828). The primary takeaway of this study is therefore that Logistic Regression and AdaBoost, using BoW, TF-IDF, or Hashing features, perform sufficiently well for toxic comment classification.
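The three feature extraction schemes named in the abstract (BoW, TF-IDF, Hashing) and its two evaluation metrics (accuracy and area under the ROC curve) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the toy comments, the whitespace tokenizer, the smoothed IDF variant, the bucket count, and the keyword-count scoring rule that stands in for a trained classifier are all assumptions made purely for illustration.

```python
import math
import zlib
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def bow_vector(tokens, vocab):
    """Bag-of-words: raw term counts over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[term] for term in vocab]


def tfidf_vector(tokens, vocab, doc_freq, n_docs):
    """TF-IDF: term count scaled by a smoothed inverse document frequency."""
    counts = Counter(tokens)
    return [
        counts[term] * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
        for term in vocab
    ]


def hashing_vector(tokens, n_buckets=8):
    """Hashing trick: each token increments a hashed bucket; no vocabulary is stored."""
    vec = [0] * n_buckets
    for tok in tokens:
        vec[zlib.crc32(tok.encode("utf-8")) % n_buckets] += 1
    return vec


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy comments and labels (1 = toxic), invented for illustration only.
comments = [
    "you are an idiot",
    "thanks for the helpful answer",
    "what an idiot",
    "have a nice day",
]
labels = [1, 0, 1, 0]

docs = [tokenize(c) for c in comments]
vocab = sorted({tok for doc in docs for tok in doc})
doc_freq = Counter(tok for doc in docs for tok in set(doc))

bow = [bow_vector(d, vocab) for d in docs]
tfidf = [tfidf_vector(d, vocab, doc_freq, len(docs)) for d in docs]
hashed = [hashing_vector(d) for d in docs]

# Stand-in for a trained classifier: score each comment by its count of "idiot".
idx = vocab.index("idiot")
scores = [row[idx] for row in bow]
preds = [1 if s > 0 else 0 for s in scores]

print("accuracy:", accuracy(labels, preds))
print("ROC AUC:", roc_auc(labels, scores))
```

In a real comparison, the keyword rule would be replaced by each trained classifier in turn, and the same two metrics would be computed for every classifier and feature scheme pairing on held-out data.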

Source journal

International Journal of Intelligent Systems and Applications in Engineering (Computer Science: Computer Graphics and Computer-Aided Design)

CiteScore

1.30

Self-citation rate

0.00%

Publication count