Detection of toxicity in social media based on Natural Language Processing methods
Mohammed Taleb, Alami Hamza, Mohamed Zouitni, Nabil Burmani, Said Lafkiar, Noureddine En-Nahnahi
2022 International Conference on Intelligent Systems and Computer Vision (ISCV), 18 May 2022. DOI: 10.1109/ISCV54655.2022.9806096
Citations: 2
Abstract
Comments on popular websites, such as news portals or social media platforms, are among the main channels of online interaction. Unfortunately, user behavior on these sites often turns rude or disrespectful through the spread of toxic comments, which can disrupt their proper functioning. The aim of this research is to detect such toxic comments and to locate the parts of these comments, called toxic spans, to which the toxicity can be attributed. To this end, we explored and compared various classifiers from three categories (machine learning, ensemble learning, and deep learning) using different text representations. For detecting toxic spans within comments, we applied an unsupervised method based on Local Interpretable Model-Agnostic Explanations (LIME). We evaluated our methods using accuracy, recall, and F1-score. Our experiments showed that deep learning models clearly outperformed the others in detecting toxic comments: the LSTM models with GloVe and FastText representations achieved higher F1-scores and accuracy than the other models used. For toxic span detection, the best results were obtained by combining LIME with the LSTM (GloVe) classifier, which identified toxic spans with an accuracy of 98%.
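As a rough illustration of the pipeline the abstract describes, the sketch below builds a binary LSTM classifier over frozen pre-trained word embeddings in Keras. The paper does not report its exact architecture or hyperparameters, so the vocabulary size, embedding dimension, hidden size, and the randomly filled embedding matrix (standing in for real GloVe or FastText vectors) are all assumptions.

```python
import numpy as np
from tensorflow.keras.initializers import Constant
from tensorflow.keras.layers import Dense, Embedding, LSTM
from tensorflow.keras.models import Sequential

# Hypothetical sizes; the paper does not report its configuration.
VOCAB_SIZE, EMBED_DIM, HIDDEN = 20000, 300, 128

# Stand-in for a matrix filled from pre-trained GloVe/FastText vectors.
embedding_matrix = np.random.normal(size=(VOCAB_SIZE, EMBED_DIM))

model = Sequential([
    Embedding(VOCAB_SIZE, EMBED_DIM,
              embeddings_initializer=Constant(embedding_matrix),
              trainable=False),          # frozen pre-trained embeddings
    LSTM(HIDDEN),                        # assumed hidden size
    Dense(1, activation="sigmoid"),      # toxic / non-toxic probability
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

For the span-detection step, LIME is applied on top of a trained classifier. A minimal sketch with the `lime` package follows; the `predict_proba` function here is a hypothetical stand-in for the trained LSTM (GloVe) model, and treating positively weighted words as the toxic span is an assumption about how the explanation is converted into spans.

```python
import numpy as np
from lime.lime_text import LimeTextExplainer

def predict_proba(texts):
    """Hypothetical stand-in for the trained LSTM (GloVe) classifier.
    A real implementation would tokenize, pad, and call model.predict;
    LIME only requires probabilities of shape (n_samples, n_classes)."""
    p = np.random.rand(len(texts), 1)    # dummy toxicity scores
    return np.hstack([1.0 - p, p])

explainer = LimeTextExplainer(class_names=["non-toxic", "toxic"])
comment = "example comment whose toxic words we want to locate"
exp = explainer.explain_instance(comment, predict_proba, num_features=5)

# Words whose weights push the prediction toward the "toxic" class
# are kept as the (unsupervised) toxic span.
toxic_span = [word for word, weight in exp.as_list() if weight > 0]
print(toxic_span)
```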