{"title":"识别在线空间中有毒语言的机器学习方法","authors":"Lisa Kaati, A. Shrestha, N. Akrami","doi":"10.1109/ASONAM55673.2022.10068619","DOIUrl":null,"url":null,"abstract":"In this study, we trained three machine learning models to detect toxic language on social media. These models were trained using data from diverse sources to ensure that the models have a broad understanding of toxic language. Next, we evaluate the performance of our models on a dataset with samples of data from a large number of diverse online forums. The test dataset was annotated by three independent annotators. We also compared the performance of our models with Perspective API - a toxic language detection model created by Jigsaw and Google's Counter Abuse Technology team. The results showed that our classification models performed well on data from the domains they were trained on (Fl = 0.91, 0.91, & 0.84, for the RoBERTa, BERT, & SVM respectively), but the performance decreased when they were tested on annotated data from new domains (Fl = 0.80, 0.61, 0.49, & 0.77, for the RoBERTa, BERT, SVM, & Google perspective, respectively). Finally, we used the best-performing model on the test data (RoBERTa, ROC = 0.86) to examine the frequency (/proportion) of toxic language in 21 diverse forums. The results of these analyses showed that forums for general discussions with moderation (e.g., Alternate history) had much lower proportions of toxic language compared to those with minimal moderation (e.g., 8Kun). Although highlighting the complexity of detecting toxic language, our results show that model performance can be improved by using a diverse dataset when building new models. We conclude by discussing the implication of our findings and some directions for future research.","PeriodicalId":423113,"journal":{"name":"2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A Machine Learning Approach to Identify Toxic Language in the Online Space\",\"authors\":\"Lisa Kaati, A. Shrestha, N. Akrami\",\"doi\":\"10.1109/ASONAM55673.2022.10068619\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this study, we trained three machine learning models to detect toxic language on social media. These models were trained using data from diverse sources to ensure that the models have a broad understanding of toxic language. Next, we evaluate the performance of our models on a dataset with samples of data from a large number of diverse online forums. The test dataset was annotated by three independent annotators. We also compared the performance of our models with Perspective API - a toxic language detection model created by Jigsaw and Google's Counter Abuse Technology team. The results showed that our classification models performed well on data from the domains they were trained on (Fl = 0.91, 0.91, & 0.84, for the RoBERTa, BERT, & SVM respectively), but the performance decreased when they were tested on annotated data from new domains (Fl = 0.80, 0.61, 0.49, & 0.77, for the RoBERTa, BERT, SVM, & Google perspective, respectively). Finally, we used the best-performing model on the test data (RoBERTa, ROC = 0.86) to examine the frequency (/proportion) of toxic language in 21 diverse forums. 
The results of these analyses showed that forums for general discussions with moderation (e.g., Alternate history) had much lower proportions of toxic language compared to those with minimal moderation (e.g., 8Kun). Although highlighting the complexity of detecting toxic language, our results show that model performance can be improved by using a diverse dataset when building new models. We conclude by discussing the implication of our findings and some directions for future research.\",\"PeriodicalId\":423113,\"journal\":{\"name\":\"2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)\",\"volume\":\"5 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-11-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASONAM55673.2022.10068619\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASONAM55673.2022.10068619","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Machine Learning Approach to Identify Toxic Language in the Online Space
In this study, we trained three machine learning models to detect toxic language on social media. The models were trained on data from diverse sources to give them a broad understanding of toxic language. We then evaluated their performance on a test dataset sampled from a large number of diverse online forums and annotated by three independent annotators. We also compared our models against the Perspective API, a toxic language detection service created by Jigsaw and Google's Counter Abuse Technology team. Our classification models performed well on data from the domains they were trained on (F1 = 0.91, 0.91, and 0.84 for RoBERTa, BERT, and SVM, respectively), but performance dropped when they were tested on annotated data from new domains (F1 = 0.80, 0.61, 0.49, and 0.77 for RoBERTa, BERT, SVM, and the Perspective API, respectively). Finally, we used the model that performed best on the test data (RoBERTa, ROC AUC = 0.86) to examine the proportion of toxic language in 21 diverse forums. These analyses showed that forums for general discussion with moderation (e.g., Alternate History) had much lower proportions of toxic language than those with minimal moderation (e.g., 8Kun). While our results highlight the complexity of detecting toxic language, they also show that model performance can be improved by using a diverse dataset when building new models. We conclude by discussing the implications of our findings and some directions for future research.
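To make the modeling approach concrete, below is a minimal sketch of fine-tuning a RoBERTa binary classifier for toxic language detection with the Hugging Face transformers library. The toy training texts, the label scheme (1 = toxic), and the hyperparameters are illustrative assumptions, not the authors' actual data or configuration.

import torch
from torch.utils.data import Dataset
from transformers import (
    RobertaTokenizerFast,
    RobertaForSequenceClassification,
    Trainer,
    TrainingArguments,
)

class ToxicDataset(Dataset):
    """Wraps raw texts and 0/1 labels (1 = toxic) as model inputs."""
    def __init__(self, texts, labels, tokenizer, max_len=256):
        self.enc = tokenizer(texts, truncation=True,
                             padding="max_length", max_length=max_len)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2)

# Toy examples standing in for the paper's multi-source training corpus.
train_ds = ToxicDataset(["you are wonderful", "go away, idiot"], [0, 1],
                        tokenizer)

args = TrainingArguments(output_dir="toxicity-roberta",
                         num_train_epochs=3,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=train_ds).train()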
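The Perspective API baseline can be queried over HTTP. The following sketch follows the publicly documented comments:analyze endpoint; the API key is a placeholder, and requesting the TOXICITY attribute is an assumption about which of the API's attributes the comparison used.

import requests

API_KEY = "YOUR_API_KEY"  # placeholder; obtained via Google Cloud, not from the paper
URL = ("https://commentanalyzer.googleapis.com/v1alpha1"
       f"/comments:analyze?key={API_KEY}")

def toxicity_score(text: str) -> float:
    """Return Perspective's TOXICITY probability for a piece of text."""
    payload = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    resp = requests.post(URL, json=payload, timeout=10)
    resp.raise_for_status()
    # summaryScore.value is a probability-like score in [0, 1].
    return resp.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

print(toxicity_score("you are a terrible person"))  # e.g. a score near 0.9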
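The final analysis step, estimating the proportion of toxic posts per forum, could look like the sketch below. The forum names echo the abstract's examples, while the posts table and the predict_toxic stand-in (which in practice would be inference with the fine-tuned RoBERTa model) are hypothetical.

import pandas as pd

def predict_toxic(texts):
    """Stand-in for classifier inference; returns a 0/1 label per text.
    A dummy keyword rule is used here purely so the sketch runs."""
    return [1 if "idiot" in t.lower() else 0 for t in texts]

posts = pd.DataFrame({
    "forum": ["8Kun", "8Kun", "Alternate History"],
    "text": ["go away, idiot", "hello there", "interesting timeline"],
})
posts["toxic"] = predict_toxic(posts["text"].tolist())

# Proportion of toxic posts per forum, mirroring the paper's comparison of
# moderated vs. minimally moderated communities.
print(posts.groupby("forum")["toxic"].mean())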