Authors: Doorgesh Sookarah, Loovesh S. Ramwodin
Venue: 2022 4th International Conference on Emerging Trends in Electrical, Electronic and Communications Engineering (ELECOM)
Published: 2022-11-22
DOI: 10.1109/ELECOM54934.2022.9965237 (https://doi.org/10.1109/ELECOM54934.2022.9965237)
Combatting online harassment by using transformer language models for the detection of emotions, hate speech and offensive language on social media
Social media is now omnipresent, and most people use at least one of these digital platforms. These platforms generate an enormous amount of data, which presents an unparalleled opportunity for data scientists and linguists. This has renewed interest in Natural Language Processing techniques, and the number of publications on Tweet classification using machine learning models continues to grow. In this paper, experiments performed by the TweetEval team at Cardiff University are studied and expanded upon. The tasks considered are emotion detection, offensive language identification and hate speech detection. These specific classification tasks were chosen because they relate directly to unwanted behaviours such as online harassment. The research involved building and testing a transformer-based language model capable of matching the performance reported by TweetEval. The aim of the study is to identify common limitations of such models and how these can be circumvented, so that machine learning can be used effectively to combat phenomena such as cyberbullying and online abuse. The developed BERT model performed comparably to other similar models on all tasks, obtaining F1-scores of 0.51, 0.76 and 0.80 for hate speech detection, emotion detection and offensive language identification respectively.
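The abstract reports F1-scores without saying how they were averaged across classes. As a minimal sketch, assuming macro-averaging (the metric the TweetEval benchmark uses for these three tasks), the score could be computed as below. The function name `macro_f1` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("macro_f1 needs at least one label")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    per_class = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical toy data: 0 = not offensive, 1 = offensive
gold = [0, 0, 0, 1, 1, 0]
pred = [0, 0, 1, 1, 0, 0]
score = macro_f1(gold, pred)
print(f"macro F1 = {score:.3f}")
```

Because macro-averaging weights every class equally, a model that does poorly on a minority class is penalised even if its overall accuracy is high, which is relevant for imbalanced tasks such as hate speech detection.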