Classification and keyword extraction of online harassment text in Thai social network
Authors: Siranuch Hemtanon, Ketsara Phetkrachang, Wachira Yangyuen
Journal: Bulletin of Electrical Engineering and Informatics
Publication type: Journal Article
Published: 2023-12-01
DOI: https://doi.org/10.11591/eei.v12i6.5939
Online harassment in social network services (SNS) is a form of cyberbullying that needs to be addressed and requires preventive measures. In this paper, we develop a method for detecting cyberbullying in the form of harassing text posts written in Thai on Facebook. We collect public posts and ask experts to label each post as positive or negative, that is, as harassment or not. The annotated data are used to train a binary classifier, treating words as the central features, to predict malicious intent to insult and threaten other users. The information gain scores obtained while generating the prediction model are ranked, and the 20 words with the highest scores are taken as significant words associated with online harassment. In our experiments, detection achieved an average F1 score of 0.78. Analysis of the results indicates that the word-surface approach detects insulting posts reasonably well, but posts that use metaphor to tone down malicious intent may go undetected because the harmful semantic intent is hidden behind the word form. The top 20 significant words show that bullying posts mainly involve body-shaming and attacks on lower social status.
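The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the classifier, the tokenizer, or the exact information gain formulation, so the code below assumes each word is a binary presence/absence feature and that posts have already been segmented into tokens (Thai text has no spaces between words, so a word segmenter would be needed upstream). The function names and the toy English data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(pos, neg):
    """Entropy in bits of a two-class distribution given raw class counts."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(docs, labels):
    """Return {word: information gain} for binary word-presence features.

    docs   -- list of token lists (assumed already word-segmented)
    labels -- list of 0/1 labels, 1 = harassment (assumed encoding)
    """
    n = len(docs)
    n_pos = sum(labels)
    base = entropy(n_pos, n - n_pos)  # entropy of the labels alone

    present_all = Counter()  # number of posts containing each word
    present_pos = Counter()  # number of positive posts containing each word
    for tokens, y in zip(docs, labels):
        for w in set(tokens):  # presence only, so repeats in one post count once
            present_all[w] += 1
            if y == 1:
                present_pos[w] += 1

    scores = {}
    for w, n_w in present_all.items():
        pos_w = present_pos[w]
        neg_w = n_w - pos_w
        pos_not = n_pos - pos_w
        neg_not = (n - n_w) - pos_not
        # Label entropy after splitting posts on presence/absence of w
        cond = (n_w / n) * entropy(pos_w, neg_w) \
            + ((n - n_w) / n) * entropy(pos_not, neg_not)
        scores[w] = base - cond
    return scores


def top_k_words(docs, labels, k=20):
    """Words sorted by descending information gain; ties broken alphabetically."""
    scores = information_gain(docs, labels)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Toy, made-up data purely for illustration (not from the paper)
docs = [
    ["you", "are", "ugly"],
    ["ugly", "and", "poor"],
    ["nice", "photo"],
    ["you", "look", "nice"],
]
labels = [1, 1, 0, 0]

for word, score in top_k_words(docs, labels, k=5):
    print(f"{word}\t{score:.3f}")
```

Note that information gain rewards any word that separates the classes, whether it signals harassment or its absence, so a ranked list may need filtering by class if only harassment-related words are wanted.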
About the journal:
Bulletin of Electrical Engineering and Informatics publishes original papers in the fields of electrical, computer, and informatics engineering. Its scope covers, but is not limited to, the following: Computer Science, Computer Engineering and Informatics[...] Electronics[...] Electrical and Power Engineering[...] Telecommunication and Information Technology[...] Instrumentation and Control Engineering[...]