Wan Noor Anira Wan Ali, M. Mohd, F. Fauzi, Kiyoaki Shirai, Muhammad Junaidi Mahamad Noor
{"title":"利用机器学习方法实现超参数优化和超采样检测网络欺凌","authors":"Wan Noor Anira Wan Ali, M. Mohd, F. Fauzi, Kiyoaki Shirai, Muhammad Junaidi Mahamad Noor","doi":"10.22452/mjcs.sp2021no2.6","DOIUrl":null,"url":null,"abstract":"Online social networks have become a necessity to everyone around the world. Particularly, online social networks have enabled us to connect to one another regardless of time, for as long as we have social media and social networking as platforms for broadcasting information and communicating, respectively. However, this evolution has resulted in people possibly committing various cybercrimes, such as cyberbullying. To address this issue, machine learning can be utilised to counter cyberbullying in online social networks. Thus, this study proposed a framework with a set of features consisting of word and character term frequency–inverse document frequency and word embedding by using Word2vec and six types of list terms: profane words, proper nouns, negation words, ‘allness’ term, diminisher words and intensifier words. These features were divided into four groups before being fed into the linear support vector classifier to train our model using ASKfm as data set in hyperparameter tuning and over-sampling environment. Results indicated that the proposed framework provided significant outcomes, in which the highest percentage of area under curve is 99.24% and F-measure is 97.38% as performed by our trained model.","PeriodicalId":49894,"journal":{"name":"Malaysian Journal of Computer Science","volume":" ","pages":""},"PeriodicalIF":1.1000,"publicationDate":"2021-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"IMPLEMENTATION OF HYPERPARAMETER OPTIMISATION AND OVER-SAMPLING IN DETECTING CYBERBULLYING USING MACHINE LEARNING APPROACH\",\"authors\":\"Wan Noor Anira Wan Ali, M. Mohd, F. Fauzi, Kiyoaki Shirai, Muhammad Junaidi Mahamad Noor\",\"doi\":\"10.22452/mjcs.sp2021no2.6\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Online social networks have become a necessity to everyone around the world. Particularly, online social networks have enabled us to connect to one another regardless of time, for as long as we have social media and social networking as platforms for broadcasting information and communicating, respectively. However, this evolution has resulted in people possibly committing various cybercrimes, such as cyberbullying. To address this issue, machine learning can be utilised to counter cyberbullying in online social networks. Thus, this study proposed a framework with a set of features consisting of word and character term frequency–inverse document frequency and word embedding by using Word2vec and six types of list terms: profane words, proper nouns, negation words, ‘allness’ term, diminisher words and intensifier words. These features were divided into four groups before being fed into the linear support vector classifier to train our model using ASKfm as data set in hyperparameter tuning and over-sampling environment. Results indicated that the proposed framework provided significant outcomes, in which the highest percentage of area under curve is 99.24% and F-measure is 97.38% as performed by our trained model.\",\"PeriodicalId\":49894,\"journal\":{\"name\":\"Malaysian Journal of Computer Science\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":1.1000,\"publicationDate\":\"2021-12-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Malaysian Journal of Computer Science\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.22452/mjcs.sp2021no2.6\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Malaysian Journal of Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.22452/mjcs.sp2021no2.6","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
IMPLEMENTATION OF HYPERPARAMETER OPTIMISATION AND OVER-SAMPLING IN DETECTING CYBERBULLYING USING MACHINE LEARNING APPROACH
Online social networks have become a necessity to everyone around the world. Particularly, online social networks have enabled us to connect to one another regardless of time, for as long as we have social media and social networking as platforms for broadcasting information and communicating, respectively. However, this evolution has resulted in people possibly committing various cybercrimes, such as cyberbullying. To address this issue, machine learning can be utilised to counter cyberbullying in online social networks. Thus, this study proposed a framework with a set of features consisting of word and character term frequency–inverse document frequency and word embedding by using Word2vec and six types of list terms: profane words, proper nouns, negation words, ‘allness’ term, diminisher words and intensifier words. These features were divided into four groups before being fed into the linear support vector classifier to train our model using ASKfm as data set in hyperparameter tuning and over-sampling environment. Results indicated that the proposed framework provided significant outcomes, in which the highest percentage of area under curve is 99.24% and F-measure is 97.38% as performed by our trained model.
期刊介绍:
The Malaysian Journal of Computer Science (ISSN 0127-9084) is published four times a year in January, April, July and October by the Faculty of Computer Science and Information Technology, University of Malaya, since 1985. Over the years, the journal has gained popularity and the number of paper submissions has increased steadily. The rigorous reviews from the referees have helped in ensuring that the high standard of the journal is maintained. The objectives are to promote exchange of information and knowledge in research work, new inventions/developments of Computer Science and on the use of Information Technology towards the structuring of an information-rich society and to assist the academic staff from local and foreign universities, business and industrial sectors, government departments and academic institutions on publishing research results and studies in Computer Science and Information Technology through a scholarly publication. The journal is being indexed and abstracted by Clarivate Analytics'' Web of Science and Elsevier''s Scopus