A Comparative Analysis of Machine Learning Techniques for Cyberbullying Detection on FormSpring in Textual Modality

JCR: Q1, Mathematics

International Journal of Computer Network and Information Security | Pub Date: 2023-08-08 | DOI: 10.5815/ijcnis.2023.04.04

Sahana V., Anil Kumar K. M., Abdulbasit A. Darem


Citations: 0

Abstract

Social media usage has grown tremendously with the rise of the internet, and social media has become the most powerful networking platform of the twenty-first century. However, a number of undesirable phenomena are associated with the increased use of social networking, such as cyberbullying (CB), cybercrime, online abuse and online trolling. Cyberbullying can have severe psychological and physical effects, especially on children and women, and can even lead to self-harm or suicide. Because of this significant detrimental social impact, the detection of CB text or messages on social media has attracted growing research attention. To mitigate CB, we propose an automated cyberbullying detection model that classifies content as either bullying or non-bullying (a binary classification model), creating a more secure social media experience. The proposed model uses Natural Language Processing (NLP) techniques and Machine Learning (ML) approaches to assess cyberbullying content. Our main goal is to compare the performance of different machine learning algorithms in cyberbullying detection on a labelled dataset from Formspring [1]. Nine popular machine learning classifiers are considered: Bootstrap Aggregation (Bagging), Stochastic Gradient Descent (SGD), Random Forest (RF), Decision Tree (DT), Linear Support Vector Classifier (Linear SVC), Logistic Regression (LR), Adaptive Boosting (AdaBoost), Multinomial Naive Bayes (MNB) and K-Nearest Neighbour (KNN). In addition, we experiment with a feature extraction method, CountVectorizer, to obtain features that aid classification. The results show that the AdaBoost classifier achieves a classification accuracy of 86.52%, higher than that of all the other machine learning algorithms used in this study. The proposed work demonstrates the effectiveness of machine learning algorithms in automatic cyberbullying detection, in contrast to more intensive and time-consuming approaches to the same problem, thereby facilitating the easy incorporation of an effective approach as a tool across different platforms and enabling people to use social media safely.
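The pipeline the abstract describes (token-count features feeding a binary bullying/non-bullying classifier, scored by accuracy) can be illustrated with a small dependency-free sketch. It is not the authors' implementation: the abstract gives no preprocessing steps, hyperparameters or data split. The sketch therefore assumes a simple bag-of-words count (the kind of feature CountVectorizer produces) and uses Multinomial Naive Bayes, one of the nine classifiers listed, with Laplace smoothing. The toy posts, labels and function names are invented for illustration and do not come from the Formspring dataset.

```python
import math
import re
from collections import Counter

# Hypothetical toy posts (1 = bullying, 0 = non-bullying); invented for illustration.
TRAIN = [
    ("you are so stupid and ugly", 1),
    ("nobody likes you loser", 1),
    ("shut up you idiot", 1),
    ("what is your favourite movie", 0),
    ("have a great day", 0),
    ("do you like pizza", 0),
]
TEST = [
    ("you ugly loser", 1),
    ("what movie do you like", 0),
]


def tokenize(text):
    """Lowercase the text and keep word tokens of two or more characters."""
    return re.findall(r"[a-z0-9]{2,}", text.lower())


def fit_mnb(samples, alpha=1.0):
    """Fit multinomial naive Bayes on token counts with additive smoothing."""
    class_docs = Counter()   # number of posts per class
    token_counts = {}        # per-class token frequency tables
    vocab = set()
    for text, label in samples:
        class_docs[label] += 1
        counts = token_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    n_docs = sum(class_docs.values())
    model = {}
    for label, counts in token_counts.items():
        total = sum(counts.values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(class_docs[label] / n_docs),
            "log_lik": {t: math.log((counts[t] + alpha) / total) for t in vocab},
        }
    return model


def predict(model, text):
    """Return the class label with the highest posterior log-probability."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for tok in tokenize(text):
            # Tokens never seen in training are ignored.
            if tok in params["log_lik"]:
                score += params["log_lik"][tok]
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(model, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(model, text) == label for text, label in samples)
    return correct / len(samples)


model = fit_mnb(TRAIN)
print(accuracy(model, TEST))
```

Comparing several classifiers, as the study does, would mean fitting each one on the same count features and reporting `accuracy` on the same held-out posts. The 86.52% figure for AdaBoost comes from the paper's own experiments and cannot be reproduced from this toy data.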


Source journal

International Journal of Computer Network and Information Security (Social Sciences - Safety Research)

CiteScore

4.10

Self-citation rate

0.00%

Publication count