Impact analysis of feature selection techniques on cyberstalking detection

Kumar Gautam Arvind, Bansal Abhishek
Journal: i-manager’s Journal on Image Processing
DOI: 10.26634/jip.9.4.19138 (https://doi.org/10.26634/jip.9.4.19138)

Abstract

Internet-based applications have become habitual in society and have opened new ways to commit online crimes. Numerous cybercriminals are active on the different platforms of the internet-based virtual world, carrying out cybercrimes according to preplanned agendas. As technology advances, cyberstalking, cyberbullying, and other forms of cyber harassment are growing on social media, email, and other online platforms. Cyberstalking uses internet-based technology to harass, intimidate, and undermine individuals online through a variety of approaches. To examine the impact of feature selection strategies on model performance, this paper proposes a machine learning-based cyberstalking detection model. The proposed model uses the Term Frequency-Inverse Document Frequency (TF-IDF) method to extract features, and three distinct approaches, among them TF-IDF + Chi-Square Test and TF-IDF + Information Gain, to select different numbers of relevant features. A Support Vector Machine (SVM) is employed for classification. The impact of each feature selection approach on the various feature sets is assessed based on the SVM classifier's performance. According to the experimental findings, the TF-IDF + Chi-Square Test outperforms the other applied approaches and improves detection model performance. The findings also show that the TF-IDF + Chi-Square Test approach performs better than the other approaches on a small set of relevant features.
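The chi-square selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the authors computed the statistic, so the code below is an assumption: it uses the textbook 2x2 formulation on term presence per document rather than on TF-IDF weights, and the corpus, labels, and function names (`DOCS`, `chi_square_scores`, `select_top_k`) are hypothetical and not taken from the paper.

```python
from collections import Counter

# Hypothetical toy corpus (not from the paper): 1 = harassing, 0 = benign.
DOCS = [
    ("i will find you and hurt you", 1),
    ("i know where you live", 1),
    ("you cannot hide i will find you", 1),
    ("see you at lunch tomorrow", 0),
    ("thanks for the notes from class", 0),
    ("lunch was great thanks", 0),
]


def chi_square_scores(docs):
    """Score each term with the 2x2 chi-square statistic (term presence vs. class)."""
    n = len(docs)
    n_pos = sum(label for _, label in docs)
    df_all = Counter()  # number of documents containing the term
    df_pos = Counter()  # number of positive documents containing the term
    for text, label in docs:
        for term in set(text.split()):
            df_all[term] += 1
            if label == 1:
                df_pos[term] += 1

    scores = {}
    for term, present in df_all.items():
        a = df_pos[term]       # term present, positive class
        b = present - a        # term present, negative class
        c = n_pos - a          # term absent, positive class
        d = n - n_pos - b      # term absent, negative class
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(docs, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    scores = chi_square_scores(docs)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    print(select_top_k(DOCS, 5))
```

In a full pipeline of the kind the abstract describes, the selected terms would restrict the TF-IDF feature matrix before it is passed to the SVM classifier. Varying `k` corresponds to evaluating feature sets of different sizes.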