Duc M Cao, Md Abu Sayed, Md Tuhin Mia, Eftekhar Hossain Ayon, Bishnu Padh Ghosh, Rejon Kumar Ray, Aqib Raihan, Aslima Akter, Mamunur Rahman
Journal of Computer Science and Technology Studies. Published 2024-01-02. DOI: 10.32996/jcsts.2024.6.1.5 (https://doi.org/10.32996/jcsts.2024.6.1.5)
Advanced Cybercrime Detection: A Comprehensive Study on Supervised and Unsupervised Machine Learning Approaches Using Real-world Datasets
In the ever-evolving field of cybersecurity, methods that combine supervised and unsupervised machine learning are used to tackle cybercrime. The supervised tools include Support Vector Machines (SVM) and K-Nearest Neighbors (KNN), while the unsupervised methods include the K-means clustering model. These techniques are applied to the publicly available StatLine dataset from CBS, which contains the individual attributes of one thousand crime victims. Performance analysis, which examines the differences between training and testing data, shows that SVM reaches 91% accuracy in supervised classification. KNN models reach 79.56% accuracy in detecting criminal activity. The assessment relies on the confusion-matrix counts, True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN), together with False Alarm Rate (FAR), Detection Rate (DR), Accuracy (ACC), Recall, Precision, Specificity, Sensitivity, and Fowlkes–Mallows scores.
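
The metrics named in the abstract can all be derived from the four confusion-matrix counts. The following is a minimal Python sketch of those standard definitions, not the authors' code. The names `ConfusionCounts`, `safe_ratio`, and `detection_metrics` are hypothetical, the example counts are invented for illustration and are not results from the paper, and returning 0.0 for an empty denominator is an assumption of this sketch. The Fowlkes–Mallows score is computed as the geometric mean of precision and recall; when it is used to score a clustering such as K-means, TP, FP, and FN count pairs of points rather than individual samples.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class ConfusionCounts:
    """The four confusion-matrix counts for a binary detector."""
    tp: int  # true positives
    tn: int  # true negatives
    fp: int  # false positives
    fn: int  # false negatives


def safe_ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 for an empty denominator (an assumption of this sketch)."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the standard metrics listed in the abstract from the counts."""
    precision = safe_ratio(c.tp, c.tp + c.fp)
    recall = safe_ratio(c.tp, c.tp + c.fn)  # same quantity as sensitivity and DR
    return {
        "accuracy": safe_ratio(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn),
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "detection_rate": recall,
        "specificity": safe_ratio(c.tn, c.tn + c.fp),
        "false_alarm_rate": safe_ratio(c.fp, c.fp + c.tn),
        "fowlkes_mallows": sqrt(precision * recall),
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not figures from the paper.
    example = ConfusionCounts(tp=40, tn=50, fp=5, fn=5)
    for name, value in detection_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Recall, Sensitivity, and Detection Rate are the same ratio under different names, and False Alarm Rate is one minus Specificity, so the list in the abstract reduces to five independent quantities.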