Evaluating the Performance and Challenges of Machine Learning Models in Network Anomaly Detection

International Journal of Scientific Research in Science, Engineering and Technology. Pub Date: 2024-05-12. DOI: 10.32628/ijsrset5241134

Sakshi Bakhare, Dr. Sudhir W. Mohod



Abstract


This study examines the application of machine learning algorithms to anomaly detection in network traffic data. Working with a collection of network flow records whose attributes include IP addresses, ports, protocols, and timestamps, the study uses correlation heatmaps, box plots, and other data visualizations to identify trends in the numerical features. After preprocessing, which includes converting timestamps to Unix format, three machine learning models are used for anomaly identification: Support Vector Machine (SVM), Gaussian Naive Bayes, and Random Forest. The Random Forest classifier outperforms the SVM and Naive Bayes classifiers, with better precision and recall for anomaly detection and an accuracy of 87%. The models are evaluated with confusion matrices and classification reports, which show that the Random Forest classifier identifies anomalies in network traffic better than the other two models. These results highlight the effectiveness of machine learning models, and of the Random Forest classifier in particular, in strengthening anomaly detection for network security.
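The abstract does not include the authors' code, so the following is only a minimal sketch of two steps it mentions: converting timestamps to Unix format, and deriving precision, recall, and accuracy from a binary confusion matrix. The timestamp format, the label convention (1 = anomaly, 0 = normal), the toy labels, and the function names to_unix, confusion_counts, and classification_summary are all assumptions made for illustration, not details taken from the paper.

```python
from datetime import datetime, timezone


def to_unix(ts: str, fmt: str = "%Y-%m-%d %H:%M:%S") -> int:
    """Parse a timestamp string as UTC and return whole Unix seconds.

    The format string is an assumption; the paper does not state
    how its timestamps are encoded.
    """
    parsed = datetime.strptime(ts, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 as 'anomaly'."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_summary(y_true, y_pred):
    """Compute precision, recall, and accuracy for the anomaly class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        # Share of flagged flows that really were anomalies.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of real anomalies that were flagged.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of all flows that were classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
    }


# Toy labels, invented for illustration (not data from the study).
toy_true = [1, 0, 1, 1, 0, 0, 1, 0]
toy_pred = [1, 0, 0, 1, 0, 1, 1, 0]
toy_summary = classification_summary(toy_true, toy_pred)
```

Reporting precision and recall alongside accuracy matters in anomaly detection because anomalous flows are usually rare: a model that labels everything as normal can score a high accuracy while detecting nothing.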
