评估网络异常检测中机器学习模型的性能和挑战

Sakshi Bakhare, Dr. Sudhir W. Mohod
{"title":"评估网络异常检测中机器学习模型的性能和挑战","authors":"Sakshi Bakhare, Dr. Sudhir W. Mohod","doi":"10.32628/ijsrset5241134","DOIUrl":null,"url":null,"abstract":"The application of machine learning algorithms for anomaly detection in network traffic data is examined in this study. Using a collection of network flow records that includes attributes such as IP addresses, ports, protocols, and timestamps, the study makes use of correlation heatmaps, box plots, and data visualization to identify trends in numerical characteristics. After preprocessing, which includes timestamp conversion to Unix format, three machine learning models Support Vector Machine (SVM), Gaussian Naive Bayes, and Random Forest are used for anomaly identification. The Random Forest Classifier outperforms SVM and Naive Bayes classifiers with better precision and recall for anomaly diagnosis, achieving an accuracy of 87%. Confusion matrices and classification reports are used to evaluate the models, and they show that the Random Forest Classifier performs better than the other models in identifying abnormalities in network traffic. These results provide significant value to the field of cybersecurity by highlighting the effectiveness of machine learning models specifically, the Random Forest Classifier in boosting anomaly detection capacities for network environment security.","PeriodicalId":14228,"journal":{"name":"International Journal of Scientific Research in Science, Engineering and Technology","volume":"103 49","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Evaluating the Performance and Challenges of Machine Learning Models in Network Anomaly Detection\",\"authors\":\"Sakshi Bakhare, Dr. Sudhir W. Mohod\",\"doi\":\"10.32628/ijsrset5241134\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The application of machine learning algorithms for anomaly detection in network traffic data is examined in this study. Using a collection of network flow records that includes attributes such as IP addresses, ports, protocols, and timestamps, the study makes use of correlation heatmaps, box plots, and data visualization to identify trends in numerical characteristics. After preprocessing, which includes timestamp conversion to Unix format, three machine learning models Support Vector Machine (SVM), Gaussian Naive Bayes, and Random Forest are used for anomaly identification. The Random Forest Classifier outperforms SVM and Naive Bayes classifiers with better precision and recall for anomaly diagnosis, achieving an accuracy of 87%. Confusion matrices and classification reports are used to evaluate the models, and they show that the Random Forest Classifier performs better than the other models in identifying abnormalities in network traffic. These results provide significant value to the field of cybersecurity by highlighting the effectiveness of machine learning models specifically, the Random Forest Classifier in boosting anomaly detection capacities for network environment security.\",\"PeriodicalId\":14228,\"journal\":{\"name\":\"International Journal of Scientific Research in Science, Engineering and Technology\",\"volume\":\"103 49\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-05-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Scientific Research in Science, Engineering and Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.32628/ijsrset5241134\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Scientific Research in Science, Engineering and Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.32628/ijsrset5241134","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

本研究探讨了机器学习算法在网络流量数据异常检测中的应用。该研究使用网络流量记录集合(包括 IP 地址、端口、协议和时间戳等属性),利用相关热图、箱形图和数据可视化来识别数字特征的趋势。经过预处理(包括将时间戳转换为 Unix 格式)后,三种机器学习模型支持向量机 (SVM)、高斯直觉贝叶斯 (Gaussian Naive Bayes) 和随机森林 (Random Forest) 被用于异常识别。在异常诊断方面,随机森林分类器的精度和召回率均优于 SVM 和 Naive Bayes 分类器,准确率达到 87%。混淆矩阵和分类报告用于对模型进行评估,结果表明随机森林分类器在识别网络流量异常方面的表现优于其他模型。这些结果为网络安全领域提供了重要价值,凸显了机器学习模型(特别是随机森林分类器)在提高网络环境安全异常检测能力方面的有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Evaluating the Performance and Challenges of Machine Learning Models in Network Anomaly Detection
The application of machine learning algorithms for anomaly detection in network traffic data is examined in this study. Using a collection of network flow records that includes attributes such as IP addresses, ports, protocols, and timestamps, the study makes use of correlation heatmaps, box plots, and data visualization to identify trends in numerical characteristics. After preprocessing, which includes timestamp conversion to Unix format, three machine learning models Support Vector Machine (SVM), Gaussian Naive Bayes, and Random Forest are used for anomaly identification. The Random Forest Classifier outperforms SVM and Naive Bayes classifiers with better precision and recall for anomaly diagnosis, achieving an accuracy of 87%. Confusion matrices and classification reports are used to evaluate the models, and they show that the Random Forest Classifier performs better than the other models in identifying abnormalities in network traffic. These results provide significant value to the field of cybersecurity by highlighting the effectiveness of machine learning models specifically, the Random Forest Classifier in boosting anomaly detection capacities for network environment security.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信