Network Anomaly Detection Using LightGBM: A Gradient Boosting Classifier

Md. Khairul Islam, Prithula Hridi, Md. Shohrab Hossain, Husnu S. Narman
Published in: 2020 30th International Telecommunication Networks and Applications Conference (ITNAC), 2020-11-25
DOI: 10.1109/ITNAC50341.2020.9315049 (https://doi.org/10.1109/ITNAC50341.2020.9315049)
Citations: 9

Abstract

Anomaly detection systems play an important role in recognizing intruders and suspicious activity because they can detect unseen and unknown attacks. In this paper, we work with UNSW-NB15, a benchmark network anomaly detection dataset that reflects modern-day network traffic. Previous work on this dataset either lacked a proper validation approach or followed only one evaluation setup, which makes it difficult to compare contributions across studies that use the same dataset but different validation steps. We use LightGBM, a gradient boosting machine learning classifier, to perform binary classification on this dataset. We present a thorough study of the dataset, covering feature engineering, preprocessing, and feature selection. We evaluate our model under several experimental setups used in previous work so that it can be compared directly with them. Using ten-fold cross-validation on the training, test, and combined (training and test) datasets, our model achieves F1 scores of 97.21%, 98.33%, and 96.21%, respectively. In addition, the model fitted only on the training data achieves an F1 score of 92.96% on the separate test data, so it also performs well on unseen data. We present complete comparisons with prior work using all performance metrics reported there, and show that our model outperforms prior work on most metrics and can thus detect network anomalies better.
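The two evaluation setups named in the abstract (ten-fold cross-validation scored by F1, and a model fitted on the training split and scored on a separate test split) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not say whether the folds were stratified or how per-fold scores were aggregated, so the stratified splitting and the simple mean over folds are assumptions. `MeanThresholdStump` is a hypothetical toy stand-in so the sketch runs with the standard library only; in practice any classifier exposing `fit` and `predict`, such as `lightgbm.LGBMClassifier`, could be passed as `make_model`. The function names are likewise illustrative.

```python
import random
from typing import Callable, List, Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1: harmonic mean of precision and recall (positive class = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: report 0 instead of dividing by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def stratified_folds(y: Sequence[int], k: int = 10, seed: int = 0) -> List[List[int]]:
    """Deal each class's shuffled indices round-robin into k folds.

    Assumption: stratification is not stated in the abstract.
    """
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for label in sorted(set(y)):
        members = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    return folds


def cross_val_f1(X, y, make_model: Callable[[], object], k: int = 10, seed: int = 0) -> float:
    """Mean F1 over k folds; a fresh model is fitted for every fold."""
    scores = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = make_model()
        model.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        pred = model.predict([X[i] for i in test_idx])
        scores.append(f1_score([y[i] for i in test_idx], pred))
    return sum(scores) / len(scores)


def holdout_f1(X_train, y_train, X_test, y_test, make_model: Callable[[], object]) -> float:
    """Fit on the training split only, then score on the separate test split."""
    model = make_model()
    model.fit(X_train, y_train)
    return f1_score(y_test, model.predict(X_test))


class MeanThresholdStump:
    """Hypothetical toy classifier (NOT LightGBM): thresholds feature 0
    midway between the two class means, assuming class 1 has larger values."""

    def fit(self, X, y):
        pos = [row[0] for row, label in zip(X, y) if label == 1]
        neg = [row[0] for row, label in zip(X, y) if label == 0]
        self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def predict(self, X):
        return [1 if row[0] > self.threshold else 0 for row in X]


if __name__ == "__main__":
    # Synthetic, well-separated data; it stands in for real traffic features.
    X = [[float(i)] for i in range(50)] + [[float(i)] for i in range(100, 150)]
    y = [0] * 50 + [1] * 50
    print("10-fold CV F1:", cross_val_f1(X, y, MeanThresholdStump, k=10))
    print("hold-out F1:", holdout_f1(X, y, X, y, MeanThresholdStump))
```

Passing a factory (`make_model`) rather than a fitted model keeps the folds independent, since each fold trains from scratch and never sees its own held-out rows. The hold-out function is kept separate because it answers a different question: how a single model fitted once on the training split behaves on data it was never fitted on.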