Network Traffic Anomaly Detection Based on Information Gain and Deep Learning
Authors: Xianglin Lu, Pengju Liu, Jiayi Lin
Published in: Proceedings of the 2019 3rd International Conference on Information System and Data Mining
Publication date: 2019-04-06
DOI: 10.1145/3325917.3325946 (https://doi.org/10.1145/3325917.3325946)
Citations: 9
Abstract
With the rapid development of the Internet, network traffic has grown explosively. Although the Internet makes everyday life more convenient, it also brings many security threats. Analyzing abnormal behavior in network traffic has therefore become crucial to ensuring the quality of Internet services and preventing network intrusion. This paper proposes a deep learning method that combines a CNN and an LSTM to detect abnormal network traffic, especially unknown intrusions. In machine learning, the choice of features is key to a model's effectiveness and accuracy. This paper therefore also proposes a feature selection method based on Information Gain (IG) that extracts the more valuable features, which are then fed into the model. We use the CNN to extract higher-dimensional features from the input data and then use the LSTM to learn the temporal characteristics of the network traffic. We applied our model to the KDD99 dataset and assessed its accuracy. After more than 4 epochs of training, the training accuracy reaches 0.99 and the testing accuracy reaches 0.925, an improvement over the traditional model. As information volume grows ever denser, the analysis of network traffic will become increasingly necessary, which points to broad application prospects for this approach.
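The abstract does not spell out how Information Gain is computed or how many features are kept, so the following is only a minimal sketch of the standard definition, IG(Y; X) = H(Y) - H(Y | X), applied to discrete features. The function names, the toy records, and the idea of ranking features by score are assumptions made for illustration; the column names `protocol_type` and `flag` are borrowed from KDD99, but the values below are made up, and continuous features would need to be discretized first.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty sequence of discrete labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature X."""
    # Group the class labels by the value the feature takes.
    groups = defaultdict(list)
    for x, y in zip(feature_values, labels):
        groups[x].append(y)
    total = len(labels)
    # H(Y | X): entropy of each group, weighted by the group's share of rows.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, names):
    """Return (name, IG) pairs sorted from most to least informative."""
    scores = [
        (name, information_gain([row[i] for row in rows], labels))
        for i, name in enumerate(names)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy records (hypothetical values, not taken from the paper or the dataset).
rows = [
    ("tcp", "SF"), ("tcp", "S0"), ("udp", "SF"), ("tcp", "S0"),
    ("udp", "SF"), ("tcp", "SF"), ("icmp", "S0"), ("icmp", "SF"),
]
labels = ["normal", "attack", "normal", "attack",
          "normal", "normal", "attack", "normal"]

ranking = rank_features(rows, labels, ["protocol_type", "flag"])
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

In this toy data `flag` separates the two classes perfectly, so its gain equals the full label entropy and it ranks above `protocol_type`. A selection step would then keep the top-ranked features, or those above some threshold, before passing them to the CNN and LSTM; which rule the authors used is not stated in the abstract.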