Analysis of data pre-processing influence on intrusion detection using NSL-KDD dataset
N. Paulauskas, Juozas Auskalnis
2017 Open Conference of Electrical, Electronic and Information Sciences (eStream), 2017-04-01
DOI: 10.1109/ESTREAM.2017.7950325 (https://doi.org/10.1109/ESTREAM.2017.7950325)
Citations: 56
Abstract
Data pre-processing for machine learning methods is a key step in the knowledge discovery process. Depending on the nature of the data, pre-processing may take up the majority of the data analysis time. Correctly prepared data guarantees precise and reliable results of data analysis. This paper analyses the influence of initial data pre-processing on attack detection accuracy using Decision Tree, Naïve Bayes and Rule-Based classifiers with the NSL-KDD dataset. In addition, results are presented showing how attack detection accuracy depends on the choice of attack grouping options and on the use of ensembles of various classifiers.
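The abstract does not say which pre-processing steps or grouping options were compared, so the following is only a minimal sketch of what such steps commonly look like for NSL-KDD records, not the authors' procedure. It assumes min-max scaling for numeric features (such as `src_bytes`), one-hot encoding for nominal features (such as `protocol_type`), and two illustrative grouping schemes: binary (normal vs. attack) and the usual five categories. The helper names and the partial label table are hypothetical.

```python
# Illustrative only: a partial mapping of well-known NSL-KDD attack labels
# to the commonly used categories. The paper's own grouping may differ.
ATTACK_CATEGORY = {
    "normal": "normal",
    "neptune": "DoS", "smurf": "DoS", "back": "DoS", "teardrop": "DoS",
    "ipsweep": "Probe", "nmap": "Probe", "portsweep": "Probe", "satan": "Probe",
    "guess_passwd": "R2L", "ftp_write": "R2L", "imap": "R2L",
    "buffer_overflow": "U2R", "rootkit": "U2R", "loadmodule": "U2R",
}


def group_label(label, scheme="five_class"):
    """Map a raw attack label to a class under the chosen grouping scheme."""
    category = ATTACK_CATEGORY.get(label, "unknown")
    if scheme == "binary":
        # Anything that is not normal traffic counts as an attack.
        return "normal" if category == "normal" else "attack"
    if scheme == "five_class":
        return category
    raise ValueError(f"unsupported grouping scheme: {scheme}")


def min_max_scale(values):
    """Rescale a numeric feature column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode a nominal feature column as 0/1 indicator vectors."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


if __name__ == "__main__":
    print(group_label("neptune"), group_label("neptune", scheme="binary"))
    print(min_max_scale([0, 5, 10]))
    print(one_hot(["tcp", "udp", "tcp"]))
```

Changing `scheme` alters the number of target classes the classifiers must separate, which is the kind of grouping choice whose effect on accuracy the abstract refers to.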