Statistical machine learning for network intrusion detection: a data quality perspective
E. Lauría, G. Tayi
International Journal of Services Sciences, published 2008-07-18
DOI: 10.1504/IJSSCI.2008.019611 (https://doi.org/10.1504/IJSSCI.2008.019611)
Citation count: 9
Abstract
This paper presents our research on applying statistical machine learning methods to network intrusion detection. With the advent of online distributed services, preventing network intrusion and other information security failures is gaining prominence. We use two classification algorithms (decision trees and naive Bayes classifiers) to build predictive models that distinguish 'bad' TCP/IP connections, called intrusions or attacks, from 'good' normal TCP/IP connections. We investigate the effect of training the models on both clean and dirty data. The goal is to analyse the predictive power of network intrusion classification models trained with data of varying quality. For comparison, the classifiers are contrasted with a clustering-based approach.
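The clean-versus-dirty comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' setup: the data is synthetic (two numeric features standing in for connection attributes), "dirty" data is modelled as randomly flipped training labels, and the classifier is a small hand-written Gaussian naive Bayes. The function names (`make_connections`, `corrupt_labels`, `fit_gaussian_nb`) are hypothetical.

```python
import math
import random


def make_connections(n, rng):
    """Synthetic stand-in for connection records: two features, label 1 = intrusion."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 3.0 if label == 1 else 0.0
        rows.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return rows


def corrupt_labels(rows, rate, rng):
    """Flip each label with probability `rate` (one assumed form of 'dirty' data)."""
    return [(x, (1 - y) if rng.random() < rate else y) for x, y in rows]


def fit_gaussian_nb(rows):
    """Estimate per-class prior, feature means, and feature variances."""
    model = {}
    for label in (0, 1):
        xs = [x for x, y in rows if y == label]
        n_features = len(xs[0])
        means = [sum(x[j] for x in xs) / len(xs) for j in range(n_features)]
        variances = [
            sum((x[j] - means[j]) ** 2 for x in xs) / len(xs) + 1e-9
            for j in range(n_features)
        ]
        model[label] = (len(xs) / len(rows), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior under the naive Bayes assumption."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for value, mean, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return score
    return max((0, 1), key=log_posterior)


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the given label."""
    return sum(predict(model, x) == y for x, y in rows) / len(rows)


rng = random.Random(0)
train = make_connections(1000, rng)
test = make_connections(500, rng)  # the test set keeps its clean labels

clean_model = fit_gaussian_nb(train)
dirty_model = fit_gaussian_nb(corrupt_labels(train, 0.3, rng))

clean_acc = accuracy(clean_model, test)
dirty_acc = accuracy(dirty_model, test)
print(f"clean-trained accuracy: {clean_acc:.3f}")
print(f"dirty-trained accuracy: {dirty_acc:.3f}")
```

Both models are scored on the same clean test set, so any difference in accuracy reflects only the quality of the training data. Varying the `rate` argument gives a rough picture of how predictive power changes with the noise level; other kinds of dirtiness, such as missing or corrupted feature values, would need their own corruption function.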