Statistical machine learning for network intrusion detection: a data quality perspective
E. Lauría, G. Tayi
International Journal of Services Sciences, published 2008-07-18
DOI: 10.1504/IJSSCI.2008.019611 (https://doi.org/10.1504/IJSSCI.2008.019611)
In this paper, we present our research on applying statistical machine learning methods to network intrusion detection. With the advent of online distributed services, preventing network intrusion and other information security failures is gaining prominence. We use two classification algorithms (decision trees and the naive Bayes classifier) to build predictive models capable of distinguishing 'bad' TCP/IP connections, called intrusions or attacks, from 'good', normal TCP/IP connections. We investigate the effect of training the models using both clean and dirty data. The goal is to analyse the predictive power of network intrusion classification models trained with data of varying quality. For comparison, the classifiers are contrasted with a clustering-based approach.
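The clean-versus-dirty comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and none of it is taken from the paper: the data are synthetic two-feature records rather than real TCP/IP connections, "dirty" data is modelled only as randomly flipped training labels, and just one of the two classifiers (a Gaussian naive Bayes, written from scratch) is shown. The names `make_connections`, `flip_labels`, `fit_gaussian_nb`, `predict` and `accuracy` are hypothetical helpers.

```python
import math
import random


def make_connections(n, rng):
    """Synthetic stand-in for connection records: label 1 = intrusion, 0 = normal."""
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        centre = 3.0 if y else 0.0  # assumed class separation, not from the paper
        rows.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        labels.append(y)
    return rows, labels


def flip_labels(labels, rate, rng):
    """Simulate dirty training data by flipping each label with probability `rate`."""
    return [1 - y if rng.random() < rate else y for y in labels]


def fit_gaussian_nb(rows, labels):
    """Estimate a log prior and per-feature (mean, variance) for every class."""
    model = {}
    for c in set(labels):
        members = [r for r, y in zip(rows, labels) if y == c]
        stats = []
        for j in range(len(rows[0])):
            col = [r[j] for r in members]
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
            stats.append((mean, var))
        model[c] = (math.log(len(members) / len(rows)), stats)
    return model


def predict(model, row):
    """Return the class with the highest log posterior under the naive Bayes model."""
    def log_posterior(c):
        log_prior, stats = model[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
            for x, (mean, var) in zip(row, stats)
        )
    return max(model, key=log_posterior)


def accuracy(model, rows, labels):
    """Fraction of rows whose predicted class matches the given label."""
    hits = sum(predict(model, r) == y for r, y in zip(rows, labels))
    return hits / len(labels)


rng = random.Random(0)
train_x, train_y = make_connections(1000, rng)
test_x, test_y = make_connections(500, rng)  # test labels are kept clean

acc_clean = accuracy(fit_gaussian_nb(train_x, train_y), test_x, test_y)
dirty_y = flip_labels(train_y, 0.3, rng)  # 30% label noise, an arbitrary choice
acc_dirty = accuracy(fit_gaussian_nb(train_x, dirty_y), test_x, test_y)

print(f"clean-trained accuracy: {acc_clean:.3f}")
print(f"dirty-trained accuracy: {acc_dirty:.3f}")
```

Both models are scored on the same clean test set, so any gap between the two figures reflects only the quality of the training labels. Other kinds of dirty data, such as corrupted or missing feature values, would need a different corruption step.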