Network Intrusion Detection Model Based on PCA + ADASYN and XGBoost
Leilei Pan, X. Xie
Proceedings of the 2020 3rd International Conference on E-Business, Information Management and Computer Science, 2020-12-05
DOI: 10.1145/3453187.3453311 (https://doi.org/10.1145/3453187.3453311)
Citations: 5
Abstract
Because of class imbalance and redundant sample features, network intrusion detection models based on classification algorithms have a high false positive rate (FPR) for minority-class samples. This paper proposes a network intrusion detection model based on PCA + ADASYN and XGBoost. The principal component analysis (PCA) algorithm is first used to reduce redundant features in the data. The adaptive synthetic sampling (ADASYN) algorithm is then used to oversample the minority classes, addressing class imbalance at the data level. Finally, XGBoost is used as the classifier for the detection data. To verify the validity of the model, several groups of comparative experiments were carried out on the KDD CUP99 data set. The FPRs of the proposed model for the minority classes (r2l, u2r) were 17.3% and 19.7%, and the F1 scores were 90.1% and 84.5%, respectively. The experimental results show that addressing data redundancy and class imbalance reduces the detection model's FPR for minority-class samples and improves the F1 score.
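The abstract reports a per-class FPR and F1 for r2l and u2r but does not state how they were computed. The sketch below assumes the standard one-vs-rest definitions, FPR = FP / (FP + TN) and F1 = 2PR / (P + R). The function name `per_class_metrics` and the toy labels are illustrative only and do not come from the paper or from KDD CUP99.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred, positive):
    """One-vs-rest FPR and F1 for the class `positive` (hypothetical helper)."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))

    # Guard every ratio against an empty denominator.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"fpr": fpr, "f1": f1}


# Toy labels, invented for illustration (not experimental data).
y_true = ["normal", "normal", "r2l", "r2l", "u2r", "normal", "r2l", "u2r"]
y_pred = ["normal", "r2l", "r2l", "r2l", "u2r", "normal", "normal", "normal"]

for label in ("r2l", "u2r"):
    print(label, per_class_metrics(y_true, y_pred, label))
```

Treating each minority class as the positive class in turn keeps the metric from being dominated by the majority classes, which is the failure mode the abstract attributes to class imbalance.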