Implementation of Ensemble Learning and Feature Selection for Performance Improvements in Anomaly-Based Intrusion Detection Systems
Authors: Qusyairi Ridho Saeful Fitni, K. Ramli
Published in: 2020 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)
Publication date: 2020-07-01
DOI: 10.1109/IAICT50021.2020.9172014 (https://doi.org/10.1109/IAICT50021.2020.9172014)
Citations: 53
Abstract
In recent years, data security in organizational information systems has become a serious concern. Many attacks are becoming harder for firewalls and antivirus software to detect. To improve security, intrusion detection systems (IDSs) are used to detect anomalies in network traffic. Current IDS technology has performance issues with detection accuracy, detection time, false alarms, and the detection of unknown attacks. Several studies have applied machine-learning approaches as solutions. This study used an ensemble learning approach that combines the strengths of individual detection algorithms. We compared seven single classifiers to identify the most appropriate base classifiers for ensemble learning. Based on the experiments, logistic regression, decision tree, and gradient boosting were chosen for our ensemble model. The Communications Security Establishment and Canadian Institute for Cybersecurity 2018 (CSE-CIC-IDS2018) dataset was used to evaluate the proposed model. Spearman’s rank correlation coefficient was used to identify features that could be left out. The experimental results showed that 23 of the 80 features were selected, and the model achieved the following scores: final accuracy, 98.8%; precision, 98.8%; recall, 97.1%; and F1, 97.9%.
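
The pipeline summarized in the abstract (Spearman-based feature filtering followed by an ensemble of logistic regression, a decision tree, and gradient boosting) could look roughly like the Python sketch below. The abstract does not give the implementation details, so several choices here are assumptions rather than the authors' method: features are kept when the absolute Spearman correlation with the label reaches a hypothetical threshold of 0.1, the three classifiers are combined by soft voting, and a synthetic dataset from scikit-learn stands in for CSE-CIC-IDS2018. The helper names `select_by_spearman` and `build_ensemble` are illustrative only.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def select_by_spearman(X, y, threshold=0.1):
    """Return indices of columns whose |Spearman rho| with y is >= threshold.

    Assumption: the selection rule and the threshold are illustrative;
    the abstract only states that Spearman's coefficient guided selection.
    """
    keep = []
    for j in range(X.shape[1]):
        rho, _ = spearmanr(X[:, j], y)
        # Constant columns give NaN; treat them as uninformative.
        if not np.isnan(rho) and abs(rho) >= threshold:
            keep.append(j)
    return np.array(keep, dtype=int)


def build_ensemble(random_state=0):
    """Combine the three base classifiers named in the abstract.

    Assumption: soft voting; the abstract does not say how the
    base classifiers' outputs are combined.
    """
    return VotingClassifier(
        estimators=[
            # Logistic regression benefits from scaled inputs.
            ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
            ("dt", DecisionTreeClassifier(random_state=random_state)),
            ("gb", GradientBoostingClassifier(random_state=random_state)),
        ],
        voting="soft",
    )


# Synthetic stand-in for the real dataset (assumption, for illustration only).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Select features on the training split only, to avoid leaking test data.
selected = select_by_spearman(X_train, y_train, threshold=0.1)
model = build_ensemble().fit(X_train[:, selected], y_train)
accuracy = model.score(X_test[:, selected], y_test)
print(f"kept {selected.size} of {X.shape[1]} features, accuracy={accuracy:.3f}")
```

Scores from this sketch come from synthetic data and are not comparable to the figures reported in the abstract.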