Consensus hybrid ensemble machine learning for intrusion detection with explainable AI

Impact Factor 7.7 · Zone 2 (Computer Science) · Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Journal of Network and Computer Applications · Pub Date: 2024-12-13 · DOI: 10.1016/j.jnca.2024.104091

Usman Ahmed, Zheng Jiangbin, Sheharyar Khan, Muhammad Tariq Sadiq

Volume 235, Article 104091. Full text: https://www.sciencedirect.com/science/article/pii/S1084804524002686


Abstract

Intrusion detection systems (IDSs) are vital to cybersecurity because they protect computer networks from malicious activity. IDSs can benefit from machine learning; however, individual models may be unable to handle sophisticated and dynamic threats. Current state-of-the-art research frequently concentrates on single machine-learning models for intrusion detection and does not emphasize the need for more flexible and effective alternatives. Current network intrusion detection designs often fall short in efficiency and interpretability. Techniques are required that allow different models to operate together and adjust to dynamic network settings. This research addresses that gap by proposing an ensemble learning strategy for intrusion detection, the "Consensus Hybrid Ensemble Model" (CHEM). We combine different types of models, including linear, nonlinear, and ensemble methods, neural networks, and probabilistic models, using a metaclassifier approach. In this setup, a hybrid of random forest (RF) and decision tree (DT) acts as the metaclassifier in a voting classifier, which uses consensus voting to align predictions from the various base classifiers. This method improves decision-making by taking into account each base classifier's confidence and agreement. Local and global explanation methods, such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), make the model's predictions more transparent. We used several datasets for testing, such as KDD99, NSL-KDD, CIC-IDS2017, BoTNeTIoT, and Edge-IIoTset. The proposed CHEM model performs well across several attack scenarios, including novel and zero-day attacks, demonstrating its ability to identify and adapt to changing cyber threats. Several ablation experiments were conducted on the available datasets to train, test, and evaluate the proposed CHEM model and to compare it with state-of-the-art models. This research combines machine learning algorithms to create a precise IDS that adapts to ever-changing cyber threats.
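The architecture described in the abstract (heterogeneous base classifiers feeding an RF and DT hybrid metaclassifier that votes) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the choice of base estimators, the hyperparameters, the use of `StackingClassifier` with a soft-voting `VotingClassifier` as the final estimator, the helper name `build_chem_like_ensemble`, and the synthetic data are all assumptions made for the example. The abstract does not specify the exact consensus rule.

```python
# Illustrative sketch only: the estimators, hyperparameters, and synthetic data
# below are assumptions, not the configuration reported in the paper.
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def build_chem_like_ensemble(seed=0):
    """Hypothetical helper: stack diverse base models under an RF+DT voter."""
    # One base model per family named in the abstract (assumed picks).
    base = [
        ("linear", make_pipeline(StandardScaler(),
                                 LogisticRegression(max_iter=1000))),
        ("nonlinear", make_pipeline(StandardScaler(),
                                    KNeighborsClassifier(n_neighbors=5))),
        ("ensemble", GradientBoostingClassifier(n_estimators=50,
                                                random_state=seed)),
        ("neural", make_pipeline(StandardScaler(),
                                 MLPClassifier(hidden_layer_sizes=(32,),
                                               max_iter=500,
                                               random_state=seed))),
        ("probabilistic", GaussianNB()),
    ]
    # Metaclassifier: RF and DT combined by soft voting, i.e. by averaging
    # their predicted class probabilities (an assumed consensus rule).
    meta = VotingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
            ("dt", DecisionTreeClassifier(max_depth=5, random_state=seed)),
        ],
        voting="soft",
    )
    # The metaclassifier is trained on out-of-fold probabilities from the
    # base models, so it sees each base model's confidence.
    return StackingClassifier(estimators=base, final_estimator=meta,
                              stack_method="predict_proba", cv=3)


# Synthetic stand-in for an intrusion dataset (binary: benign vs. attack).
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

model = build_chem_like_ensemble()
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"held-out accuracy: {accuracy:.3f}")
```

Soft voting is used here because it lets the metaclassifier weigh how confident RF and DT are rather than only counting their labels; the consensus rule in the paper may differ. Results on synthetic data say nothing about performance on the datasets listed in the abstract.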


Source journal: Journal of Network and Computer Applications (Engineering & Technology: Computer Science, Interdisciplinary Applications)

CiteScore: 21.50

Self-citation rate: 3.40%

Articles published: 142

Review time: 37 days

Journal description: The Journal of Network and Computer Applications welcomes research contributions, surveys, and notes in all areas relating to computer networks and applications thereof. Sample topics include new design techniques, interesting or novel applications, components or standards; computer networks with tools such as WWW; emerging standards for internet protocols; wireless networks; mobile computing; emerging computing models such as cloud computing and grid computing; applications of networked systems for remote collaboration and telemedicine, etc. The journal is abstracted and indexed in Scopus, Engineering Index, Web of Science, Science Citation Index Expanded and INSPEC.