PANACEA: a neural model ensemble for cyber-threat detection

IF 4.3 · Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Pub Date : 2024-01-12 DOI:10.1007/s10994-023-06470-2

Malik AL-Essa, Giuseppina Andresini, Annalisa Appice, Donato Malerba


Citations: 0

Abstract

Ensemble learning is a strategy commonly used to fuse different base models into a model ensemble that is expected to be more accurate on unseen data than the base models. This study describes a new cyber-threat detection method, called PANACEA, that couples ensemble learning with adversarial training in deep learning in order to improve the accuracy of neural models trained on cybersecurity problems. Selecting the base models is one of the main challenges in training accurate ensembles. This study describes a model ensemble pruning approach based on eXplainable AI (XAI) that increases ensemble diversity and improves the accuracy of ensemble classification. The approach rests on the idea that identifying base models that give relevance to different input feature sub-spaces may help improve the accuracy of an ensemble trained to recognise different signatures of different cyber-attack patterns. To this end, we use a global XAI technique to measure the ensemble model diversity with respect to the effect of the input features on the accuracy of the base neural models combined in the ensemble. Experiments carried out on four benchmark cybersecurity datasets (three network intrusion detection datasets and one malware detection dataset) show the beneficial effects of the proposed combination of adversarial training, ensemble learning and XAI on the accuracy of multi-class classifications of cyber-data achieved by the neural model ensemble.
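
The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the PANACEA implementation: the abstract does not name the global XAI technique or the selection rule, so permutation feature importance, cosine distance, the greedy max-min selection, and soft voting are all stand-in assumptions, as are the function names. Adversarial training of the base models is left out entirely.

```python
import numpy as np


def permutation_importance(predict, X, y, rng):
    """Global feature relevance (assumed stand-in XAI technique):
    the drop in accuracy when one feature column is shuffled."""
    base_acc = np.mean(predict(X) == y)
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        X_perm = X.copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        drops[j] = base_acc - np.mean(predict(X_perm) == y)
    return drops


def cosine_distance(a, b):
    """1 - cosine similarity; defined as 1.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 1.0 if denom == 0 else 1.0 - float(a @ b) / denom


def prune_by_diversity(importances, k):
    """Greedily keep k models (k >= 2 assumed) whose importance vectors
    differ most, i.e. models that rely on different feature sub-spaces."""
    n = len(importances)
    if k >= n:
        return list(range(n))
    dist = np.array([[cosine_distance(importances[i], importances[j])
                      for j in range(n)] for i in range(n)])
    # Seed the selection with the most dissimilar pair of models.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    while len(selected) < k:
        rest = [m for m in range(n) if m not in selected]
        # Add the model whose nearest selected neighbour is farthest away.
        best = max(rest, key=lambda m: dist[m, selected].min())
        selected.append(best)
    return sorted(selected)


def soft_vote(prob_list):
    """Fuse the kept models by averaging their class-probability matrices."""
    return np.mean(prob_list, axis=0).argmax(axis=1)


# Hypothetical importance vectors for four base models over three features.
importances = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],   # nearly redundant with model 0
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
kept = prune_by_diversity(importances, k=3)  # [0, 2, 3]
```

In this toy input, model 1 is pruned because its feature relevance nearly coincides with that of model 0, while the three kept models each attend to a different feature.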




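Source journal

Machine Learning (Engineering & Technology; Computer Science: Artificial Intelligence)

CiteScore: 11.00

Self-citation rate: 2.70%

Articles published: 162

Review time: 3 months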

About the journal: Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.