Effectiveness of an ensemble technique based on the distributivity equation in detecting suspicious network activity

Impact factor 3.2; Zone 1 (Mathematics); JCR Q2, Computer Science, Theory & Methods
Ewa Rak, Jaromir Sarzyński, Rafał Rak
Journal: Fuzzy Sets and Systems
Published: 2024-05-20
DOI: 10.1016/j.fss.2024.109015
URL: https://www.sciencedirect.com/science/article/pii/S0165011424001611
Citations: 0

Abstract

With the growing complexity and frequency of cyber threats, there is a pressing need for more effective defense mechanisms. Machine learning can analyze vast amounts of data and identify patterns indicative of malicious activity, enabling faster and more accurate threat detection. Ensemble methods, by combining diverse models with differing vulnerabilities, can increase resilience against adversarial attacks. This study applies and evaluates an ensemble classification approach for identifying intrusion threats on the large CICIDS2017 dataset. The approach uses the distributivity equation to aggregate the underlying classifiers. It combines standard supervised classification algorithms, including Multilayer Perceptron Network, k-Nearest Neighbors, and Naive Bayes, into an ensemble. Experiments were conducted to evaluate the effectiveness of the proposed hybrid ensemble method. Its performance was compared with that of the individual classifiers using accuracy, precision, recall, F-score, and area under the ROC curve. It was also compared with widely used state-of-the-art ensemble models, including soft voting (Weighted Average Probabilities), Adaptive Boosting (AdaBoost), and Histogram-based Gradient Boosting Classification Tree (HGBC), and with existing methods from the literature evaluated on the same dataset, such as Deep Belief Networks (DBN) and Deep Feature Learning via Graph (Deep GFL). The experiments show that some ensemble methods, such as AdaBoost and HGBC, do not perform reliably for the specific task of identifying network attacks, which highlights the importance of understanding the context and requirements of the data and problem domain. The results indicate that the proposed hybrid ensemble method outperforms traditional algorithms in classification precision and accuracy, and they offer insights for improving the effectiveness of intrusion detection systems.
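The abstract does not spell out the distributivity-based aggregation, so the sketch below illustrates only the soft voting baseline it mentions (Weighted Average Probabilities), together with the listed threshold metrics. It is a minimal illustration under stated assumptions: the function names `soft_vote` and `binary_metrics`, the equal default weights, the 0.5 decision threshold, and all probability values are hypothetical and are not taken from the paper or from CICIDS2017.

```python
from typing import Dict, List, Optional, Sequence


def soft_vote(probas: Sequence[Sequence[float]],
              weights: Optional[Sequence[float]] = None) -> List[float]:
    """Weighted average of P(attack); one inner sequence per base classifier."""
    if weights is None:
        weights = [1.0] * len(probas)  # assumption: equal weights by default
    total = sum(weights)
    n_samples = len(probas[0])
    return [
        sum(w * p[i] for w, p in zip(weights, probas)) / total
        for i in range(n_samples)
    ]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for labels 0 (benign) and 1 (attack)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up P(attack) outputs of three base classifiers on four network flows.
mlp = [0.9, 0.2, 0.6, 0.1]  # hypothetical Multilayer Perceptron scores
knn = [0.8, 0.4, 0.3, 0.2]  # hypothetical k-Nearest Neighbors scores
nb = [0.7, 0.6, 0.7, 0.3]   # hypothetical Naive Bayes scores
y_true = [1, 0, 1, 1]       # made-up ground truth; the last attack is missed

avg = soft_vote([mlp, knn, nb])
y_pred = [int(p >= 0.5) for p in avg]  # assumption: 0.5 decision threshold
scores = binary_metrics(y_true, y_pred)
```

In this toy data, averaging corrects the two flows where a single classifier falls on the wrong side of the threshold (Naive Bayes on the second flow, k-Nearest Neighbors on the third), while the fourth flow, which all three classifiers score low, is still missed. Area under the ROC curve is left out because it is computed from the averaged scores rather than from thresholded predictions.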

Source journal
Fuzzy Sets and Systems (Mathematics; Computer Science, Theory & Methods)
CiteScore: 6.50
Self-citation rate: 17.90%
Articles published: 321
Review time: 6.1 months
Journal description: Since its launch in 1978, the journal Fuzzy Sets and Systems has been devoted to the international advancement of the theory and application of fuzzy sets and systems. The theory of fuzzy sets now encompasses a well-organized corpus of basic notions including (but not restricted to) aggregation operations, a generalized theory of relations, specific measures of information content, and a calculus of fuzzy numbers. Fuzzy sets are also the cornerstone of a non-additive uncertainty theory, namely possibility theory, and of a versatile tool for both linguistic and numerical modeling: fuzzy rule-based systems. Numerous works now combine fuzzy concepts with other scientific disciplines as well as modern technologies. In mathematics, fuzzy sets have triggered new research topics in connection with category theory, topology, algebra, and analysis. Fuzzy sets are also part of a recent trend in the study of generalized measures and integrals, and are combined with statistical methods. Furthermore, fuzzy sets have strong logical underpinnings in the tradition of many-valued logics.