BiCAM: A Bidirectional Contextualized Attentive Model for Analyzing the Correlation of Heterogeneous Security Events

Impact factor 5.7 | Zone 2, Computer Science | JCR Q1, Computer Science, Hardware & Architecture
Xi Luo; Junhui Wang; Lihua Yin; Kaiyan Zhao; Kexiang Qian; Daojuan Zhang; Kai Chen
Published in: IEEE Transactions on Reliability, vol. 74, no. 2, pp. 2640-2654
Publication date: 2024-12-04
DOI: 10.1109/TR.2024.3491894
Full text: https://ieeexplore.ieee.org/document/10777841/
Open access: no
Citations: 0

Abstract

As the Internet continues to evolve, modern information technology infrastructures are constantly under attack and must be monitored continuously so that operators can respond in time. Different devices and detection platforms generate heterogeneous security events that are sent to security operations centers, where security operators investigate those events and identify potential threats. Manually analyzing such a huge number of events is infeasible and leads to “alert fatigue.” Although substantial effort has gone into aggregating redundant, related alerts, the effectiveness of previous work has been limited by its weak relation-learning and explanation abilities. In this work, we propose the bidirectional contextualized attentive model (BiCAM), a novel contextual analysis model that uses a self-supervised deep learning approach to automatically correlate security events with their bidirectional context. BiCAM is built on an encoder–decoder architecture that combines bidirectional gated recurrent units with an attention mechanism, so that it captures both sequential and nonsequential relations among previous and subsequent alerts and provides explainability information for security operators. In addition, we introduce a bidirectional encoder representations from transformers (BERT)-based embedding method to handle the heterogeneity of security events, which helps the model adapt to changes in detectors. We comprehensively evaluate our model on real-world datasets containing over 11M events generated by detectors from 8 different vendors. We find that our model enables accurate, unsupervised correlation extraction and outperforms the state-of-the-art (SOTA) work when event relevance is applied to semiautomatically classify security events (e.g., the F1-score of classification improves by 4.3% and the false positive rate drops to 1.39%).
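To illustrate how an attention mechanism over bidirectional context can yield explainability information, here is a minimal sketch. It is not the paper's implementation: the scaled-free dot-product scoring, the function names (`softmax`, `dot`, `context_attention`), and the 3-dimensional event vectors are all assumptions made for illustration, and the bidirectional GRU encoder and the BERT-based embedding are omitted entirely. The sketch only shows the general idea that a target event is scored against both previous and subsequent events, and that the resulting weights indicate which context events it relates to most.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def context_attention(target, context):
    """Score `target` against every context vector (dot-product scoring is
    an assumption here) and return (weights, weighted context summary)."""
    scores = [dot(target, c) for c in context]
    weights = softmax(scores)
    dim = len(target)
    summary = [sum(w * c[i] for w, c in zip(weights, context)) for i in range(dim)]
    return weights, summary


# Hypothetical event embeddings; a real system would learn these.
previous_events = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
subsequent_events = [[0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
target_event = [1.0, 0.0, 0.0]

# Bidirectional context: events before and after the target are scored together.
context = previous_events + subsequent_events
weights, summary = context_attention(target_event, context)

for index, weight in enumerate(weights):
    side = "previous" if index < len(previous_events) else "subsequent"
    print(f"{side} event {index}: attention weight {weight:.3f}")
```

In this toy setup the weights sum to 1, and the context events whose vectors are closest to the target receive the largest weights, which is the kind of signal an operator could inspect to see why two events were correlated.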
Source journal
IEEE Transactions on Reliability (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 12.20
Self-citation rate: 8.50%
Articles published: 153
Review time: 7.5 months
About the journal: IEEE Transactions on Reliability is a refereed journal for the reliability and allied disciplines including, but not limited to, maintainability, physics of failure, life testing, prognostics, design and manufacture for reliability, reliability for systems of systems, network availability, mission success, warranty, safety, and various measures of effectiveness. Topics eligible for publication range from hardware to software, from materials to systems, from consumer and industrial devices to manufacturing plants, from individual items to networks, from techniques for making things better to ways of predicting and measuring behavior in the field. As an engineering subject that supports new and existing technologies, we constantly expand into new areas of the assurance sciences.