Cyberattack event logs classification using deep learning with semantic feature analysis

IF 4.8 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Computers & Security Pub Date : 2024-11-26 DOI:10.1016/j.cose.2024.104222

Ahmad Alzu’bi, Omar Darwish, Amjad Albashayreh, Yahya Tashtoush

Computers & Security, Volume 150, Article 104222. Full text: https://www.sciencedirect.com/science/article/pii/S0167404824005285

Citations: 0

Abstract

Event logs play a crucial role in cybersecurity by helping to detect potentially malicious network activity and to prevent data loss or theft. Previous work has placed little emphasis on log messages and their impact on security breach prediction and intrusion detection. This paper introduces a novel approach to log message analysis, applied to a dataset of event logs collected from various web sources. Event log messages are analyzed and categorized by event type and attack type, with explainable AI used to highlight the value of their key data. The study aims to enhance intrusion detection and minimize performance degradation by identifying suspicious events. To this end, a new semantic vectorization framework is proposed that leverages deep learning architectures to learn semantically discriminative log features, supporting both the explanation and the classification of event log messages. Using BERT deep embeddings as the baseline for the prediction model makes it possible to visualize and interpret how the semantic features of log messages are formed. Several experimental scenarios are set up and run to evaluate the performance of the event log classifier, covering attack type, event type, and zero-shot logs. The experimental results demonstrate that the proposed event log classifier outperforms state-of-the-art machine learning models, achieving a recall of 99.27% and a precision of 99.29%. This highlights the model’s ability to accurately identify events of a particular type, detecting as many suspicious events as feasible while minimizing the misclassification rate.
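To make the two reported metrics concrete, the following is a minimal Python sketch of how precision and recall can be computed for a multi-class log classifier. It is not the authors' evaluation code. The abstract does not say how the scores are averaged across classes, so macro averaging is an assumption here, and the class labels (`sqli`, `xss`, `benign`) and the helper names are hypothetical, chosen only for illustration.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for every label seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = Counter()  # prediction matches the true label
    fp = Counter()  # label was predicted, but the truth was another label
    fn = Counter()  # label was the truth, but another label was predicted
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]  # how often the label was predicted
        actual = tp[label] + fn[label]     # how often the label was the truth
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


def macro_average(scores):
    """Unweighted mean of per-class precision and recall (assumed averaging)."""
    n = len(scores)
    precision = sum(p for p, _ in scores.values()) / n
    recall = sum(r for _, r in scores.values()) / n
    return precision, recall


# Hypothetical labels, invented for illustration; not data from the paper.
y_true = ["sqli", "sqli", "xss", "xss", "benign", "benign"]
y_pred = ["sqli", "xss", "xss", "xss", "benign", "sqli"]

scores = per_class_precision_recall(y_true, y_pred)
macro_precision, macro_recall = macro_average(scores)
```

High recall corresponds to "detecting as many suspicious events as feasible", while high precision corresponds to a low rate of events wrongly assigned to a class. A weighted or micro average would give different numbers on imbalanced data, which is why the averaging choice is flagged as an assumption.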

Source journal

Computers & Security (Engineering & Technology: Computer Science, Information Systems)

CiteScore

12.40

Self-citation rate

7.10%

Articles published

365

Review time

10.7 months

About the journal: Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world. Computers & Security provides you with a unique blend of leading-edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control, and data integrity in all sectors: industry, commerce, and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise, it is your first step to fully secure systems.