Supervised Learning for Detecting Cognitive Security Anomalies in Real-Time Log Data

2022 IEEE World Conference on Applied Intelligence and Computing (AIC) Pub Date : 2022-06-17 DOI:10.1109/AIC55036.2022.9848922

Md. Al Amin


Citations: 1

Abstract

Every system generates a large quantity of logs. Logs are crucial for monitoring a system, inspecting anomalous behavior, and analyzing errors. Many recent studies suggest that efficient and accurate machine learning classifiers can be used to detect anomalies from log data. The diversity of log sources and the unstructured nature of logs make subsequent analysis difficult, even though the evaluation of automatic log parsing has opened the door to further research. To reduce manual work, many anomaly detection systems based on automated log analysis have been developed. However, because research comparing multiple anomaly detection methods is lacking, developers may still be unsure which method to use. Even once developers settle on an anomaly detection technique, reimplementing it takes time. To address these issues, we present a comprehensive study of detecting security anomalies in real-time log data using different supervised machine learning algorithms trained on publicly archived log data. Two production log datasets, with 15,923,592 and 365,298 log records, were used to evaluate these algorithms. We also propose an approach that provides visibility into the system: a strong monitoring system that collects metrics and represents data to yield insight, combined with automated methods that notify administrators of a significant change in the system's status. We assessed the model's performance using a variety of well-known evaluation metrics, including precision, recall, specificity, F1 measure, and accuracy.
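The five evaluation metrics named in the abstract all follow from the binary confusion matrix. The abstract does not describe the paper's implementation, so the following is only a minimal sketch under stated assumptions: labels are encoded as 1 (anomalous) and 0 (normal), the function names `confusion_counts` and `detection_metrics` are hypothetical, and the toy labels are made up for illustration rather than taken from the paper's datasets.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as 'anomalous' (assumed encoding)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tp, fp, tn, fn


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero (a common convention)."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute precision, recall, specificity, F1 measure, and accuracy."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = _ratio(tp, tp + fp)      # flagged records that are truly anomalous
    recall = _ratio(tp, tp + fn)         # anomalous records that were flagged
    specificity = _ratio(tn, tn + fp)    # normal records correctly left unflagged
    f1 = _ratio(2 * precision * recall, precision + recall)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Made-up labels for eight log records (1 = anomalous, 0 = normal).
    truth = [1, 1, 1, 0, 0, 0, 0, 0]
    predicted = [1, 1, 0, 1, 0, 0, 0, 0]
    for name, value in detection_metrics(truth, predicted).items():
        print(f"{name}: {value:.3f}")
```

Log datasets are usually dominated by normal records, so accuracy alone can look high even when most anomalies are missed; reporting recall and specificity alongside it separates the two failure modes.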


