Supervised Learning for Detecting Cognitive Security Anomalies in Real-Time Log Data

Md. Al Amin
Published in: 2022 IEEE World Conference on Applied Intelligence and Computing (AIC)
Publication date: 2022-06-17
DOI: 10.1109/AIC55036.2022.9848922 (https://doi.org/10.1109/AIC55036.2022.9848922)

Abstract

Every system generates a large quantity of logs. Logs are crucial for monitoring a system, inspecting anomalous behavior, and analyzing errors. Many recent studies suggest that efficient and accurate machine learning classifiers can be trained on log data to detect anomalies. The diversity of log sources and the unstructured nature of logs make subsequent analysis difficult, although the evaluation of automatic log parsing has opened the door to further research. To reduce manual work, many anomaly detection systems based on automated log analysis have been developed. However, because studies that compare multiple anomaly detection methods are lacking, developers may still be unsure which method to use. Even when developers settle on an anomaly detection technique, reimplementing it takes time. To address these issues, we present a comprehensive study of detecting security anomalies in real-time log data using different supervised machine learning algorithms trained on publicly archived log data. Two selected production log datasets, with 15,923,592 and 365,298 log records, were used to evaluate these algorithms. We also propose an approach for providing visibility into the system: a strong monitoring system that collects metrics and represents data to give insight, together with automated methods that notify administrators of significant changes in the system's status. We assessed model performance using several well-known evaluation metrics: precision, recall, specificity, F1 measure, and accuracy.
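The five metrics named in the abstract have standard definitions in terms of the binary confusion matrix. The sketch below is illustrative only and is not the authors' evaluation code: it assumes each log record carries a label of 1 (anomalous) or 0 (normal), and the names `ConfusionCounts`, `count_outcomes`, and `evaluate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # anomalous records flagged as anomalous
    fp: int  # normal records flagged as anomalous
    tn: int  # normal records passed as normal
    fn: int  # anomalous records that were missed


def count_outcomes(y_true, y_pred):
    """Tally the confusion matrix; labels are assumed to be 1 (anomaly) or 0 (normal)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:  # truth == 1 and pred == 0 under the 0/1 assumption
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent from the sample.
    return numerator / denominator if denominator else 0.0


def evaluate(c):
    """Compute the five metrics from confusion-matrix counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    specificity = _ratio(c.tn, c.tn + c.fp)
    f1 = _ratio(2 * precision * recall, precision + recall)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }


# Made-up labels for ten log records: three anomalies, seven normal.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = evaluate(count_outcomes(y_true, y_pred))
```

Reporting specificity and recall alongside accuracy matters for log data because anomalies are usually rare: a classifier that labels every record as normal can reach high accuracy while its recall is zero.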