Supervised Learning for Detecting Cognitive Security Anomalies in Real-Time Log Data
Md. Al Amin
2022 IEEE World Conference on Applied Intelligence and Computing (AIC), published 2022-06-17
DOI: https://doi.org/10.1109/AIC55036.2022.9848922
Citations: 1
Abstract
Every system generates a large quantity of logs. Logs are crucial for monitoring a system, inspecting anomalous behavior, and analyzing errors. Many recent studies suggest that efficient and accurate machine learning classifiers can be used to detect anomalies in log data. The diversity of log sources and the unstructured nature of logs complicate subsequent analysis, even though the evaluation of automatic log parsing has opened the door to further research. To reduce manual work, many anomaly detection systems based on automated log analysis have been developed. However, because multiple anomaly detection methods have rarely been studied and compared side by side, developers may still be unsure which methods to use. Even when developers settle on an anomaly detection technique, reimplementing it takes time. To address these issues, we present a comprehensive study of detecting security anomalies in real-time log data using different supervised machine learning algorithms trained on publicly archived log data. Two selected production log datasets, with 15,923,592 and 365,298 log records, were used to evaluate these algorithms. We also propose an approach that provides visibility into the system: a strong monitoring system that collects metrics and represents data, combined with automated methods that notify administrators of a significant change in the system's status. We assessed model performance using a variety of well-known assessment metrics, including “precision”, “recall”, “specificity”, “F1 measure”, and “accuracy”.
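The five assessment metrics named above all follow from the four counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the paper's evaluation code. It assumes labels are encoded as 1 for an anomalous log record and 0 for a normal one, and the names `count_outcomes` and `evaluate` are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # anomalies correctly flagged
    fp: int  # normal records wrongly flagged
    tn: int  # normal records correctly passed
    fn: int  # anomalies missed


def count_outcomes(y_true, y_pred):
    """Tally outcomes for binary labels (assumed: 1 = anomaly, 0 = normal)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            raise ValueError(f"labels must be 0 or 1, got {truth!r}, {pred!r}")
    return ConfusionCounts(tp, fp, tn, fn)


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred):
    """Compute precision, recall, specificity, F1 measure, and accuracy."""
    c = count_outcomes(y_true, y_pred)
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)
    specificity = safe_div(c.tn, c.tn + c.fp)
    f1 = safe_div(2 * precision * recall, precision + recall)
    accuracy = safe_div(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }


# Toy labels for illustration only; these are not results from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = evaluate(y_true, y_pred)
```

Because anomalies are usually rare in production logs, accuracy alone can look high even when most anomalies are missed, which is one reason to report recall and specificity alongside it.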