Log Anomaly Detection Method based on Hybrid Transformer-BiLSTM Models
Authors: Xuedong Ou, J. Liu
Venue: 2022 IEEE 22nd International Conference on Software Quality, Reliability, and Security Companion (QRS-C)
Published: 2022-12-01
DOI: 10.1109/QRS-C57518.2022.00123 (https://doi.org/10.1109/QRS-C57518.2022.00123)
Citations: 1
Abstract
Log analysis is important for diagnosing reliability issues in large cloud data centers. Existing log anomaly detection methods have notable shortcomings, such as relying on a single type of feature and delivering unsatisfactory detection performance. In this paper, we propose a novel log anomaly detection method consisting of two related parts. First, we propose a dataset partitioning method, the K-fold Sub Hold-out Method (KSHM), which is built on the characteristics of logs and preserves the temporal order of training data during sampling. KSHM improves the effectiveness of sampling without increasing the number of samples, and it changes the way the model is trained. Second, we construct an anomaly detection model based on a hybrid Transformer-BiLSTM (TFBL) architecture, which extracts both temporal and semantic features of logs as input for comprehensive anomaly detection. Experimental results show that TFBL outperforms baseline methods in accuracy, precision, and F1-score, and that the full method combining KSHM with TFBL also achieves better anomaly detection performance.
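The abstract does not spell out how KSHM partitions the data, so the following is only a minimal sketch of one plausible reading: split the time-ordered samples into K contiguous folds and hold out the most recent part of each fold, so that no sample is shuffled or duplicated. The function name `contiguous_fold_holdout`, the `holdout_ratio` parameter, and the tail hold-out rule are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def contiguous_fold_holdout(
    samples: Sequence[T], k: int, holdout_ratio: float = 0.2
) -> List[Tuple[List[T], List[T]]]:
    """Split time-ordered samples into k contiguous folds, each with a tail hold-out.

    Hypothetical helper: the name, the parameters, and the tail hold-out rule
    are assumptions for illustration, not the procedure defined in the paper.
    """
    n = len(samples)
    if k < 1 or k > n:
        raise ValueError("k must be between 1 and len(samples)")
    if not 0.0 < holdout_ratio < 1.0:
        raise ValueError("holdout_ratio must be in (0, 1)")

    splits: List[Tuple[List[T], List[T]]] = []
    for i in range(k):
        # Contiguous fold boundaries; integer arithmetic spreads the remainder.
        start, end = i * n // k, (i + 1) * n // k
        fold = list(samples[start:end])
        # Hold out the most recent part of the fold. At least one sample always
        # stays on the training side; a single-sample fold has no hold-out.
        n_val = min(len(fold) - 1, max(1, round(len(fold) * holdout_ratio)))
        cut = len(fold) - n_val
        splits.append((fold[:cut], fold[cut:]))
    return splits


if __name__ == "__main__":
    # Ten log windows, already sorted by timestamp (stand-in integers).
    for train, held_out in contiguous_fold_holdout(list(range(10)), k=2):
        print(train, held_out)
```

Because every fold is a contiguous slice and the hold-out is taken from its end, concatenating the pieces reproduces the original sequence, which is the property the abstract attributes to KSHM (temporal order kept, sample count unchanged). How the paper actually uses the folds during training may differ.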