Log2graphs: An Unsupervised Framework for Log Anomaly Detection with Efficient Feature Extraction

arXiv - CS - Cryptography and Security | Pub Date: 2024-09-18 | DOI: arxiv-2409.11890

Caihong Wang, Du Xu, Zonghang Li

Citations: 0

Abstract

Log2graphs: An Unsupervised Framework for Log Anomaly Detection with Efficient Feature Extraction

In the era of rapid Internet development, log data has become indispensable for recording the operations of computer devices and software. These data provide valuable insights into system behavior and necessitate thorough analysis. Recent advances in text analysis have enabled deep learning to achieve significant breakthroughs in log anomaly detection. However, the high cost of manual annotation and the dynamic nature of usage scenarios present major challenges to effective log analysis. This study proposes a novel log feature extraction model called DualGCN-LogAE, designed to adapt to various scenarios. It leverages the expressive power of large models for log content analysis and the capability of graph structures to encapsulate correlations between logs. It retains key log information while integrating the causal relationships between logs to achieve effective feature extraction. Additionally, we introduce Log2graphs, an unsupervised log anomaly detection method based on the feature extractor. By employing graph clustering algorithms for log anomaly detection, Log2graphs enables the identification of abnormal logs without the need for labeled data. We comprehensively evaluate the feature extraction capability of DualGCN-LogAE and the anomaly detection performance of Log2graphs using public log datasets across five different scenarios. Our evaluation metrics include detection accuracy and graph clustering quality scores. Experimental results demonstrate that the log features extracted by DualGCN-LogAE outperform those obtained by other methods on classic classifiers. Moreover, Log2graphs surpasses existing unsupervised log detection methods, providing a robust tool for advancing log anomaly detection research.
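The idea of flagging abnormal logs through graph clustering, without labels, can be illustrated with a deliberately simplified sketch. This is not DualGCN-LogAE or the Log2graphs pipeline: the bag-of-tokens `embed` function stands in for the learned feature extractor, connected components over a cosine-similarity graph stand in for the graph clustering algorithm, and the names `cluster_logs`, `flag_anomalies`, `threshold`, and `min_cluster_size` are hypothetical choices made for this example only.

```python
import math
from collections import Counter


def embed(line):
    """Toy stand-in for a learned log embedding: a bag-of-tokens count vector."""
    return Counter(line.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def cluster_logs(lines, threshold=0.5):
    """Link logs whose similarity is >= threshold; return connected components."""
    vecs = [embed(line) for line in lines]
    n = len(lines)
    parent = list(range(n))  # union-find forest over log indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(vecs[i], vecs[j]) >= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def flag_anomalies(lines, threshold=0.5, min_cluster_size=2):
    """Flag logs that land in clusters smaller than min_cluster_size."""
    return sorted(
        i
        for cluster in cluster_logs(lines, threshold)
        if len(cluster) < min_cluster_size
        for i in cluster
    )


# Invented sample logs, for illustration only.
logs = [
    "session opened for user alice",
    "session opened for user bob",
    "session closed for user alice",
    "session closed for user bob",
    "kernel panic fatal exception in interrupt",
]
anomalies = flag_anomalies(logs)
print(anomalies)  # [4]
```

The four session messages share most of their tokens and merge into one cluster, while the kernel message shares none and stays isolated, so it is the only one flagged. A real system would replace the token counts with richer features and the size rule with a more careful scoring of clusters.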
