SemiRALD: A semi-supervised hybrid language model for robust Anomalous Log Detection

IF 4.3 | Zone 2 (Computer Science) | JCR Q2 (Computer Science, Information Systems)

Information and Software Technology Pub Date : 2025-04-11 DOI:10.1016/j.infsof.2025.107743

Yicheng Sun, Jacky Wai Keung, Zhen Yang, Shuo Liu, Hi Kuen Yu

Information and Software Technology, Volume 183, Article 107743. Full text: https://www.sciencedirect.com/science/article/pii/S0950584925000825

Citations: 0

Abstract

Context:

Deep learning-based Anomalous Log Detection (DALD) tools are critical for software reliability, but current approaches face challenges, including information loss during log parsing, reliance on large labeled datasets, and fragility in low-resource scenarios.

Objective:

To overcome the above limitations, we propose SemiRALD, a robust ALD approach based on semi-supervised learning that leverages a Large Language Model (LLM) for log parsing, enhancing both flexibility and accuracy. It uses a hybrid language model to repeatedly fit samples with generated pseudo-labels, so that DALD models can be trained with limited resources and anomaly detection remains efficient.
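The idea of repeatedly fitting samples with generated pseudo-labels can be illustrated with a generic self-training loop. The following is a minimal sketch, not the authors' implementation: the nearest-centroid classifier stands in for the hybrid language model, and the names `fit_centroids`, `predict`, `self_train`, the confidence measure, and the `threshold` value are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]


def fit_centroids(X: List[Vector], y: List[int]) -> Dict[int, Vector]:
    """Toy stand-in for the hybrid model: one mean vector per class."""
    centroids: Dict[int, Vector] = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = tuple(sum(col) / len(rows) for col in zip(*rows))
    return centroids


def predict(centroids: Dict[int, Vector], x: Vector) -> Tuple[int, float]:
    """Return (label, confidence); with two classes confidence lies in [0.5, 1]."""
    dists = {
        label: sum((a - b) ** 2 for a, b in zip(x, mu)) ** 0.5
        for label, mu in centroids.items()
    }
    label = min(dists, key=dists.get)
    total = sum(dists.values())
    confidence = 1.0 if total == 0 else 1.0 - dists[label] / total
    return label, confidence


def self_train(X_lab, y_lab, X_unlab, threshold=0.75, max_rounds=5):
    """Iteratively add confidently pseudo-labeled samples to the training set."""
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    model = fit_centroids(X_train, y_train)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for x in pool:
            label, conf = predict(model, x)
            if conf >= threshold:
                confident.append((x, label))  # accept the pseudo-label
            else:
                remaining.append(x)  # keep for a later round
        if not confident:
            break  # no new pseudo-labels, so refitting would change nothing
        X_train += [x for x, _ in confident]
        y_train += [label for _, label in confident]
        pool = remaining
        model = fit_centroids(X_train, y_train)  # refit on labeled + pseudo-labeled
    return model, y_train, pool


# Two labeled samples (0 = normal, 1 = anomalous) and five unlabeled ones.
model, labels, leftover = self_train(
    [(0.0,), (10.0,)], [0, 1],
    [(1.0,), (9.0,), (2.0,), (8.0,), (5.0,)],
)
```

In this toy run the ambiguous sample `(5.0,)` never reaches the confidence threshold and stays unlabeled, which is the usual safeguard against reinforcing wrong pseudo-labels.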

Method:

Specifically, SemiRALD uses ChatGPT with in-context learning for automated log parsing, which better preserves log information during parsing. It then combines a semi-supervised learning framework with our proposed hybrid language model to counter the performance degradation caused by low-resource constraints in practice. The semi-supervised framework requires only a small amount of labeled data throughout the entire process, while the hybrid language model is built on RoBERTa and an attention-based BiLSTM.
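In-context learning for log parsing usually means showing the LLM a few log/template pairs before the log to be parsed. The sketch below only builds such a prompt as a string and makes no API call. The abstract does not give the prompt wording, so the instruction text, the `<*>` wildcard convention, the function name `build_parsing_prompt`, and the example log lines are all hypothetical.

```python
from typing import List, Tuple


def build_parsing_prompt(demos: List[Tuple[str, str]], query: str) -> str:
    """Assemble a few-shot prompt: instruction, demonstrations, then the query log."""
    lines = [
        "Extract the log template from each log message.",
        "Replace variable parts with <*> and keep constant text unchanged.",
        "",
    ]
    for log, template in demos:
        # Each demonstration pairs a raw log line with its expected template.
        lines += [f"Log: {log}", f"Template: {template}", ""]
    # The query is left open so the model completes the final template.
    lines += [f"Log: {query}", "Template:"]
    return "\n".join(lines)


# Illustrative demonstrations only; these are not taken from the paper.
demos = [
    ("Received block blk_1 of size 67108864 from /10.0.0.1",
     "Received block <*> of size <*> from <*>"),
    ("Verification succeeded for blk_7",
     "Verification succeeded for <*>"),
]
prompt = build_parsing_prompt(demos, "Deleting block blk_42 file /data/blk_42")
print(prompt)
```

Keeping the demonstrations as data rather than hard-coded text makes it easy to swap in examples that resemble the target system's logs.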

Results:

Experiments on the HDFS and BGL datasets demonstrate that SemiRALD achieves an average F1-score improvement of 7.3% and 8.2%, respectively, over seven benchmark models. On small-scale datasets (0.1% of the original size), SemiRALD outperforms competitors by 31.4% and 46.0% in F1-score, respectively. Its consistent performance across diverse datasets highlights its generalizability and robustness.
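For reference, the F1-score is the harmonic mean of precision and recall on the anomalous class. The sketch below computes it from binary labels and also shows the difference between an absolute gain (in F1 points) and a relative gain (in percent of the baseline). The abstract does not state which of the two its improvement figures refer to, and the numbers used here are made up for illustration, not results from the paper.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2 * precision * recall / (precision + recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0 to avoid division by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def absolute_gain(new: float, base: float) -> float:
    """Difference in F1 points."""
    return new - base


def relative_gain(new: float, base: float) -> float:
    """Difference as a fraction of the baseline F1."""
    return (new - base) / base


# Hypothetical labels: 1 = anomalous log sequence, 0 = normal.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
score = f1_score(y_true, y_pred)  # tp=2, fp=1, fn=1, so precision = recall = 2/3
```

With a hypothetical baseline F1 of 0.80 and a new F1 of 0.90, `absolute_gain` is about 0.10 while `relative_gain` is about 0.125, so the two readings of "improvement" can differ noticeably.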

Conclusion:

SemiRALD handles anomaly detection tasks on both large-scale and low-resource datasets, delivering significant advancements in anomalous log detection and offering a robust, adaptable solution to prevalent challenges in software reliability engineering.


Source journal

Information and Software Technology (Engineering & Technology: Computer Science, Software Engineering)

CiteScore: 9.10

Self-citation rate: 7.70%

Articles published: 164

Review time: 9.6 weeks

About the journal: Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:

• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development

Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "Negative" results and much more. Read the Guide for authors for more information. The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.