Heterogeneous data-driven resolution generation for software systems via large language models

IF 6.9 | Zone 1 (Management) | JCR Q1: Computer Science, Information Systems

Information Processing & Management | Pub Date: 2025-09-09 | DOI: 10.1016/j.ipm.2025.104376

Wen Liu, Degang Sun, Haitian Yang, Yan Wang, Weiqing Huang

Volume 63, Issue 2, Article 104376. Full text: https://www.sciencedirect.com/science/article/pii/S0306457325003176

Citations: 0

Abstract

Modern software systems are increasingly complex and dynamic, making them particularly vulnerable to performance anomalies. Although runtime anomaly detection enhances system reliability, engineers still devote considerable time and effort to resolving errors once anomalous logs or metrics are detected. Such challenges call for intelligent automation capable of delivering targeted remediation steps based on detected anomalies. In this work, we first construct an anomaly-related knowledge base by combining heterogeneous operational data, including logs and metrics, with resolutions annotated by domain experts. Furthermore, we propose HASolver, the first Heterogeneous Anomaly Solver to generate recommended resolutions for multi-source system anomalies. The core component is a dual-view multi-vector module, designed to represent heterogeneous anomaly chunks from different modalities and to support effective multi-vector retrieval. HASolver integrates a large language model with domain knowledge to generate mitigation resolutions. We conduct extensive experiments using BLEU and ROUGE-1/2/L metrics. Compared to baseline approaches, HASolver delivers notable performance gains, improving BLEU and ROUGE-L scores by 14.6% and 19.6%, respectively. Further analyses are carried out to explore various multi-vector configurations and the effect of prompt strategies. We also release the annotated resolution dataset derived from the anomaly-related knowledge base to facilitate future research.
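The abstract reports ROUGE-1 and ROUGE-L scores for generated resolutions. As a rough illustration of what those metrics measure, the following is a minimal sketch, assuming whitespace tokenization, lowercasing, and the F1 variant of each score. It is not the authors' evaluation code, and the function names and example strings are hypothetical.

```python
from collections import Counter


def _f1(overlap: int, cand_len: int, ref_len: int) -> float:
    """Harmonic mean of precision (overlap/cand_len) and recall (overlap/ref_len)."""
    if overlap == 0 or cand_len == 0 or ref_len == 0:
        return 0.0
    precision = overlap / cand_len
    recall = overlap / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge_1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between candidate and reference."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    return _f1(overlap, len(cand), len(ref))


def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate: str, reference: str) -> float:
    """ROUGE-L F1: based on the longest common subsequence of tokens."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    return _f1(lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    # Hypothetical generated resolution vs. expert-annotated resolution.
    generated = "restart the cache service"
    expert = "restart the redis cache service"
    print(round(rouge_1(generated, expert), 4))
    print(round(rouge_l(generated, expert), 4))
```

ROUGE-1 ignores word order, while ROUGE-L rewards tokens that appear in the same order as in the reference, so a resolution with the right words in a scrambled order scores high on the former and low on the latter.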

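Source journal

Information Processing & Management (Engineering & Technology: Computer Science, Information Systems)

CiteScore: 17.00

Self-citation rate: 11.60%

Articles published: 276

Review time: 39 days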

About the journal: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.