Journal of Systems and Software: Latest Articles

From description to prescription: Unraveling log severity adjustments in open-source software
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, vol. 231, Article 112643. Pub Date: 2025-09-24. DOI: 10.1016/j.jss.2025.112643
Eduardo Mendes, Marcelo Vasconcellos, Fabio Petrillo, Sylvain Hallé
Abstract
Context: Logs are vital to understanding a software system’s behavior, often being the only evidence available to investigate failures.
Problem: Selecting a Log Severity Level (LSL) can be challenging for the following reasons: (i) the absence of knowledge about how logs are used in production, (ii) the lack of understanding of how critical an event is, and (iii) the lack of practical guidelines. This leads to frequent LSL adjustments during software development and evolution.
Objective: Our goal is to investigate the LSL adjustments between system releases and explore methods to improve LSL classification.
Methods: We analyzed the log statements from different releases of open-source systems, focusing on their LSL adjustments and examining the commit comments to understand the reasons for the adjustments.
Results: Our results show that most adjustments occur at the intersection of development and production environment logs. Furthermore, the main guiding factors for the adjustments are experience and logging theory. Our contributions are (i) a description of trends and patterns in LSL adjustments and (ii) a set of 24 heuristics to guide the choice, review, and adjustment of LSL. We advise developers to adhere to the LSL purposes, routinely review LSL settings, and remain adaptable to their mutability.
Citations: 0
Do comments and expertise still matter? An experiment on programmers’ adoption of AI-generated JavaScript code
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, vol. 231, Article 112634. Pub Date: 2025-09-23. DOI: 10.1016/j.jss.2025.112634
Changwen Li, Christoph Treude, Ofir Turel
Abstract
This paper investigates the factors influencing programmers’ adoption of AI-generated JavaScript code recommendations within the context of lightweight, function-level programming tasks. It extends prior research by (1) utilizing objective (as opposed to the typically self-reported) measurements for programmers’ adoption of AI-generated code and (2) examining whether AI-generated comments added to code recommendations and development expertise drive AI-generated code adoption. We tested these potential drivers in an online experiment with 173 programmers. Participants first answered questions to demonstrate their level of development expertise. They were then asked to solve a LeetCode problem without AI support. After attempting the problem on their own, they received an AI-generated solution to help them refine their solutions. The solutions provided were manipulated to include or exclude AI-generated comments (a between-subjects factor). Programmers’ adoption of AI-generated code was gauged by the code similarity between the AI-generated solutions and the participants’ submitted solutions, providing a behavioral measurement of code adoption. Our findings revealed that, within the context of function-level programming tasks, the presence of comments significantly influences programmers’ adoption of AI-generated code regardless of the participants’ development expertise.
Citations: 0
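The abstract by Li, Treude, and Turel above measures adoption as the code similarity between the AI-generated solution and the submitted solution, but it does not name the similarity metric. The following is a minimal sketch of the idea only, assuming a token-level sequence-matching ratio from Python's standard `difflib`; the function names `tokenize` and `adoption_score` and the sample snippets are hypothetical and not taken from the paper.

```python
import difflib
import re


def tokenize(code: str) -> list[str]:
    """Split source text into identifiers, numbers, and single punctuation marks."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def adoption_score(ai_solution: str, submitted: str) -> float:
    """Return a similarity ratio in [0, 1] between two code snippets.

    Assumption: a token-level difflib ratio stands in for the similarity
    measure used in the study, which the abstract does not specify.
    """
    matcher = difflib.SequenceMatcher(
        a=tokenize(ai_solution), b=tokenize(submitted), autojunk=False
    )
    return matcher.ratio()


# Hypothetical snippets: one participant copies the AI code, one writes their own.
ai_code = "function add(a, b) { return a + b; }"
copied = "function add(a, b) { return a + b; }"
own = "const sum = (x, y) => x + y;"

print(adoption_score(ai_code, copied))  # identical snippets give 1.0
print(adoption_score(ai_code, own))     # a lower ratio for independent code
```

Comparing tokens rather than raw characters makes the score less sensitive to whitespace and formatting differences, which seems a reasonable choice when the interest is in whether the logic was adopted.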
TwinArch: A digital twin reference architecture
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, vol. 231, Article 112613. Pub Date: 2025-09-22. DOI: 10.1016/j.jss.2025.112613
Alessandra Somma, Domenico Amalfitano, Alessandra De Benedictis, Patrizio Pelliccione
Abstract
Background: Digital Twins (DTs) are dynamic virtual representations of physical systems, enabled by seamless, bidirectional communication between the physical and digital realms. Among the challenges impeding the widespread adoption of DTs is the absence of a universally accepted definition and a standardized DT Reference Architecture (RA). Existing state-of-the-art architectures remain largely domain-specific and primarily emphasize aspects like modeling and simulation. Furthermore, they often combine structural and dynamic elements into unified, all-in-one diagrams, which adds to the ambiguity and confusion surrounding the concept of Digital Twins.
Objective: To address these challenges, this work aims to contribute a domain-independent, multi-view Digital Twin Reference Architecture that can help practitioners in architecting and engineering their DTs.
Method: We adopted the design science methodology, structured into three cycles: (i) an initial investigation conducting a Systematic Literature Review to identify key architectural elements, (ii) a preliminary design refined via feedback from practitioners, and (iii) final artifact development, integrating knowledge from widely adopted DT development platforms and validated through an expert survey of 20 participants.
Results: The proposed Digital Twin Reference Architecture is named TwinArch. It is documented using the Views and Beyond methodology by the Software Engineering Institute. TwinArch website and replication package: https://alessandrasomma28.github.io/twinarch/.
Conclusion: TwinArch offers practitioners UML models that can be utilized for designing and developing new DT systems across various domains. It enables customization and tailoring to specific use cases while also supporting the documentation of existing DT systems.
Editor’s note: Open Science material was validated by the Journal of Systems and Software Open Science Board.
Citations: 0
The fire tries gold: Evaluating pre-trained language models for multi-label vulnerability detection in ethereum smart contracts
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, vol. 231, Article 112642. Pub Date: 2025-09-22. DOI: 10.1016/j.jss.2025.112642
Trung Kien Luu, Doan Minh Trung, Tuan-Dung Tran, Phan The Duy, Van-Hau Pham
Abstract
Smart contracts are integral components of blockchain ecosystems, yet they remain highly susceptible to security vulnerabilities that can lead to severe financial and operational consequences. To address this, a range of vulnerability detection techniques have been developed, including rule-based tools, neural network models, pre-trained language models (PLMs), and most recently, large language models (LLMs). However, these existing methods face three main limitations: (1) Rule-based tools such as Slither and Oyente depend heavily on handcrafted heuristics, requiring human intervention and high execution time. (2) LLM-based approaches are computationally expensive and challenging to fine-tune in resource-constrained environments, particularly within academic or research settings where access to high-performance computing is limited. (3) Most existing approaches focus on binary and multi-class classification, assuming that each contract contains only a single vulnerability, whereas in practice, smart contracts often exhibit multiple coexisting vulnerabilities that require a multi-label detection approach. In this study, we conduct a comprehensive benchmark that systematically evaluates the effectiveness of traditional deep learning models (e.g., LSTM, BiLSTM) versus state-of-the-art PLMs (e.g., CodeBERT, GraphCodeBERT) in multi-label vulnerability detection. Our dataset comprises nearly 18,000 real-world smart contracts annotated with seven distinct vulnerability types. We evaluate not only detection accuracy but also computational efficiency, including training time, inference speed, and resource consumption. Our findings reveal a crucial trade-off: while code-specialized PLMs like GraphCodeBERT achieve a high F1-score of 96%, a well-tuned BiLSTM with an attention mechanism surpasses it (98% F1-score) with significantly less training time. By providing a clear, evidence-based framework, this research offers practical recommendations for engineers to select the most appropriate model, balancing state-of-the-art performance with the resource constraints inherent in real-world security tools.
Citations: 0
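The smart-contract abstract above reports F1-scores for multi-label detection, where one contract may carry several vulnerability labels at once. As an illustration of how such a score can be computed, here is a minimal sketch assuming micro-averaging over label sets; the abstract does not state which averaging the authors used, and the function name `micro_f1` and the label names are hypothetical.

```python
def micro_f1(true_labels: list[set[str]], predicted_labels: list[set[str]]) -> float:
    """Micro-averaged F1 over per-sample label sets (multi-label setting)."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        tp += len(truth & pred)   # labels correctly predicted
        fp += len(pred - truth)   # labels predicted but not present
        fn += len(truth - pred)   # labels present but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for three contracts; the third has no vulnerability.
truth = [{"reentrancy", "timestamp"}, {"overflow"}, set()]
pred = [{"reentrancy"}, {"overflow", "timestamp"}, set()]

print(round(micro_f1(truth, pred), 3))
```

In this toy case there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3. A single-label accuracy score could not express the partially correct first prediction, which is the motivation for a multi-label metric.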
Tailoring binary decision diagram compilation for feature models
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, vol. 231, Article 112566. Pub Date: 2025-09-20. DOI: 10.1016/j.jss.2025.112566
Clemens Dubslaff, Nils Husung, Nikolai Käfer
Abstract
The compilation of feature models into binary decision diagrams (BDDs) is a major challenge in the area of configurable systems analysis. For many large-scale feature models, such as the variants of the prominent Linux product line, BDDs could not yet be obtained because they exceed state-of-the-art compilation capabilities. Until now, BDD compilation has mainly been considered with the standard settings of existing BDD tools, barely exploiting advanced techniques or tuning parameters.
In this article, we conduct a comprehensive study on how to configure various techniques from the literature and thus improve compilation performance for feature models given in conjunctive normal form. Specifically, we evaluate preprocessing for satisfiability solving (SAT), variable and clause ordering heuristics, as well as non-standard and multi-threaded BDD construction schemes. Our experiments on recent feature models demonstrate that BDD compilation of feature models greatly benefits from these techniques. We show that our methods enable BDD compilations of many large-scale feature models within seconds, including the whole eCos feature model collection, for which a compilation was previously infeasible.
Citations: 0
Leveraging LLM-based data augmentation for automatic classification of recurring tasks in software development projects
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, vol. 231, Article 112641. Pub Date: 2025-09-19. DOI: 10.1016/j.jss.2025.112641
Włodzimierz Wysocki, Mirosław Ochodek
Abstract
Background: Issue tracking systems (ITS) store project task data that is valuable for analytics and simulation. Projects typically include two types of tasks: stateful and recurring. While stateful tasks can be automatically categorized with relative ease, categorizing recurring tasks remains challenging. Prior research indicates that a key difficulty may lie in the underrepresentation of certain task types, which leads to severely imbalanced training datasets and hampers the accuracy of machine-learning models for task categorization.
Aims: The goal of this study is to evaluate whether leveraging large language models (LLMs) for data augmentation can enhance the machine-learning-based categorization of recurring tasks in software projects.
Method: We conduct our study on a dataset from six industrial projects comprising 9,589 tasks. To address class imbalance, we up-sample minority classes during training via data augmentation using LLMs and several prompting strategies, assessing their impact on prediction quality. For each project, we perform time-series 5-fold cross-validation and evaluate the classifiers using state-of-the-art metrics (Accuracy, Precision, Recall, F1-score, and MCC) as well as a practice-inspired metric called Monthly Classification Error (MCE), which assesses the impact of task misclassification on project planning and resource allocation. Our machine-learning pipeline employs Transformer-based sentence embeddings and XGBoost classifiers.
Results: The model automatically classifies software process tasks into 14 classes, achieving MCC values between 0.71 and 0.76. We observed higher prediction quality for the largest projects in the dataset and for those managed using “traditional” project management methodologies. Moreover, employing intra-project data augmentation strategies reduced the MCE by up to 43%.
Conclusions: Our findings indicate that LLMs can be used to mitigate the impact of imbalanced task categories, thereby enhancing the performance of classification models even with limited training data.
Citations: 0
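The task-classification abstract above reports MCC (Matthews correlation coefficient) values for a 14-class problem. For readers unfamiliar with the multi-class form of this metric, the sketch below computes it from label sequences using the standard generalization based on per-class counts. It illustrates the metric itself, not the authors' pipeline; the function name `multiclass_mcc` and the task labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(y_true: list, y_pred: list) -> float:
    """Multi-class Matthews correlation coefficient.

    With s samples, c correct predictions, t_k true occurrences of class k
    and p_k predictions of class k:
        MCC = (c*s - sum(p_k*t_k)) / sqrt((s^2 - sum(p_k^2)) * (s^2 - sum(t_k^2)))
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)
    classes = set(t_counts) | set(p_counts)
    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in classes)
    denominator = sqrt(
        (s * s - sum(p_counts[k] ** 2 for k in classes))
        * (s * s - sum(t_counts[k] ** 2 for k in classes))
    )
    # By convention, return 0 when one side has only a single class.
    return numerator / denominator if denominator else 0.0


# Hypothetical recurring-task labels.
actual = ["meeting", "review", "meeting", "support", "review", "meeting"]
predicted = ["meeting", "review", "meeting", "meeting", "review", "support"]

print(round(multiclass_mcc(actual, predicted), 3))
```

Unlike plain accuracy, MCC accounts for how predictions are distributed across classes, so a classifier that always predicts the majority class scores 0. This property makes it a common choice for imbalanced datasets such as the one described in the abstract.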
Identification and classification of free, open source software licenses: A systematic literature review
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, vol. 231, Article 112628. Pub Date: 2025-09-19. DOI: 10.1016/j.jss.2025.112628
Sergio Montes-Leon, Gregorio Robles, Jesus M. Gonzalez-Barahona
Abstract
Background: Licenses are a fundamental element of free, open source software (FOSS), since they express the permissions granted to those receiving the software. Therefore, identification of licenses in source code, and their analysis, is crucial to understand the legal implications of using and distributing FOSS. Many researchers have accordingly devoted attention to the topic of license identification and analysis.
Goal: To learn how researchers have identified and classified licenses in FOSS, including which techniques and tools they have used. We were also interested in the evolution of these techniques and tools over time, and the public datasets available in this realm.
Method: We conducted a Systematic Literature Review, which yielded 50 scientific publications that we analyzed.
Results: We observed that most studies focus on the use or development of specific tools. However, there is a recurring concern about the need to improve these tools and the techniques they use. The studies presented (and therefore the tools and techniques they describe) are usually empirically validated. With respect to techniques, we found that the use of machine-learning techniques is still relatively scarce, with most papers presenting studies based on pattern matching and similar techniques. It is also interesting that reuse of tools is relatively high, and that many of these tools remain available. However, benchmarking studies highlight some specific tools, which, perhaps for that reason, are becoming more common in publications. The availability of datasets oriented towards license identification is limited, but very large datasets have been published in recent years.
Conclusions: Data scarcity and a reliance on existing tools pose significant challenges for this research area. The relatively low use of machine learning techniques and the scarcity of studies related to the classification of license texts open interesting opportunities for research, which is facilitated by the recent availability of large datasets. Additionally, researchers can benefit from readily available tools for tasks like comparison and benchmarking.
Citations: 0
Access Granted – Carefully: Securing model information in collaborative modeling
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, vol. 231, Article 112640. Pub Date: 2025-09-19. DOI: 10.1016/j.jss.2025.112640
Malvina Latifaj, Federico Ciccozzi, Antonio Cicchetti
Abstract
The collaborative nature of model-driven software engineering introduces significant challenges in safeguarding the confidentiality and integrity of the collaborative model. Existing access control mechanisms often rely on transient, virtual views lacking persistence and fine-grained permissions, making them unsuitable for scenarios requiring offline collaboration and leading to potential security breaches and user frustration. This work describes a dual-layered approach leveraging role-based access control policies to enhance security in collaborative modeling environments. The first layer utilizes multi-view modeling techniques to create materialized view models tailored to specific user roles, thereby restricting unnecessary access to the entire model. The second layer refines access at the individual element level within these view models, establishing fine-grained permissions enforced by model editors. This proactive enforcement prevents unauthorized actions before they occur, improving user experience and efficiency. The proposed approach, implemented as an Eclipse plugin and demonstrated through an illustrative example, ensures the confidentiality and integrity of shared model data by granting stakeholders access only to information relevant to their specific responsibilities and expertise. By filtering out irrelevant data, the approach also mitigates information overload, enabling stakeholders to concentrate on task-relevant aspects of the model, thereby potentially improving collaborative efficiency and effectiveness.
Citations: 0
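The collaborative-modeling abstract above describes two access-control layers: a role-specific materialized view, then element-level permissions inside that view. The sketch below illustrates that two-step check in a few lines of Python. It is a conceptual illustration only and not the authors' Eclipse plugin; every name in it (`Element`, `RolePolicy`, `materialize_view`, `can_edit`, and the sample model) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Element:
    name: str
    kind: str  # hypothetical element category, e.g. "component" or "cost"


@dataclass
class RolePolicy:
    # Layer 1: element kinds that are copied into the role's view model.
    visible_kinds: set[str]
    # Layer 2: names of elements the role may modify inside its view.
    writable: set[str] = field(default_factory=set)


def materialize_view(model: list[Element], policy: RolePolicy) -> list[Element]:
    """Layer 1: build a persistent view holding only role-relevant elements."""
    return [e for e in model if e.kind in policy.visible_kinds]


def can_edit(element: Element, view: list[Element], policy: RolePolicy) -> bool:
    """Layer 2: allow an edit only for writable elements present in the view."""
    return element in view and element.name in policy.writable


# Hypothetical model and a "tester" role that sees components but not costs.
model = [
    Element("Engine", "component"),
    Element("Budget", "cost"),
    Element("Brake", "component"),
]
tester = RolePolicy(visible_kinds={"component"}, writable={"Brake"})
view = materialize_view(model, tester)

print([e.name for e in view])            # elements visible to the tester
print(can_edit(model[2], view, tester))  # Brake: visible and writable
print(can_edit(model[0], view, tester))  # Engine: visible but read-only
```

Checking the permission before an edit is attempted mirrors the proactive enforcement the abstract mentions: the editor can disable an action up front rather than reject it afterwards.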
MC-LRNN: A logic-based neural network for multi-class software vulnerability prediction
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, vol. 231, Article 112627. Pub Date: 2025-09-18. DOI: 10.1016/j.jss.2025.112627
Yuxiang Shang, Shaoying Liu
Abstract
Software vulnerabilities are a major threat to information systems. Detecting them early and accurately is critical. Software metrics are commonly used in vulnerability prediction, but choosing the most relevant features remains a major challenge. In this paper, we present the Multi-Class Logic Rules Neural Network (MC-LRNN), a novel model that combines logic-based reasoning with neural networks for software vulnerability prediction. MC-LRNN uses a Top-Down Hill-Climbing Greedy Algorithm to extract first-order logic rules from software metrics, forming an interpretable reasoning layer that guides the learning process. The dataset is divided into a Logic Rule Dataset for rule generation and a Learning Dataset for model training and evaluation.
We evaluate MC-LRNN on three benchmark datasets (Juliet, SARD, and REVEAL) under both binary and multi-class classification settings. The results show that MC-LRNN consistently outperforms traditional baselines, handles class imbalance effectively, and generalizes well across projects. Its design provides both interpretability and strong generalization capabilities, making it well-suited for real-world vulnerability prediction. Code and datasets are available at: https://github.com/Seansyx123/LRNN.
Citations: 0
Multivariate anomaly detection and root cause analysis of energy issues in microservice-based systems
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, vol. 231, Article 112626. Pub Date: 2025-09-18. DOI: 10.1016/j.jss.2025.112626
Berta Rodriguez Sanchez, Luca Giamattei, Antonio Guerriero, Roberto Pietrantuono, Ivano Malavolta
Abstract
Context: Microservice-based systems have become the architecture style of choice for modern applications, offering scalability, flexibility, and resilience. However, their distributed nature leads to increased resource consumption and energy inefficiencies, posing challenges for maintaining sustainable operations. Accurate anomaly detection (AD) and root cause analysis (RCA) tools are critical for diagnosing energy consumption issues in these systems, yet existing solutions often lack focus on energy metrics.
Goal: This study aims to evaluate the effectiveness of AD and RCA algorithms in identifying and diagnosing performance-related energy consumption anomalies in microservice-based systems.
Method: Two representative systems, Sock Shop and Train Ticket, are deployed in controlled environments. Anomalies are then deliberately introduced by simultaneously stressing CPU, memory, and disk resources. Data is collected using Prometheus for performance metrics and Scaphandre for energy metrics. Once normal and anomalous datasets are constructed for each system, the study evaluates five AD algorithms (Birch, iForest, KNN, LOF, and SVM) and four RCA algorithms (MicroRCA, CausalRCA, CIRCA, and RCD) based on their precision, recall, and scalability across varied scenarios and workloads.
Results: The experiment reveals that, overall, iForest is the most effective AD algorithm in detecting energy anomalies (0.59 F-Score in Sock Shop and 0.634 F-Score in Train Ticket). In particular, iForest achieves better precision when the user load is high (1000 concurrent users). For RCA, CIRCA performs well in identifying root causes in smaller systems, while RCD is more scalable for larger and more complex systems.
Conclusions: The findings of this study provide insights for both researchers and practitioners. In the context of our experiment, AD algorithms tend to perform relatively well, whereas RCA algorithms tend to be imprecise in localizing energy issues.
Citations: 0