{"title":"CSAT: Configuration structure-aware tuning for highly configurable software systems","authors":"Yufei Li , Liang Bao , Kaipeng Huang , Chase Wu","doi":"10.1016/j.jss.2024.112316","DOIUrl":"10.1016/j.jss.2024.112316","url":null,"abstract":"<div><div>Many modern software systems provide numerous configuration options with a large parameter space that users can adjust for specific running environments. However, configuring such systems always incurs an undue burden on users due to the lack of domain knowledge to understand complex interactions between the performance and the parameters. To address this issue, various tuning techniques have been developed to automatically determine the optimal configuration by either directly searching the configuration space or learning a surrogate model to guide the exploration process. Most previous studies only apply simple search strategies to explore the complex configuration space, which often leads to fruitless attempts in suboptimal areas. Inspired by previous studies, we define configuration structures to describe the positions of various configurations in the performance space of software systems. This idea leads to the design of a novel Configuration Structure-Aware Tuning (CSAT) algorithm. CSAT constructs a structure model for system configurations using the framework of Adaptive Network-based Fuzzy Inference System (ANFIS), learns a comparison-based distribution model through Gaussian Process Regression (GPR), and uses Bayesian Inference to generate potentially promising configurations based on the structure. The experimental results demonstrate that in terms of tuning performance, on average, CSAT outperforms default configurations by 65.51% and outperforms six state-of-the-art tuning algorithms by 22.10%–33.20%. 
In terms of handling internal constraints, CSAT achieves an average probability of 0.767 in generating valid configurations.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"222 ","pages":"Article 112316"},"PeriodicalIF":3.7,"publicationDate":"2024-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143104100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Atlas, a modular and efficient open-source BFT framework","authors":"Nuno Neto , Rolando Martins , Luís Veiga","doi":"10.1016/j.jss.2024.112317","DOIUrl":"10.1016/j.jss.2024.112317","url":null,"abstract":"<div><div>Over the last few decades, a large body of research has covered Byzantine Fault Tolerance (BFT) systems. This research has brought forward new techniques, including, but not limited to, techniques for ordering operations (Abraham et al., 2018; Buchman, 2016; Guo et al., 2020; Bessani et al., 2014; Duan et al., 2018) and state transfer (Bessani et al., 2013; <span><span>Distler, 2021</span></span>, <span><span>Eischer et al., 2019</span></span>), on networks that suffer from Byzantine faults. More recently, the ongoing research on distributed ledgers has reignited interest in BFT, due to its high throughput when compared to other alternatives for Byzantine consensus (<span><span>Vukolić, 2016</span></span>).</div><div>In this paper we present three contributions covering several aspects, including modular and extensible framework design and implementation, system optimization through development of better networking alternatives, a greater use of parallelism, several ordering protocol improvements and an extensive comparative assessment of previous state-of-the-art approaches.</div><div>First, we introduce Atlas, an open-source modular BFT framework that aims to support the research and development of highly efficient BFT protocols, by decoupling traditionally entangled sub-protocols, e.g., the consensus primitive from execution (Bessani et al., 2014), and deferment of log management to replicated services from state transfer. 
Atlas further allows modules to be provided that can be re-used across different BFT approaches, such as deterministic and probabilistic/randomized models.</div><div>Second, we present FeBFT, a new BFT implementation developed upon Atlas that combines pre-existing proven ideas from PBFT, namely its 3-phase consensus and view-change protocol. This base approach is then extended with novel optimizations of the protocol, namely, multi-leader proposals (Stathakopoulou et al., 2019), multi-instance consensus execution (Stathakopoulou et al., 2022; Behl et al., 2015), and a configurable batching solution, which together allow us to reduce latency while improving throughput.</div><div>Third, we offer a comprehensive evaluation comparing our work with other state-of-the-art BFT-SMR implementations, namely, Atlas (<span><span>Neto et al., 2024a</span></span>) with FeBFT (Official febft repository 2024), BFT-SMaRt (Bessani et al., 2014) and Themis (Rüsch et al., 2019).</div><div>With these contributions, we aim to lay the groundwork to: (i) improve reusability and hence productivity in BFT(-SMR) development; (ii) increase system safety, performance, and scalability, and reduce recovery time with the optimizations proposed; (iii) draw insights on the bottlenecks preventing order-of-magnitude improvements in BFT processing from a system’s perspective; and lastly, (iv) improve reproducibility between different BFT (sub-)protocols by allowing for true apples-to-apples comparisons.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"222 ","pages":"Article 112317"},"PeriodicalIF":3.7,"publicationDate":"2024-12-09","publicationTypes":"Journal 
Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143103984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Software engineering education: Results from a training intervention based on SonarCloud when developing web apps","authors":"Sabato Nocera, Simone Romano, Rita Francese, Giuseppe Scanniello","doi":"10.1016/j.jss.2024.112308","DOIUrl":"10.1016/j.jss.2024.112308","url":null,"abstract":"<div><div>Past research suggests that Computer Science (CS) undergraduate students are not equipped to manage quality characteristics such as security, reliability, and maintainability. Filling this gap should ease the integration of CS undergraduates into the labor market after graduation. To better prepare students for such a market, we introduced a training intervention in our <em>Software Technologies for the Web</em> (<em>STW</em>) course in the academic year (a.y.) 2022–23. Our intervention focused on security, <em>i.e.,</em> students were trained on secure development and were asked to use <em>SonarCloud</em>. To assess this intervention, we compared the web apps developed in a.y. 2021–22 and a.y. 2022–23 and observed that security improved significantly in the a.y. 2022–23 web apps. To understand whether and to what extent our training intervention triggered autonomous motivation in the students (a.y. 2022–23) on reliability and maintainability, we also compared the web apps of a.y. 2021–22 and a.y. 2022–23 on these issues. To that end, we did not ask students to deal with reliability and maintainability. This part of our research is presented in this paper for the first time and revealed that the web apps of a.y. 2022–23 are more reliable and maintainable than those of a.y. 
2021–22.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"222 ","pages":"Article 112308"},"PeriodicalIF":3.7,"publicationDate":"2024-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143104099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dear researchers step 1: Find a team with a problem","authors":"Eoin Woods","doi":"10.1016/j.jss.2024.112318","DOIUrl":"10.1016/j.jss.2024.112318","url":null,"abstract":"","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"222 ","pages":"Article 112318"},"PeriodicalIF":3.7,"publicationDate":"2024-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143104117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ranking approaches for similarity-based web element location","authors":"Riccardo Coppola , Robert Feldt , Michel Nass , Emil Alégroth","doi":"10.1016/j.jss.2024.112286","DOIUrl":"10.1016/j.jss.2024.112286","url":null,"abstract":"<div><h3>Context:</h3><div>GUI-based tests for web applications frequently suffer from fragility, i.e., regression tests fail due to changing properties of the web elements. The locators used in the scripts, i.e., the means of identifying the elements of the GUI, are the most influential factor in fragility.</div></div><div><h3>Objective:</h3><div>We extend a state-of-the-art Multi-Locator solution that considers 14 locators from the DOM model of a web application, and identifies overlapping nodes in the DOM tree (VON-Similo). We augment the approach with standard Machine Learning and Learning to Rank (LTR) approaches to aid the location of web elements.</div></div><div><h3>Method:</h3><div>We document an experiment with a ground truth of 1163 web element pairs, taken from different releases of 40 web applications, to compare the robustness of the algorithms to locator weight change, and the performance of LTR approaches in terms of MeanRank and PctAtN.</div></div><div><h3>Results:</h3><div>Using LTR algorithms, we obtain a maximum probability of finding the correct target at the first position of 88.4% (lowest 82.57%), and among the first three positions of 94.79% (lowest 91.86%). 
The best mean rank of the correct candidate is 1.57.</div></div><div><h3>Conclusion:</h3><div>The similarity-based approach proved to be highly dependable in the context of web application testing, where a low percentage of matching errors can still be accepted.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"222 ","pages":"Article 112286"},"PeriodicalIF":3.7,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143103985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LogOW: A semi-supervised log anomaly detection model in open-world setting","authors":"Jingwei Ye , Chunbo Liu , Zhaojun Gu , Zhikai Zhang , Xuying Meng , Weiyao Zhang , Yujun Zhang","doi":"10.1016/j.jss.2024.112305","DOIUrl":"10.1016/j.jss.2024.112305","url":null,"abstract":"<div><div>Log anomaly detection is a method for finding abnormal behavior and faults in systems. However, existing methods face two main challenges: the open-world problem and the cold-start problem. The open-world problem means that the test set may contain new classes that are not in the training set, while the cold-start problem means that the initial training data are scarce, both for normal and abnormal log sequences. Most existing methods assume a closed-world setting and rely on sufficient normal data, which limits their adaptability to new log environments.</div><div>We propose LogOW, a novel log anomaly detection model that can learn from a few normal log sequences. The model finds emerging normal log sequences in the open-world setting through the <strong>open-world sample retrieval</strong> module. Through the <strong>incremental pre-training</strong> module, the model parameters are then fine-tuned online on these log sequences.</div><div>First, we train a basic model from normal log sequences using Masked Language Modeling (MLM). During the testing phase, we then combine the anomaly score and the uncertainty score obtained through a novel dynamic multi-mask to identify closed-world normal log sequences in the test set. Next, we cluster the open-world log sequences based on fused sequence and count features, and identify the abnormal ones and the new normal ones. Finally, we update our model with the new normal sequences in the next time period. 
Experiments on three log datasets and real-world airport logs show that our model outperforms traditional models in the open-world setting and when training data are scarce.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"222 ","pages":"Article 112305"},"PeriodicalIF":3.7,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143104091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data race detection via few-shot parameter-efficient fine-tuning","authors":"Yuanyuan Shen , Manman Peng , Fan Zhang , Qiang Wu","doi":"10.1016/j.jss.2024.112289","DOIUrl":"10.1016/j.jss.2024.112289","url":null,"abstract":"<div><div>The OpenMP programming model is playing an increasing role in parallelization on shared-memory systems owing to its simplicity of operation and portability. OpenMP provides the semantic equivalent of a parallel program for the original sequential program. Though it is easier to write parallel programs using OpenMP, writing them correctly is a challenge. Data race errors are easily introduced during development, particularly by inexperienced programmers. Some data race checkers have been developed to help programmers check for data races in parallel programs. However, several of them have constraints on the input and thread configuration, time overhead, and scope of program analysis. In this study, we target data race detection in OpenMP parallel programs to address the constraints of existing checkers. We propose a few-shot parameter-efficient fine-tuning method using an adapter module to detect data races. The proposed method does not require a large labeled dataset, which makes it data-efficient. A generic dataset is constructed with a limited number of labeled samples, containing diverse OpenMP patterns for data race detection. A neural architecture search approach is employed to improve detection performance. 
The experimental results on the generated and open-source datasets demonstrate that our method is effective and improves race detection compared with traditional methods.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"222 ","pages":"Article 112289"},"PeriodicalIF":3.7,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143103989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SPL-DB-Sync: Seamless database transformation during feature-driven changes","authors":"Delfina Ramos-Vidal , Wesley K.G. Assunção , Alejandro Cortiñas , Miguel R. Luaces , Oscar Pedreira , Ángeles Saavedra Places","doi":"10.1016/j.jss.2024.112285","DOIUrl":"10.1016/j.jss.2024.112285","url":null,"abstract":"<div><div>Software Product Line (SPL) Engineering is a reuse-oriented approach to developing a suite of software products that share common components but vary in specific features. The advantages of SPLs (e.g., reducing development costs and time while improving quality) have already been proven in practice. However, despite the success in deriving new products from an SPL, challenges arise in evolving existing products. Altering the feature selection (e.g., adding or removing a feature) for an already existing product poses a challenge regarding the application data stored and managed by derived products, particularly when the features impact an already populated database. In many cases, these modifications imply loss of data or constraint violations. However, in both the state of the art and practice, there are no approaches to support feature and data evolution simultaneously for SPL products.</div><div>This paper reports a novel evolution approach, SPL-DB-Sync, with actions required for database adjustments when adding or removing features for existing SPL products. Actions delineate modifications necessary within the database. These modifications are associated with the SPL features and linked to the components of the data model they influence. SPL-DB-Sync facilitates the automatic readjustment of the database while preserving clear traceability between features and elements of the data model. The applicability of our evolution model is detailed in four practical scenarios of in-production products of an SPL for Digital Libraries. 
The contributions of this work are: present a novel evolution approach for SPLs with databases; define an SPL Evolution Model considering data transformation/migration; advance the state of practice between software reuse and data management; and provide insights for practitioners that face the same challenges of evolving both business logic and its data in software products.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"222 ","pages":"Article 112285"},"PeriodicalIF":3.7,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143103990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Impact on energy consumption of design patterns, code smells and refactoring techniques: A systematic mapping study","authors":"Olivia Poy, Ma Ángeles Moraga, Félix García, Coral Calero","doi":"10.1016/j.jss.2024.112303","DOIUrl":"10.1016/j.jss.2024.112303","url":null,"abstract":"<div><div>Software energy efficiency is an increasingly relevant aspect that should be taken into account during software development. Some of the most common design and coding decisions include the use of design patterns, the removal of code smells and the use of refactoring techniques. Therefore, the aim of this systematic mapping study is to provide an overview of the impact that design patterns, code smells, and refactoring techniques have on the energy consumption of software. This may assist practitioners in developing energy-efficient software as well as the research community in undertaking further research on the topic. The results of the primary studies showed that design patterns, code smells and refactoring techniques impact software energy consumption. However, not all of them have a clear positive or negative effect, which requires further study. Overall, the use of design patterns seems to have a negative impact on energy consumption, whereas the removal of code smells tends to have a more positive impact. 
We found no conclusive result on the relationship between using refactoring techniques and energy consumption.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"222 ","pages":"Article 112303"},"PeriodicalIF":3.7,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143103988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evolution of code technical debt in microservices architectures","authors":"Kevin Maggi, Roberto Verdecchia, Leonardo Scommegna, Enrico Vicario","doi":"10.1016/j.jss.2024.112301","DOIUrl":"10.1016/j.jss.2024.112301","url":null,"abstract":"<div><h3>Context:</h3><div>Microservices are gaining significant traction in academic research and industry due to their advantages, and technical debt has long been a heavily researched metric in the context of software quality. However, to date, no study has attempted to understand how code technical debt evolves in such architectures.</div></div><div><h3>Aim:</h3><div>This research aims to understand how technical debt evolves over time in microservice architectures by investigating its trends, patterns, and potential relations with the number of microservices.</div></div><div><h3>Method:</h3><div>We analyze the technical debt evolution of 13 open-source projects. We collect data from systems through automated source code analysis, statistically analyze results to identify technical debt trends and correlations with the number of microservices, and conduct a subsequent manual commit inspection.</div></div><div><h3>Results:</h3><div>Technical debt increases over time, with periods of stability. The growth is related to the number of microservices, but its rate is not. The analysis revealed trend differences between initial development phases and later stages. Different activities can introduce technical debt, while its removal relies mainly on refactoring.</div></div><div><h3>Conclusions:</h3><div>Microservices independence is fundamental to keeping technical debt under control and compartmentalized. 
The findings underscore the importance of technical debt management strategies to support the long-term success of microservices.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"222 ","pages":"Article 112301"},"PeriodicalIF":3.7,"publicationDate":"2024-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143103986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}