Giuseppe De Palma , Saverio Giallorenzo , Jacopo Mauro , Matteo Trentin , Gianluigi Zavattaro
{"title":"tAPP OpenWhisk: A serverless platform for topology-aware allocation priority policies","authors":"Giuseppe De Palma , Saverio Giallorenzo , Jacopo Mauro , Matteo Trentin , Gianluigi Zavattaro","doi":"10.1016/j.scico.2025.103349","DOIUrl":"10.1016/j.scico.2025.103349","url":null,"abstract":"<div><div>The Function-as-a-Service (FaaS) paradigm offers a serverless approach that abstracts the management of underlying infrastructure, enabling developers to focus on application logic. However, leveraging infrastructure-aware features can further optimize serverless performance.</div><div>We present a software prototype that enhances the Apache OpenWhisk serverless platform with a novel architecture incorporating tAPP (topology-aware Allocation Priority Policies), a declarative language designed for specifying topology-aware scheduling policies. Through a case study involving distributed data access across multiple cloud regions, we show that tAPP can significantly reduce latency and minimize performance variability compared to the standard OpenWhisk implementation.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"247 ","pages":"Article 103349"},"PeriodicalIF":1.5,"publicationDate":"2025-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144239371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating a continuous feedback strategy to enhance machine learning code smell detection","authors":"Daniel Cruz, Amanda Santana, Eduardo Figueiredo","doi":"10.1016/j.scico.2025.103346","DOIUrl":"10.1016/j.scico.2025.103346","url":null,"abstract":"<div><div>Code smells are symptoms of bad design choices implemented in the source code. Several code smell detection tools and strategies have been proposed over the years, including the use of machine learning algorithms. However, we lack empirical evidence on how expert feedback could improve machine-learning-based detection of code smells. This paper aims to propose and evaluate a conceptual strategy to improve machine-learning detection of code smells by means of continuous feedback. To evaluate the strategy, we follow an exploratory evaluation design to compare results of the smell detection before and after feedback provided by a service - acting as a software expert. We focus on four code smells - God Class, Long Method, Feature Envy, and Refused Bequest - detected in 20 Java systems. In our results, we observed that continuous feedback improves the performance of code smell detection. For the detection of the class-level code smells, God Class and Refused Bequest, we achieved an average improvement in terms of F1 of 0.13 and 0.58, respectively, after 50 iterations of feedback. For the method-level code smells, Long Method and Feature Envy, the improvements of F1 were 0.66 and 0.72, respectively. 
Our promising results are a stepping stone towards the development of new strategies and tools relying on continuous feedback for machine learning detection of code smells.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"247 ","pages":"Article 103346"},"PeriodicalIF":1.5,"publicationDate":"2025-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144239369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Earley table traversing parsers","authors":"Elizabeth Scott, Adrian Johnstone","doi":"10.1016/j.scico.2025.103335","DOIUrl":"10.1016/j.scico.2025.103335","url":null,"abstract":"<div><div>We present a version of Earley's general parsing algorithm which uses a precomputed table. Our algorithm generates a set based representation of sentence derivations, precomputed components of which are also held in the table. We give experimental results for Java and ANSI C showing that the data structures produced are considerably smaller than the corresponding Earley data structures, and that the algorithm runs faster. The algorithm retains the simplicity of Earley's approach and, without explanatory discussion, takes only about a page to fully specify. This paper contains both motivational discussion, describing a recogniser version of the algorithm first and then its extension to a parser, and a concise, bare, but complete parser specification.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"247 ","pages":"Article 103335"},"PeriodicalIF":1.5,"publicationDate":"2025-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144203431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interleaving semantics and verification of UML 2 dynamic interactions using process algebra","authors":"Aissam Belghiat","doi":"10.1016/j.scico.2025.103334","DOIUrl":"10.1016/j.scico.2025.103334","url":null,"abstract":"<div><div>UML sequence diagrams provide a visual notation for modeling the behavior of object interactions in systems. They lack precise formal semantics due to the semi-formal nature of the UML language, which hinders their automated analysis and verification. Process algebras have been widely used in the literature in order to deal with such problems. <em>π</em>-calculus is a well-known process algebra recognized for its rich theoretical foundation and high expressive power. It is also characterized by its capabilities in specifying interleaving and weak sequencing, which is considered by the OMG standard as the default semantics for interaction diagrams. Thus, this paper presents a novel approach to formalizing UML 2 sequence diagrams by translating them into <em>π</em>-calculus. The translation captures the semantics of their basic elements as well as their combined fragments. A compositional technique is adopted to gradually build the corresponding <em>π</em>-calculus specification, which results in easy induction/recursion of elements and their meaning, enabling reasoning about complex dynamic behaviors. The latter task could be done using different analysis tools such as the MWB tool used in this study. The mapping provides a formal semantics as well as formal analysis and verification for UML 2 sequence diagrams according to the OMG standard. 
A case study is shown to illustrate the usefulness of the translation.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"246 ","pages":"Article 103334"},"PeriodicalIF":1.5,"publicationDate":"2025-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144167754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Caterina Urban , Pavle Subotić , Filip Drobnjaković
{"title":"Static analysis by abstract interpretation against data leakage in machine learning","authors":"Caterina Urban , Pavle Subotić , Filip Drobnjaković","doi":"10.1016/j.scico.2025.103338","DOIUrl":"10.1016/j.scico.2025.103338","url":null,"abstract":"<div><div>Data leakage is a well-known problem in machine learning which occurs when the training and testing datasets are not independent. This phenomenon leads to unreliably overly optimistic accuracy estimates at training time, followed by a significant drop in performance when models are deployed in the real world. This can be dangerous, notably when models are used for risk prediction in high-stakes applications. In this paper, we propose an abstract interpretation-based static analysis to prove the absence of data leakage at development time, long before model deployment and even before model training. We implemented it in the <span>NBLyzer</span> framework and we demonstrate its performance and precision on 2111 Jupyter notebooks from the Kaggle competition platform.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"246 ","pages":"Article 103338"},"PeriodicalIF":1.5,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144167755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xianzhiyu Li , Kunjian Song , Mikhail R. Gadelha , Franz Brauße , Rafael S. Menezes , Konstantin Korovin , Lucas C. Cordeiro
{"title":"ESBMC v7.6: Enhanced model checking of C++ programs with clang AST","authors":"Xianzhiyu Li , Kunjian Song , Mikhail R. Gadelha , Franz Brauße , Rafael S. Menezes , Konstantin Korovin , Lucas C. Cordeiro","doi":"10.1016/j.scico.2025.103336","DOIUrl":"10.1016/j.scico.2025.103336","url":null,"abstract":"<div><div>This paper presents Efficient SMT-Based Context-Bounded Model Checker (ESBMC) v7.6, an extended version based on previous work on ESBMC v7.3 by K. Song et al. <span><span>[1]</span></span>. Version 7.3 introduced a new Clang-based C++ front-end to address the challenges posed by modern C++ programs. Although the new front-end has demonstrated significant potential in previous studies, it remains in the developmental stage and lacks several essential features. ESBMC v7.6 further enhances this foundation by adding and extending features based on the Clang AST, such as exception handling; extended memory management and memory safety verification, including dangling pointers, duplicate deallocation, memory leaks and rvalue references; and new operational models for the STL, updating the outdated C++ operational models. Our extensive experiments demonstrate that ESBMC v7.6 can handle a significantly broader range of C++ features introduced in recent versions of the C++ standard.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"246 ","pages":"Article 103336"},"PeriodicalIF":1.5,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144139534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Partha Protim Roy , Kumar Nitol , Teresa Gonçalves , Hasan Sarwar
{"title":"Software project management tools in practice in IT industry of Bangladesh","authors":"Partha Protim Roy , Kumar Nitol , Teresa Gonçalves , Hasan Sarwar","doi":"10.1016/j.scico.2025.103337","DOIUrl":"10.1016/j.scico.2025.103337","url":null,"abstract":"<div><div>The implementation of Software Project Management (SPM) has revolutionized the software development industry. For many years, Software Project Management Tools (SPMTs) have been widely adopted by software companies globally. Although the adoption of SPMTs has been slow in Bangladesh over the past few decades, there has been a growing trend of companies turning towards them. As Bangladesh strives to keep up with the rest of the world, it is important to understand how software development is managed in the country. The adoption of SPMTs has been investigated from two perspectives, so this work was conducted in two parts. In the first part, a systematic literature review attempts to explore the use of SPM and SPMTs from a global perspective, a country-specific perspective, and finally, a Bangladeshi perspective. The second part investigates the actual use of SPMTs in practice through a comprehensive survey of software companies comprising 52 questions. The analysis was based on 87 responses from participant companies. The key findings reveal that nearly 50% of the companies employ SPMTs for project management, with Jira emerging as the most popular tool, holding the largest market share at about 45%. Our study identified 10 frequently used SPMT functionalities. Users reported that lack of knowledge, cost, and perceived necessity hampered SPMT adoption. 
The insights gained can benefit researchers and policymakers in enhancing the use of these tools further and fostering improved practices in the sector for sustainable growth.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"247 ","pages":"Article 103337"},"PeriodicalIF":1.5,"publicationDate":"2025-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144253394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gabriel Aracena , Kyle Luster , Fabio Santos , Igor Steinmacher , Marco A. Gerosa
{"title":"Applying large language models to issue classification: Revisiting with extended data and new models","authors":"Gabriel Aracena , Kyle Luster , Fabio Santos , Igor Steinmacher , Marco A. Gerosa","doi":"10.1016/j.scico.2025.103333","DOIUrl":"10.1016/j.scico.2025.103333","url":null,"abstract":"<div><div>Effective prioritization of issue reports in software engineering helps to optimize resource allocation and information recovery. However, manual issue classification is laborious and lacks scalability. As an alternative, many open source software (OSS) projects employ automated processes for this task, yet this method often relies on large datasets for adequate training. Traditionally, machine learning techniques have been used for issue classification. More recently, large language models (LLMs) have emerged as powerful tools for addressing a range of software engineering challenges, including code and test generation, mapping new requirements to legacy software endpoints, and conducting code reviews. The following research investigates an automated approach to issue classification based on LLMs. By leveraging the capabilities of such models, we aim to develop a robust system for prioritizing issue reports, mitigating the necessity for extensive training data while maintaining classification reliability. In our research, we developed an LLM-based approach for accurately labeling issues by selecting two of the most prominent large language models. We then compared their performance across multiple datasets. Our findings show that GPT-4o achieved the best results in classifying issues from the NLBSE 2024 competition. Moreover, GPT-4o outperformed DeepSeek R1, achieving an F1 score 20% higher when both models were trained on the same dataset from the NLBSE 2023 competition, which was ten times larger than the NLBSE 2024 dataset. The fine-tuned GPT-4o model attained an average F1 score of 80.7%, while the fine-tuned DeepSeek R1 model achieved 59.33%. 
Increasing the dataset size did not improve the F1 score, reducing the dependence on massive datasets for building an efficient solution to issue classification. Notably, in individual repositories, some of our models predicted issue labels with a precision greater than 98%, a recall of 97%, and an F1 score of 90%.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"246 ","pages":"Article 103333"},"PeriodicalIF":1.5,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144134847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SMT-based robust model checking for signal temporal logic","authors":"Jia Lee, Geunyeol Yu, Kyungmin Bae","doi":"10.1016/j.scico.2025.103332","DOIUrl":"10.1016/j.scico.2025.103332","url":null,"abstract":"<div><div>Signal temporal logic (STL) is a temporal logic used to specify properties of continuous signals. STL has been widely applied in specifying, monitoring, and testing properties of hybrid systems that exhibit both discrete and continuous behavior. However, model checking techniques for hybrid systems have primarily been limited to invariant and reachability properties. This paper introduces bounded model checking algorithms and a tool for general STL properties of hybrid systems. Central to our technique is a novel logical foundation for STL, which includes: (i) syntactic separation, decomposing an STL formula into components, with each component depending exclusively on separate segments of a signal; (ii) signal discretization, ensuring a complete abstraction of a signal through a set of discrete elements; and (iii) <em>ϵ</em>-strengthening, reducing robust STL model checking to Boolean STL model checking. With this new foundation, the robust STL model checking problem can be reduced to the satisfiability of a first-order logic formula. This allows us to develop the first model checking algorithm for STL that can guarantee the correctness of STL up to given bound parameters and robustness threshold, along with a pioneering bounded model checker for hybrid systems, called <span>STLmc</span>. 
We demonstrate the effectiveness of <span>STLmc</span> on a number of hybrid system case studies.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"246 ","pages":"Article 103332"},"PeriodicalIF":1.5,"publicationDate":"2025-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144107524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Man Zhang , Andrea Arcuri , Yonggang Li , Yang Liu , Kaiming Xue , Zhao Wang , Jian Huo , Weiwei Huang
{"title":"Fuzzing microservices: A series of user studies in industry on industrial systems with EvoMaster","authors":"Man Zhang , Andrea Arcuri , Yonggang Li , Yang Liu , Kaiming Xue , Zhao Wang , Jian Huo , Weiwei Huang","doi":"10.1016/j.scico.2025.103322","DOIUrl":"10.1016/j.scico.2025.103322","url":null,"abstract":"<div><div>With several microservice architectures comprising thousands of web services in total, used to serve 630 million customers, companies like Meituan face several challenges in the verification and validation of their software. The use of automated techniques, especially advanced AI-based ones, could bring significant benefits here. <span>EvoMaster</span> is an open-source test case generation tool for web services, that exploits the latest advances in the field of Search-Based Software Testing research. This paper reports on our experience of integrating the <span>EvoMaster</span> tool in the testing processes at Meituan over almost 2 years (i.e., between October 2021 and July 2023). Two user studies were carried out in 2021 (with two industrial APIs) and in 2023 (with three industrial APIs) to evaluate two versions of <span>EvoMaster</span> (i.e., v1.3.0 and v1.6.1), respectively, in tackling the test generation for industrial web services which are parts of a large e-commerce microservice system. The two user studies involve in total 321,131 lines of code from these five APIs and 27 industrial participants at Meituan. Questionnaires and interviews were carried out in both user studies with the engineers and managers at Meituan. The two user studies demonstrate clear advantages of <span>EvoMaster</span> (in terms of code coverage and fault detection) and the urgent need to have such a fuzzer in industrial microservices testing. Given its clear advantages, <span>EvoMaster</span> now has been integrated into the industrial testing pipelines at Meituan. 
To study how these results could generalize, a follow-up user study was done in 2024 (with <span>EvoMaster</span> v2.0.0) with five engineers in five different companies. Our results show that, besides their clear usefulness, there are still many critical challenges that the research community needs to investigate to improve performance further.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"246 ","pages":"Article 103322"},"PeriodicalIF":1.5,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144167753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}