Xianzhiyu Li , Kunjian Song , Mikhail R. Gadelha , Franz Brauße , Rafael S. Menezes , Konstantin Korovin , Lucas C. Cordeiro
{"title":"ESBMC v7.6: Enhanced model checking of C++ programs with clang AST","authors":"Xianzhiyu Li , Kunjian Song , Mikhail R. Gadelha , Franz Brauße , Rafael S. Menezes , Konstantin Korovin , Lucas C. Cordeiro","doi":"10.1016/j.scico.2025.103336","DOIUrl":"10.1016/j.scico.2025.103336","url":null,"abstract":"<div><div>This paper presents Efficient SMT-Based Context-Bounded Model Checker (ESBMC) v7.6, an extended version based on previous work on ESBMC v7.3 by K. Song et al. <span><span>[1]</span></span>. The v7.3 introduced a new Clang-based C++ front-end to address the challenges posed by modern C++ programs. Although the new front-end has demonstrated significant potential in previous studies, it remains in the developmental stage and lacks several essential features. ESBMC v7.6 further enhanced this foundation by adding and extending features based on the Clang AST, such as <figure><img></figure> exception handling, <figure><img></figure> extended memory management and memory safety verification, including dangling pointers, duplicate deallocation, memory leaks and rvalue references and <figure><img></figure> new operational models for STL updating the outdated C++ operational models. Our extensive experiments demonstrate that ESBMC v7.6 can handle a significantly broader range of C++ features introduced in recent versions of the C++ standard.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"246 ","pages":"Article 103336"},"PeriodicalIF":1.5,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144139534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gabriel Aracena , Kyle Luster , Fabio Santos , Igor Steinmacher , Marco A. Gerosa
{"title":"Applying large language models to issue classification: Revisiting with extended data and new models","authors":"Gabriel Aracena , Kyle Luster , Fabio Santos , Igor Steinmacher , Marco A. Gerosa","doi":"10.1016/j.scico.2025.103333","DOIUrl":"10.1016/j.scico.2025.103333","url":null,"abstract":"<div><div>Effective prioritization of issue reports in software engineering helps to optimize resource allocation and information recovery. However, manual issue classification is laborious and lacks scalability. As an alternative, many open source software (OSS) projects employ automated processes for this task, yet this method often relies on large datasets for adequate training. Traditionally, machine learning techniques have been used for issue classification. More recently, large language models (LLMs) have emerged as powerful tools for addressing a range of software engineering challenges, including code and test generation, mapping new requirements to legacy software endpoints, and conducting code reviews. The following research investigates an automated approach to issue classification based on LLMs. By leveraging the capabilities of such models, we aim to develop a robust system for prioritizing issue reports, mitigating the necessity for extensive training data while maintaining classification reliability. In our research, we developed an LLM-based approach for accurately labeling issues by selecting two of the most prominent large language models. We then compared their performance across multiple datasets. Our findings show that GPT-4o achieved the best results in classifying issues from the NLBSE 2024 competition. Moreover, GPT-4o outperformed DeepSeek R1, achieving an F1 score 20% higher when both models were trained on the same dataset from the NLBSE 2023 competition, which was ten times larger than the NLBSE 2024 dataset. The fine-tuned GPT-4o model attained an average F1 score of 80.7%, while the fine-tuned DeepSeek R1 model achieved 59.33%. Increasing the dataset size did not improve the F1 score, reducing the dependence on massive datasets for building an efficient solution to issue classification. Notably, in individual repositories, some of our models predicted issue labels with a precision greater than 98%, a recall of 97%, and an F1 score of 90%.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"246 ","pages":"Article 103333"},"PeriodicalIF":1.5,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144134847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SMT-based robust model checking for signal temporal logic","authors":"Jia Lee, Geunyeol Yu, Kyungmin Bae","doi":"10.1016/j.scico.2025.103332","DOIUrl":"10.1016/j.scico.2025.103332","url":null,"abstract":"<div><div>Signal temporal logic (STL) is a temporal logic used to specify properties of continuous signals. STL has been widely applied in specifying, monitoring, and testing properties of hybrid systems that exhibit both discrete and continuous behavior. However, model checking techniques for hybrid systems have primarily been limited to invariant and reachability properties. This paper introduces bounded model checking algorithms and a tool for general STL properties of hybrid systems. Central to our technique is a novel logical foundation for STL, which includes: (i) syntactic separation, decomposing an STL formula into components, with each component depending exclusively on separate segments of a signal; (ii) signal discretization, ensuring a complete abstraction of a signal through a set of discrete elements; and (iii) <em>ϵ</em>-strengthening, reducing robust STL model checking to Boolean STL model checking. With this new foundation, the robust STL model checking problem can be reduced to the satisfiability of a first-order logic formula. This allows us to develop the first model checking algorithm for STL that can guarantee the correctness of STL up to given bound parameters and robustness threshold, along with a pioneering bounded model checker for hybrid systems, called <span>STLmc</span>. We demonstrate the effectiveness of <span>STLmc</span> on a number of hybrid system case studies.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"246 ","pages":"Article 103332"},"PeriodicalIF":1.5,"publicationDate":"2025-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144107524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Certified control for train sign classification","authors":"Jan Roßbach, Michael Leuschel","doi":"10.1016/j.scico.2025.103323","DOIUrl":"10.1016/j.scico.2025.103323","url":null,"abstract":"<div><div>Certified control makes it possible to use artificial intelligence for safety-critical systems. It is a runtime monitoring architecture, which requires an AI to provide certificates for its decisions; these certificates can then be checked by a separate classical system. In this article, we evaluate the practicality of certified control for providing formal guarantees about an AI-based perception system. In this case study, we implemented a certificate checker that uses classical computer vision algorithms to verify railway signs detected by an AI object detection model. We have integrated this prototype with the popular object detection model YOLO. Performance metrics on generated data are promising for the use-case, but further research is needed to generalize certified control for other tasks.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"246 ","pages":"Article 103323"},"PeriodicalIF":1.5,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144068896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward a prioritization approach for third-party software library updates","authors":"Abdalrahman Aburakhia , Mohammad Alshayeb","doi":"10.1016/j.scico.2025.103331","DOIUrl":"10.1016/j.scico.2025.103331","url":null,"abstract":"<div><div>Third-party libraries (TPLs) have been widely used in software development. Recent studies showed that software developers struggle to manage the dependencies between third-party libraries for many reasons, such as unknown update efforts and the lack of awareness about related security issues. To overcome these limitations, in this paper, we propose a TPL update prioritization approach, which provides valuable insights for mobile app developers to help improve the decision-making process. We investigate mobile app developers’ behavior while updating TPLs through a survey with 39 practitioners. The results clearly show the need for a prioritization approach. To gain more insight into TPL, we propose five TPL categories (Compatibility, Accessibility, Maintenance, Business Value, and Security) and propose metrics to measure the related factors of each category. We utilize the Analytical Hierarchy Process (AHP) and the Simple Additive Weighting (SAW) methods to rank the libraries for the update and automate the approach via a chatbot. We conducted a case study with 7 participants. Most participants (82 %) found the bot’s results useful; moreover, the bot can save software developers around 4 min per task, with an average of 18 s per task compared to 243 s by the baseline.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"246 ","pages":"Article 103331"},"PeriodicalIF":1.5,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144068895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"“Your AI is impressive, but my code does not have any bugs” managing false positives in industrial contexts","authors":"Szymon Stradowski , Lech Madeyski","doi":"10.1016/j.scico.2025.103320","DOIUrl":"10.1016/j.scico.2025.103320","url":null,"abstract":"<div><h3>Context</h3><div>“Your AI is impressive, but my code does not contain any bugs”— such a statement from a software developer is the antithesis of a quality mindset and open communication. What makes it worse is that it is oftentimes true.</div></div><div><h3>Objective</h3><div>This paper analyses false positives' impact and related challenges in machine learning software defect prediction and describes the mitigation possibilities.</div></div><div><h3>Methods</h3><div>We propose a broad-picture perspective on dealing with false positive predictions based on what we learned from our industrial implementation study in Nokia 5G.</div></div><div><h3>Results</h3><div>Accordingly, we draw a new direction in transitioning defect prediction into a well-established industry practice, as well as highlight potential emerging topics in predictive software engineering.</div></div><div><h3>Conclusion</h3><div>Increasing human buy-in and the business impact of predictions significantly improves the chances of future software defect prediction industry adoptions to succeed.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"246 ","pages":"Article 103320"},"PeriodicalIF":1.5,"publicationDate":"2025-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143924121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating quality aspects for UX evaluation of IoT-based applications in smart cities: A literature review","authors":"Joelma Choma, Luciana Zaina","doi":"10.1016/j.scico.2025.103319","DOIUrl":"10.1016/j.scico.2025.103319","url":null,"abstract":"<div><div>The Internet of Things (IoT) has increasingly gained prominence in developing smart cities. IoT technologies are essential resources to make smart cities more efficient and sustainable. Recent research in Software Engineering (SE) has investigated the characteristics of IoT systems and the most appropriate approaches to their design and development. The development of systems based on IoT technologies enables a continuous flow of communication in the context of a smart city by allowing different systems to interact and adjust automatically to optimize the city's operation. In an urban environment, IoT connects a vast network of devices such as environmental sensors, public transportation systems, smart traffic lights, security cameras, and more. These characteristics make these applications complex and difficult to evaluate, particularly regarding User Experience (UX) design. Recently, we performed a rapid systematic review to examine the methods and practices commonly employed for evaluating the UX in these scenarios. In our previous work, we analyzed 43 studies covering different types of IoT-based applications and areas of smart cities. In this study, we extend our analysis by exploring which quality aspects have been considered for UX evaluation and categorizing the typical applications evaluated. Our findings revealed the need for more appropriate UX instruments to assess quality aspects that consider specific features of non-traditional interfaces (e.g., haptics, gesture, speech) and smart technologies within specific interaction contexts (e.g., smart environments based on ubiquitous computing). These instruments can be expanded from established guidelines or developed from scratch as long as they are validated in practice.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"245 ","pages":"Article 103319"},"PeriodicalIF":1.5,"publicationDate":"2025-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143882140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing MoWebA's adaptability across architectures","authors":"Magalí González, Luca Cernuzzi","doi":"10.1016/j.scico.2025.103318","DOIUrl":"10.1016/j.scico.2025.103318","url":null,"abstract":"<div><div>Specifying architectural properties is still an open issue for Model Driven Web Engineering to address portability, adaptability, and evolution. Several Model-Driven Web methods and methodologies consider extensions that enrich the Platform Independent Model (PIM) or the Platform Specific Model (PSM) to include elements of a particular platform or architecture. However, the degree of independence of the model is critical to achieving adaptability and evolution. Therefore, some authors have proposed a new layer called an Architecture Specific Model (ASM) to model architectural properties. Some evidence suggests that adopting ASM as an intermediate stage between PIM and PSM is a way to facilitate the evolution of web system. This paper focuses on the Architecture Specific Model (ASM) of MoWebA (Model Oriented Web Approach), and analyzes its impact on adaptability across different architectural styles. A case study is presented to validate this issue by extending MoWebA to three different architectures. In such extensions, we analyze the degree of adaptability of MoWebA and the automation of PIM-ASM, as well as the degree of independence of the PIM metamodel. In addition, through three types of questionnaire and other quantitative data, the study analyzes user satisfaction with the adoption of MoWebA.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"245 ","pages":"Article 103318"},"PeriodicalIF":1.5,"publicationDate":"2025-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143903524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiangyu Mu , Xuan Zhang , Chenlu Zhu , Ning Li , Peng Zhang , Lei Liu
{"title":"Testing non-commutativity of reduce functions with multi-column inputs","authors":"Xiangyu Mu , Xuan Zhang , Chenlu Zhu , Ning Li , Peng Zhang , Lei Liu","doi":"10.1016/j.scico.2025.103317","DOIUrl":"10.1016/j.scico.2025.103317","url":null,"abstract":"<div><div>With the continuous development of the MapReduce programming model, it is necessary to ensure the reliability of MapReduce programs. In practice, the non-commutativity of Reduce functions seriously affects the reliability of the MapReduce program, which is difficult to debug and even causes errors. Current researches on the non-commutability detection of Reduce function consider the case that the input value is a single attribute. However, such researches ignore the situation where inputs to most reduce functions in practical applications consist of multiple columns (such as a table). To test the commutativity of reduce functions where each input record may contain several input attributes, a new testing method is proposed. This approach uses symbolic execution tools to help generate a few input records, and breaks their data dependencies to generate an original test case t<sub>0</sub>, with a dynamic program slicing technique to lessen the scale of t<sub>0</sub>. And the ultimate test suite is consisted of different permutations of records in t<sub>0</sub>. In the end, experiments demonstrate the effectiveness of our testing method and that the permutation method Gm is helpful to reduce its complexity.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"245 ","pages":"Article 103317"},"PeriodicalIF":1.5,"publicationDate":"2025-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143856028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Formal specification and SMT verification of quantized neural network for autonomous vehicles","authors":"Wahiba Bachiri , Yassamine Seladji , Pierre-Loïc Garoche","doi":"10.1016/j.scico.2025.103316","DOIUrl":"10.1016/j.scico.2025.103316","url":null,"abstract":"<div><div>The complexity of Autonomous Vehicles imposes significant challenges to their formal specification and verification, especially when incorporating AI controllers based on quantized neural networks (QNNs), which use fixed-point arithmetic to accommodate the limited computational capabilities of embedded systems. Despite the advantages of QNNs, verification of these networks, whether using integers or bit vectors, has proven to be <span>PSPACE</span>-hard.</div><div>Our approach focuses on exhaustively verifying abstract scenarios expressed as Satisfiability Modulo Theories (SMT) proof objectives. We propose a formal verification method for QNNs that involves analyzing a rational approximation of the network with perturbations to ensure that the output sets of the perturbed rational neural network include those of both the QNN and its rational neural network approximation.</div><div>The distance between these output sets is computed using the <em>p</em>-norm. To evaluate our methodology, we used the <span>Highway-env</span> autonomous vehicle simulator and z3 SMT solver.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"245 ","pages":"Article 103316"},"PeriodicalIF":1.5,"publicationDate":"2025-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143856029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}