Rodrigo Oliveira Zacarias , Léo Carvalho Ramos Antunes , Márcio de Oliveira Barros , Rodrigo Pereira dos Santos , Patricia Lago
{"title":"Exploring developer experience factors in software ecosystems","authors":"Rodrigo Oliveira Zacarias , Léo Carvalho Ramos Antunes , Márcio de Oliveira Barros , Rodrigo Pereira dos Santos , Patricia Lago","doi":"10.1016/j.jss.2025.112549","DOIUrl":"10.1016/j.jss.2025.112549","url":null,"abstract":"<div><h3>Context:</h3><div>Developer experience (DX) plays a key role in developers’ performance and their continued involvement in a software ecosystem (SECO) platform. While researchers and practitioners have recognized several factors affecting DX in SECO platforms, a clear roadmap of the most influential factors is still missing. This is particularly important given the direct impact on developers’ interest in SECO and their ongoing engagement with the common technological platform.</div></div><div><h3>Goal:</h3><div>This work aims to identify key DX factors and understand how they influence third-party developers’ decisions to adopt and keep contributing to a SECO.</div></div><div><h3>Methods:</h3><div>We conducted a systematic mapping study (SMS), analyzing 29 studies to assess the state-of-the-art of DX in SECO. Additionally, we conducted a Delphi study to evaluate the influence of 27 DX factors (identified in our SMS) from the perspective of 21 third-party developers to adopt and keep contributing to a SECO.</div></div><div><h3>Results:</h3><div>The factors that most strongly influence developers’ adoption and ongoing contributions to a SECO are: “financial costs for using the platform”, “desired technical resources for development”, “low barriers to entry into the applications market”, and “more financial gains”.</div></div><div><h3>Conclusion:</h3><div>DX is essential for the success and sustainability of SECO. 
Our set of DX factors provides valuable insights and recommendations for researchers and practitioners to address key DX concerns from the perspective of third-party developers.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"230 ","pages":"Article 112549"},"PeriodicalIF":4.1,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144749136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Muhammad Waseem , Aakash Ahmad , Peng Liang , Muhammad Azeem Akbar , Arif Ali Khan , Iftikhar Ahmad , Manu Setälä , Tommi Mikkonen
{"title":"Containerization in multi-cloud environment: Roles, strategies, challenges, and solutions for effective implementation","authors":"Muhammad Waseem , Aakash Ahmad , Peng Liang , Muhammad Azeem Akbar , Arif Ali Khan , Iftikhar Ahmad , Manu Setälä , Tommi Mikkonen","doi":"10.1016/j.jss.2025.112558","DOIUrl":"10.1016/j.jss.2025.112558","url":null,"abstract":"<div><div>Containerization in multi-cloud environments has received significant attention in recent years both from academic research and industrial development perspectives. However, there exists no effort to systematically investigate the state of research on this topic. The aim of this research is to systematically identify and categorize the multiple aspects of containerization in multi-cloud environments. We conducted a Systematic Mapping Study (SMS) on the literature published between January 2013 and July 2024. One hundred twenty-one studies were selected and the key results are: (1) Four leading themes on containerization in multi-cloud environments are identified: ‘Scalability and High Availability’, ‘Performance and Optimization’, ‘Security and Privacy’, and ‘Multi-Cloud Container Monitoring and Adaptation’. (2) Ninety-eight patterns and strategies for containerization in multi-cloud environments were classified across 10 subcategories and 4 categories. (3) Ten quality attributes were identified, with 47 associated tactics. (4) Four catalogs consisting of challenges and solutions related to security, automation, deployment, and monitoring were introduced. 
The results of this SMS will assist researchers and practitioners in pursuing further studies on containerization in multi-cloud environments and developing specialized solutions for containerization applications in multi-cloud environments.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"230 ","pages":"Article 112558"},"PeriodicalIF":4.1,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144780882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marc Herrmann , Alexander Specht , Abdurrahman Sekerci , Martin Obaidi , Marco Ehl , Duaa Adel Ali Elsofi , Katharina Großer , Jil Klünder , Jan Jürjens , Kurt Schneider
{"title":"From missile warhead to smart fridge: Interviews with industry experts on tracing safety- and security-relevant artifacts","authors":"Marc Herrmann , Alexander Specht , Abdurrahman Sekerci , Martin Obaidi , Marco Ehl , Duaa Adel Ali Elsofi , Katharina Großer , Jil Klünder , Jan Jürjens , Kurt Schneider","doi":"10.1016/j.jss.2025.112551","DOIUrl":"10.1016/j.jss.2025.112551","url":null,"abstract":"<div><div>Ensuring traceability of safety- and security-related artifacts is vital in software development to comply with standards and mitigate risks. Despite its importance, the practical implementation of defining and tracing safety- and security-relevant artifacts remains ambiguous. Based on eight semi-structured interviews with industry experts, this work explores the definitions, methods, processes, and challenges of tracing safety- and security-related artifacts. The interviews revealed that definitions of safety- and security-relevant artifacts are highly context-dependent, shaped by regulatory standards, internal processes, technical characteristics, and practitioner judgment. Rather than signaling a deficiency, this variability reflects the inherently multifaceted nature of safety and security work, where artifact classification emerges from practical reasoning rather than strict or universal criteria. Tools play a key role in supporting traceability, and cross-team alignment remains a concern in practice. Our findings provide actionable insights for organizations seeking to strengthen traceability. 
The recommendations encourage the development of internal classification criteria, support effective collaboration with external partners, support guidance, onboarding, and training, and help align practices across teams, fostering more reliable and transparent management of safety- and security-relevant artifacts.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"230 ","pages":"Article 112551"},"PeriodicalIF":3.7,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144704733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel LLM-based classifier for predicting bug-fixing time in Bug Tracking Systems","authors":"Pasquale Ardimento , Michele Capuzzimati , Gabriella Casalino , Daniele Schicchi , Davide Taibi","doi":"10.1016/j.jss.2025.112569","DOIUrl":"10.1016/j.jss.2025.112569","url":null,"abstract":"<div><div>Predicting whether a newly submitted bug will be resolved quickly or slowly is a crucial aspect of the bug triage process, as it enables project managers to estimate software maintenance efforts and manage development workflows more effectively. This paper proposes a deep learning approach for classifying bug reports into two categories—<em>FAST</em> or <em>SLOW</em>—based on their expected fixing time. The method leverages a feature set composed of the bug description and reporter comments and adopts a transfer learning strategy using pre-trained Large Language Models (LLMs). The problem is framed as a supervised text classification task, where LLMs exploit their ability to learn rich contextual representations of language. We introduce a novel classification workflow that guides the LLM through a structured prompt, combining two design patterns: the persona pattern to contextualize the task and the input semantic pattern to organize textual information. The workflow relies on zero-shot learning to assess whether the intrinsic knowledge embedded in the LLMs is sufficient for this prediction task. We conducted a comprehensive evaluation of three state-of-the-art LLMs across multiple real-world datasets sourced from Bugzilla, encompassing a diverse range of software projects. The experimental results demonstrate that the proposed method is effective in accurately identifying fast-resolving bugs. Among the evaluated models, <span>LLaMA3-8B</span> consistently delivered superior performance. Additionally, the absence of statistically significant performance variations across datasets highlights the generalizability of the approach. 
Notably, the LLMs maintained strong performance even on small and imbalanced datasets, underscoring their robustness and practical applicability in real-world, data-scarce scenarios.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"230 ","pages":"Article 112569"},"PeriodicalIF":3.7,"publicationDate":"2025-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144685990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dulaji Hidellaarachchi , John Grundy , Rashina Hoda
{"title":"The role of humour in software engineering—A literature review and preliminary taxonomy","authors":"Dulaji Hidellaarachchi , John Grundy , Rashina Hoda","doi":"10.1016/j.jss.2025.112560","DOIUrl":"10.1016/j.jss.2025.112560","url":null,"abstract":"<div><div>Humour has long been recognized as a key factor in enhancing creativity, group effectiveness, and employee well-being across various domains. However, its occurrence and impact within software engineering (SE) teams remain under-explored. This paper introduces a comprehensive, literature review-based taxonomy exploring the characterization and use of humour in SE teams, with the goal of boosting productivity, improving communication, and fostering a positive work environment while emphasizing the responsible use of humour to mitigate its potential negative impacts. Drawing from a wide array of studies in psychology, sociology, and organizational behaviour, our proposed framework categorizes humour into distinct theories, styles, models, and scales, offering SE professionals and researchers a structured approach to understanding humour in their work. This study also addresses the unique challenges of applying humour in SE, highlighting its potential benefits while acknowledging the need for further empirical validation in this context. 
Ultimately, our study aims to pave the way for more cohesive, creative, and psychologically supportive SE environments through the strategic use of humour.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"230 ","pages":"Article 112560"},"PeriodicalIF":3.7,"publicationDate":"2025-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144711011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yijie Ou, Chenghao Su, Lin Chen, Yanhui Li, Yuming Zhou
{"title":"Binding of C++ and JavaScript through automated glue code generation","authors":"Yijie Ou, Chenghao Su, Lin Chen, Yanhui Li, Yuming Zhou","doi":"10.1016/j.jss.2025.112565","DOIUrl":"10.1016/j.jss.2025.112565","url":null,"abstract":"<div><div>Modern software systems frequently employ multiple programming languages to harness the unique strengths of each language. One such language is JavaScript, which enjoys widespread usage, particularly in client-side web development. Nonetheless, JavaScript lacks native support for low-level operations. To address this limitation, developers often repurpose pre-existing C/C++ modules, but they need to write glue code to facilitate the adaptation of C/C++ modules to the JavaScript environment. Reusing C/C++ modules can significantly enhance software performance and usability, making it highly sought-after in practical applications.</div><div>In response to this demand, we present an innovative approach to generating C-to-JavaScript glue code, grounded in a rewrite system. Initially, we define the glue code generation problem with a formal multi-language system that captures interoperation semantics of JavaScript and C++. Subsequently, we abstract a framework for generating glue code and utilize a parameterized template-based rewrite system to implement this framework. Lastly, we implement our method through a tool named CJSBinder and conduct a systematic evaluation, analyzing its accuracy, efficiency, and practical applicability compared to a state-of-the-art tool. 
The results demonstrate that CJSBinder excels at efficiently generating accurate glue code and is compatible with a variety of real-world project contexts.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"230 ","pages":"Article 112565"},"PeriodicalIF":3.7,"publicationDate":"2025-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144685989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tuan-Dung Bui, Thanh Trong Vu, Thu-Trang Nguyen, Son Nguyen, Hieu Dinh Vo
{"title":"Correctness assessment of code generated by Large Language Models using internal representations","authors":"Tuan-Dung Bui, Thanh Trong Vu, Thu-Trang Nguyen, Son Nguyen, Hieu Dinh Vo","doi":"10.1016/j.jss.2025.112570","DOIUrl":"10.1016/j.jss.2025.112570","url":null,"abstract":"<div><div>Ensuring the correctness of code generated by Large Language Models (LLMs) presents a significant challenge in AI-driven software development. Existing methods predominantly rely on black-box (closed-box) approaches that evaluate correctness post-generation, failing to utilize the rich insights embedded in the LLMs’ internal states during code generation. This limitation leads to delayed error detection, increased debugging costs, and reduced reliability in deployed AI-assisted coding workflows. In this paper, we introduce <span>Openia</span>, a novel white-box (open-box) framework that leverages these internal representations to assess the correctness of LLM-generated code. By systematically analyzing the intermediate states of representative open-source code LLMs, including DeepSeek-Coder, <span>Code Llama</span>, and <span>Magicoder</span>, across diverse code generation benchmarks, we found that these internal representations encode latent information, which strongly correlates with the correctness of the generated code. Building on these insights, <span>Openia</span> uses a white-box/open-box approach to make informed predictions about code correctness, offering significant advantages in adaptability and robustness over traditional black-box methods and zero-shot approaches. Our results show that <span>Openia</span> consistently outperforms baseline models, achieving higher accuracy, precision, recall, and F1-Scores with up to a 2X improvement in standalone code generation and a 3X enhancement in repository-specific scenarios. 
By unlocking the potential of in-process signals, <span>Openia</span> paves the way for more proactive and efficient quality assurance mechanisms in LLM-assisted code generation.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"230 ","pages":"Article 112570"},"PeriodicalIF":3.7,"publicationDate":"2025-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144696379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GDFuzz: An efficient directed fuzzing method based on XAI","authors":"Junru Li, Zhongxu Yin, Guoxiao Zong, Haiya Sang","doi":"10.1016/j.jss.2025.112568","DOIUrl":"10.1016/j.jss.2025.112568","url":null,"abstract":"<div><div>As software systems grow increasingly complex, traditional fuzz testing struggles with inefficiencies, limited coverage, and poor adaptability to large-scale codebases with intricate input formats. These challenges are exacerbated in directed fuzz testing, where a lack of explainability often hampers effective user intervention and fine-grained input control, limiting both manageability and precision. To address these limitations, this paper introduces GDFuzz, an innovative fuzz testing framework that enhances directed testing by integrating explainable deep learning and interactive controllability. At its core, GDFuzz leverages the Local Interpretable Model-Agnostic Explanations (LIME) technique to identify and protect critical input bytes that influence the execution path to target locations. This mechanism, termed Sample Masks, guides mutation strategies toward more effective and targeted exploration. Furthermore, GDFuzz provides a real-time visual intervention interface, enabling users to dynamically adjust strategies based on neural network outputs, bridging the gap between automated testing and human expertise. Additionally, GDFuzz integrates a Dual-Queue Cyclic Simulated Annealing (DQCSA) algorithm and adaptive incremental learning to adapt to our explanation mechanism. We validate GDFuzz through comparative experiments on five real-world applications against four state-of-the-art fuzzers: AFL, AFL++, SelectFuzz, FormatFuzzer and AFLGo. Results demonstrate that GDFuzz improves effective execution rates by an average of 185% across all benchmarks, highlighting the impact of its interpretability-driven enhancements. 
Notably, GDFuzz uncovered eight previously undisclosed vulnerabilities (0-days) in widely used open-source libraries, with four assigned CVE identifiers and four pending review. These achievements underline the value of explainable and controllable testing in advancing directed fuzzing methodologies.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"230 ","pages":"Article 112568"},"PeriodicalIF":3.7,"publicationDate":"2025-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144678927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Khashayar Etemadi , Bardia Mohammadi , Zhendong Su , Martin Monperrus
{"title":"Mokav: Execution-driven differential testing with LLMs","authors":"Khashayar Etemadi , Bardia Mohammadi , Zhendong Su , Martin Monperrus","doi":"10.1016/j.jss.2025.112571","DOIUrl":"10.1016/j.jss.2025.112571","url":null,"abstract":"<div><div>It is essential to detect functional differences between programs in various software engineering tasks, such as automated program repair, mutation testing, and code refactoring. The problem of detecting functional differences between two programs can be reduced to searching for a difference-exposing test (DET): a test input that results in different outputs on the subject programs. In this paper, we propose <span>Mokav</span>, a novel execution-driven tool that leverages LLMs to generate DETs. <span>Mokav</span> takes two versions of a program (P and Q) and an example test input. When successful, <span>Mokav</span> generates a valid DET, a test input that leads to provably different outputs on P and Q. <span>Mokav</span> iteratively prompts an LLM with a specialized prompt to generate new test inputs. At each iteration, <span>Mokav</span> provides execution-based feedback from previously generated tests until the LLM produces a DET. We evaluate <span>Mokav</span> on 1535 pairs of Python programs collected from the Codeforces competition platform and 32 pairs of programs from the QuixBugs dataset. Our experiments show that <span>Mokav</span> outperforms the state-of-the-art, Pynguin and Differential Prompting, by a large margin. <span>Mokav</span> can generate DETs for 81.7% (1255/1535) of the program pairs in our benchmark (versus 4.9% for Pynguin and 37.3% for Differential Prompting). 
We demonstrate that the iterative and execution-driven feedback components of the system contribute to its high effectiveness.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"230 ","pages":"Article 112571"},"PeriodicalIF":3.7,"publicationDate":"2025-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144672649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Incentive mechanism for mobile crowd sensing with Assumed Bid Cost Reverse Auction","authors":"Jowa Yangchin, Ningrinla Marchang","doi":"10.1016/j.jss.2025.112572","DOIUrl":"10.1016/j.jss.2025.112572","url":null,"abstract":"<div><div>Mobile Crowd Sensing (MCS) is a mechanism in which people contribute to the data collection process using their own mobile devices, which have sensing capabilities. Incentives are rewards that individuals get in exchange for data they submit. Reverse Auction Bidding (RAB) is a framework that allows users to place bids for selling the data they collected. Task providers can select users to buy data from by looking at bids. Using the RAB framework, an MCS system can be optimized for better user utility, task provider utility and platform utility. In this paper, we propose a novel approach called Reverse Auction with Assumed Bid Cost (RA-ABC) which allows users to place a bid in the system before collecting data. We argue that performing the tasks only after winning, rather than before bidding, helps reduce resource consumption. The Return on Investment (ROI) of each user is calculated, and based on it users decide whether to participate further, either increasing or decreasing their bids. We also propose an extension of RA-ABC with dynamic recruitment (RA-ABCDR) in which we allow new users to join the system at any time during bidding rounds. Simulation results demonstrate that RA-ABC and RA-ABCDR outperform the widely used Tullock Optimal Prize Function, with RA-ABCDR achieving up to 54.6% higher user retention and reducing auction cost by 22.2%, thereby ensuring more efficient and sustainable system performance. 
Extensive simulations confirm that dynamic user recruitment significantly enhances performance across stability, fairness, and cost-efficiency metrics.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"230 ","pages":"Article 112572"},"PeriodicalIF":3.7,"publicationDate":"2025-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144672650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}