{"title":"Representing and Reasoning with Non-Functional Requirements: A Retrospective","authors":"John Mylopoulos, Lawrence Chung","doi":"10.1109/tse.2025.3533418","DOIUrl":"https://doi.org/10.1109/tse.2025.3533418","url":null,"abstract":"","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"34 1","pages":""},"PeriodicalIF":7.4,"publicationDate":"2025-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143030783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Impact of “Evaluating Software Complexity Measures”","authors":"Elaine J. Weyuker","doi":"10.1109/tse.2025.3533467","DOIUrl":"https://doi.org/10.1109/tse.2025.3533467","url":null,"abstract":"","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"109 1","pages":""},"PeriodicalIF":7.4,"publicationDate":"2025-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143030696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Impact of “An applicable family of data flow testing criteria”","authors":"Elaine J. Weyuker","doi":"10.1109/tse.2025.3533393","DOIUrl":"https://doi.org/10.1109/tse.2025.3533393","url":null,"abstract":"","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"61 1","pages":""},"PeriodicalIF":7.4,"publicationDate":"2025-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143030784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comprehensive Fine-Tuning Large Language Models of Code for Automated Program Repair","authors":"Kai Huang;Jian Zhang;Xinlei Bao;Xu Wang;Yang Liu","doi":"10.1109/TSE.2025.3532759","DOIUrl":"10.1109/TSE.2025.3532759","url":null,"abstract":"Automated program repair (APR) research has entered the era of large language models (LLM), and researchers have conducted several empirical studies to explore the repair capabilities of LLMs for APR. Many studies adopt the zero/few-shot learning paradigm for APR, which directly use LLMs to generate the possibly correct code given its surrounding context. Though effective, the repair capabilities of LLMs based on the fine-tuning paradigm have yet to be extensively explored. Also, it remains unknown whether LLMs have the potential to repair more complicated bugs (e.g., multi-hunk bugs). To fill the gap, in the conference version of this work, we conduct an initial study on the program repair capability of million-level LLMs in the fine-tuning paradigm. We select 5 popular million-level LLMs with representative pre-training architectures, including CodeBERT, GraphCodeBERT, PLBART, CodeT5, and UniXcoder. We consider 3 typical program repair scenarios (i.e., bugs, vulnerabilities, and errors) involving 3 programming languages (i.e., Java, C/C++, and JavaScript). Our experimental results show that fine-tuning these LLMs can significantly outperform previous state-of-the-art APR tools. However, the repair capabilities of billion-level LLMs for APR remain largely unexplored. Moreover, their substantial model sizes significantly increase the computational cost of fine-tuning. While parameter-efficient fine-tuning (PEFT) techniques offer a promising solution, their effectiveness in repair tasks and the selection of appropriate PEFT strategies remain unclear. Similarly, many novel APR strategies have been developed for non-pre-trained models, yet their applicability and effectiveness on LLMs are still unexamined. To address these gaps, we extend our prior study through three key dimensions: 1) LLM4APR, which evaluates the repair capabilities of five billion-level LLM families (InCoder, CodeGeeX, CodeGen, StarCoder, and CodeLlama) under the fine-tuning paradigm; 2) PEFT4LLM, which compares full-parameter fine-tuning (FPFT) with three PEFT techniques (LoRA, AdaLoRA, and IA3) to determine optimal strategies that balance repair cost and performance of LLMs; and 3) APR4LLM, which investigates the potential of a basic neural machine translation (NMT) approach alongside three advanced repair strategies (TENURE, ITER, and KATANA) to enhance the repair capabilities of LLMs. Overall, our extensive results suggest that larger scale models typically have better repair capabilities. The LoRA technique is still the best choice for LLM4APR studies. Different repair strategies result in different repair capabilities for the foundation models, but some of the strategies that performed well on the non-pre-trained model did not show an advantage on LLMs. Besides, we released all LLMs fine-tuned with repair tasks to facilitate LLM4APR research, and we encourage researchers to develop more powerful APR tools on the basis of these repair LLMs.","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"51 4","pages":"904-928"},"PeriodicalIF":6.5,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143026439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Analysis of Safety Critical Software Systems","authors":"Gerard J. Holzmann","doi":"10.1109/tse.2025.3532391","DOIUrl":"https://doi.org/10.1109/tse.2025.3532391","url":null,"abstract":"","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"45 1","pages":""},"PeriodicalIF":7.4,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143020698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding the Effect of Agile Practice Quality on Software Product Quality","authors":"Sherlock A. Licorish","doi":"10.1109/TSE.2025.3532502","DOIUrl":"10.1109/TSE.2025.3532502","url":null,"abstract":"Agile methods and associated practices have been held to deliver value to software developers and customers. Research studies have reported team productivity and software quality benefits. While such insights are helpful for understanding how agile methods add value during software development, there is need for understanding the intersection of useful practices and outcomes over project duration. This study addresses this opportunity and conducted an observation study of student projects that was complemented by the analysis of demographics data and open responses about the challenges encountered during the use of agile practices. Data from 22 student teams comprising 85 responses were analyzed using quantitative and qualitative approaches, where among our findings we observed that the use of good coding practices and quality management techniques were positively correlated with all dimensions of product quality (e.g., functionality scope and software packaging). Outcomes also reveal that software product quality was predicted by requirements scoping, team planning and communication, and coding practice. However, high levels of team planning and communication were not necessary for all software development activities. When examining project challenges, it was observed that lack of technical skills and poor time management present most challenges to project success. While these challenges may be mitigated by agile practices, such practices may themselves create unease, requiring balance during project implementation.","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"51 2","pages":"650-662"},"PeriodicalIF":6.5,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143020699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Christopher Molloy;Jeremy Banks;Steven H. H. Ding;Furkan Alaca;Philippe Charland;Andrew Walenstein
{"title":"Mecha: A Neural-Symbolic Open-Set Homogeneous Decision Fusion Approach for Zero-Day Malware Similarity Detection","authors":"Christopher Molloy;Jeremy Banks;Steven H. H. Ding;Furkan Alaca;Philippe Charland;Andrew Walenstein","doi":"10.1109/TSE.2025.3531210","DOIUrl":"10.1109/TSE.2025.3531210","url":null,"abstract":"With increasing numbers of novel malware each year, tools are required for efficient and accurate variant matching under the same family, for the purpose of effective proactive threat detection, retro-hunting, and attack campaign tracking. All of the state-of-the-art Deep Learning (DL) approaches assume that the incoming samples originate from known families and incorrectly identify novel families. Additionally, most of the existing solutions that leverage the Siamese Neural Network architecture either rely on pair-wise comparisons or computationally expensive preprocessing steps that are not scalable to a real-world malware triage volume requirement. We propose a different route, Mecha, a Neural-Symbolic Machine Learning (ML) system for malware variant matching and zero-day family detection. Mecha is comprised of an embedding network trained in two different scenarios for byte string embedding and an open-set approximate nearest neighbour algorithm for variant matching and zero-day detection. Our embedding network uses triplet loss for embedding generation and reinforcement-based Expectation Maximization (EM) learning for full deployment optimization. We conduct multiple in-sample and out-of-sample experiments to demonstrate the model's generalizability toward novel variants and families. We also show that Mecha can detect samples outside the known set of malware samples with an accuracy greater than 0.990.","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"51 2","pages":"621-637"},"PeriodicalIF":6.5,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10847580","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142991509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Completeness and Consistency of Tabular Requirements: An SMT-Based Verification Approach","authors":"Claudio Menghi;Eugene Balai;Darren Valovcin;Christoph Sticksel;Akshay Rajhans","doi":"10.1109/TSE.2025.3530820","DOIUrl":"10.1109/TSE.2025.3530820","url":null,"abstract":"Tabular requirements assist with the specification of software requirements using an “if-then” paradigm and are supported by many tools. For example, the Requirements Table block in Simulink<sup>®</sup> supports writing executable specifications that can be used as test oracles to validate an implementation. But even before the development of an implementation, automatic checking of consistency and completeness of a Requirements Table can reveal errors in the specification. Fixing such errors earlier than in later development cycles avoids costly rework and additional testing efforts that would be required otherwise. As of version R2022a, Simulink<sup>®</sup> supports checking completeness and consistency of Requirements Tables when the requirements are stateless, that is, do not constrain behaviors over time. We overcome this limitation by considering Requirements Tables with both stateless and stateful requirements. This paper (i) formally defines the syntax and semantics of Requirements Tables, and their completeness and consistency, (ii) proposes eight encodings from two categories (namely, bounded and unbounded) that support stateful requirements, and (iii) implements <small>Theano</small>, a solution supporting checking completeness and consistency using these encodings. We empirically assess the effectiveness and efficiency of our encodings in checking completeness and consistency by considering a benchmark of <inline-formula><tex-math>$160$</tex-math></inline-formula> Requirements Tables for a timeout of two hours. Our results show that <small>Theano</small> can check the completeness of all the Requirements Tables in our benchmark, it can detect the inconsistency of the Requirements Tables, but it can not confirm their consistency within the timeout. We also assessed the usefulness of <small>Theano</small> in checking the consistency and completeness of 14 versions of a Requirements Table for a practical example from the automotive domain. Across these 14 versions, <small>Theano</small> could effectively detect two inconsistent and five incomplete Requirements Tables reporting a problem (inconsistency or incompleteness) for <inline-formula><tex-math>$50%$</tex-math></inline-formula> (7 out of 14) versions of the Requirements Table.","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"51 2","pages":"595-620"},"PeriodicalIF":6.5,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10844918","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142989089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}