{"title":"BET-BiLSTM Model: A Robust Solution for Automated Requirements Classification","authors":"Jalil Abbas, Cheng Zhang, Bin Luo","doi":"10.1002/smr.70012","DOIUrl":"https://doi.org/10.1002/smr.70012","url":null,"abstract":"<div><p>Transformer methods have revolutionized software requirements classification by applying advanced natural language processing to accurately understand and categorize requirements. While traditional methods like Doc2Vec and TF-IDF are useful, they often fail to capture the deep contextual relationships and subtle meanings inherent in textual data. Each transformer model has its own strengths and weaknesses, which affect its ability to capture various aspects of the data. Consequently, relying on a single model can lead to suboptimal feature representations, limiting the overall performance of the classification task. To address this challenge, our study introduces the BET-BiLSTM (balanced ensemble transformers using Bi-LSTM) model. This model combines the strengths of five transformer-based models (BERT, RoBERTa, XLNet, GPT-2, and T5) through a weighted-averaging ensemble, resulting in a sophisticated and resilient feature set. By employing data balancing techniques, we ensure a well-distributed representation of features, addressing the issue of class imbalance. The BET-BiLSTM model performs the final classification, achieving an accuracy of 96%. Moreover, the practical applicability of this model is validated through its successful implementation on three publicly available unlabeled datasets and one additional labeled dataset. The model significantly improved the completeness and reliability of these datasets by accurately predicting labels for previously unclassified requirements. This makes our approach a powerful tool for large-scale requirements analysis and classification tasks, outperforming traditional single-model methods and showcasing its real-world effectiveness.</p></div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143554528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Why Not Fix This Bug? Characterizing and Identifying Bug-Tagged Issues That Are Truly Fixed by Developers","authors":"Ye Wang, Zhengru Han, Qiao Huang, Bo Jiang","doi":"10.1002/smr.70008","DOIUrl":"https://doi.org/10.1002/smr.70008","url":null,"abstract":"<div><p>The GitHub issue community serves as the primary means for project developers to obtain information about program bugs, and numerous GitHub users post issues based on encountered project vulnerabilities or error messages. However, these issues often vary in quality, leading to a significant time burden on project developers. By collecting 2500 bug-related issues from five GitHub projects, we first manually analyze a large volume of issue information to formulate rules for identifying whether a bug-tagged issue is truly fixed by project developers. We find that a substantial number (ranging from 29% to 68.4% across projects) of bug-tagged issues are not truly fixed by project developers. We empirically investigate the characteristics of such issues and summarize the reasons why they are not fixed. Then, we propose an automated approach called DFBERT to identify the bug-tagged issues that are more likely to be fixed by project developers. Our approach incorporates both text and non-text features to train a neural network-based prediction model. The experimental results show that our approach achieves an average F1-score of 0.66 in the inter-project setting, and the F1-score increases to 0.77 when part of the testing data is added for training.</p></div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enabling Quantum Privacy and Security by Design: Imperatives for Contemporary State-of-the-Art in Quantum Software Engineering","authors":"Vita Santa Barletta, Danilo Caivano, Anibrata Pal, Michele Scalera, Manuel A. Serrano Martin","doi":"10.1002/smr.70005","DOIUrl":"https://doi.org/10.1002/smr.70005","url":null,"abstract":"<p>With the advent of Quantum Computing and the rapid growth of research on it over the past couple of decades, we are looking at a Golden Era of Quantum Computing. We are transitioning into an age of Hybrid Classical-Quantum Computers, where quantum computational resources are selectively harnessed for resource-intensive tasks. On the one hand, Quantum Computing promises immense future computational innovation; on the other hand, it comes with privacy and security challenges. To date, Privacy by Design (PbD) and Security by Design (SbD) frameworks and guidelines in the Quantum Software Engineering (QSE) domain are still nebulous, and there are no comprehensive studies on the topic. In this study, therefore, we identify the current state-of-the-art in the relevant literature and investigate the principles of PbD and SbD in the domain of QSE. This is the first study to identify state-of-the-art Quantum PbD and Quantum SbD in QSE. Furthermore, we identify the gaps in the current literature and extend them into action points toward a robust literature on Quantum PbD and SbD. We recognize the crucial role of researchers, academics, and professionals in the field of Quantum Computing and Software Engineering in conducting more empirical studies and shaping the future of PbD and SbD principles in QSE.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/smr.70005","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143475660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"cvrip: A Visual GUI Ripping Framework","authors":"Heji Huang, Ju Qian, Wenduo Jia, Yiming Jin","doi":"10.1002/smr.70009","DOIUrl":"https://doi.org/10.1002/smr.70009","url":null,"abstract":"<div><p>GUI ripping explores the graphical user interface of an application to build a model that expresses the application's behavior. The ripped GUI model is useful in various software engineering tasks. Traditional GUI ripping techniques depend on the underlying GUI frameworks to provide the GUI structure information. They are difficult to apply across platforms or to nonnative applications where the GUI structure information cannot easily be obtained. This work introduces cvrip, a visual GUI ripping framework, to address the problem. cvrip visually analyzes the GUI screen for ripping and does not rely on the underlying GUI frameworks. We introduce several new techniques to enable efficient visual GUI ripping, for example, a YOLO v5-based model to detect executable widgets, a state recognition acceleration method for fast model updating, and several GUI exploration strategies that take the characteristics of imperfect visual analysis into account. Experiments are conducted to evaluate the technique choices in visual GUI ripping and to compare the solution with traditional ripping. The results show that cvrip achieves exploration coverage competitive with traditional approaches. This suggests visual GUI ripping is a direction worthy of further study.</p></div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143475661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Flexible Framework to Ensure Traceability, Consistency, and Propagation of KPIs Evolution","authors":"Eladio Domínguez, Beatriz Pérez, Ángel L. Rubio, María A. Zapata","doi":"10.1002/smr.70004","DOIUrl":"https://doi.org/10.1002/smr.70004","url":null,"abstract":"<div><p>Organizations use key performance indicators (KPIs) to assess the effectiveness and efficiency of their procedures and processes. In a world that is constantly evolving and hyperconnected via the internet, it is of great interest to analyze how changes (organizational, legal, technological or other) can lead to modifications in the KPIs involved. However, little attention has been paid to KPI evolution either in the scientific literature or in developed solutions. This paper presents <i>A Flexible Framework for the Evolution, Consistency and Traceability of KPIs</i> (AFFECTK) that aims at establishing the basis for suitable KPIs' evolution management. The feasibility of this proposal is demonstrated through a proof-of-concept developed using a reasoning tool based on Constraint Logic Programming. The framework is further evaluated, using real KPI case studies, to assess the functional suitability of our approach.</p></div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143438830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multilanguage Detection of Design Pattern Instances","authors":"Hugo Andrade, João Bispo, Filipe F. Correia","doi":"10.1002/smr.2738","DOIUrl":"https://doi.org/10.1002/smr.2738","url":null,"abstract":"<div><p>Code comprehension is often supported by source code analysis tools that provide more abstract views over software systems, such as those detecting design patterns. These tools analyze source code and then extract the relevant information. However, the analysis of the source code is often specific to the target programming language. We propose DP-LARA, a multilanguage pattern detection tool that uses the multilanguage capability of the LARA framework to support finding pattern instances in a code base. LARA provides a virtual AST, which is common to multiple OOP programming languages, and DP-LARA then analyzes this abstract representation to detect pattern instances. We evaluate the detection performance and consistency of DP-LARA with a few software projects. Results show that a multilanguage approach does not compromise detection performance, and DP-LARA is consistent across the languages we tested it on (i.e., Java and C/C++). Moreover, by providing a virtual AST as the abstract representation, we believe we have decreased the effort of extending the tool to new programming languages and maintaining existing ones.</p></div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143438831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Why and How We Combine Multiple Deep Learning Models With Functional Overlaps","authors":"Mingliang Ma, Yanhui Li, Yingxin Chen, Lin Chen, Yuming Zhou","doi":"10.1002/smr.70003","DOIUrl":"https://doi.org/10.1002/smr.70003","url":null,"abstract":"<div><p>The evolution (e.g., development and maintenance) of deep learning (DL) models has attracted much attention. One of the main challenges during the development and maintenance of DL models is model training, which often requires a lot of human resources and computing power (such as labeling costs and parameter training). In recent years, to alleviate this problem, researchers have introduced ideas from software engineering (SE) into DL. Researchers consider the DL model a new type of software and borrow the practice of traditional software reuse, that is, they focus on reusing DL models to improve the quality of DL model development and maintenance. This paper focuses on more complex model reuse scenarios, where developers need to combine multiple models with functional overlaps. We explore whether the model combination technique can meet the requirements for such scenarios. We have conducted an empirical study of the research scenario and found that a model composition approach was needed to meet the requirements. Furthermore, we propose a concatenation-parallel model combination method called MCCP. First, the multiple models' hidden layer features are concatenated, and then the multiple models are connected in parallel to construct a joint model with all output categories. The joint model is trained to achieve unified requirements under a limited labeling cost. Through experiments on data sets in nine domains and five model structures, the following two conclusions are drawn: (1) we observe noticeable differences (38% at most) in the performance of multiple models within overlapping category data, which calls for effective model combination techniques. (2) MCCP is more effective than the baseline, performing the best in eight of the nine domains. Our research shows that the joint model generated by combining models with overlapping functions can meet the requirements of complex model reuse scenarios.</p></div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143424164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Approach to Develop Correct-by-Construction Business Process Models Using a Formal Domain Specific Language","authors":"Yousra Bendaly Hlaoui, Salma Ayari","doi":"10.1002/smr.2762","DOIUrl":"https://doi.org/10.1002/smr.2762","url":null,"abstract":"<div><p>As the size and complexity of business process models are important drivers of error probability, it is recommended to split large models into smaller ones. Hence, we propose, in this paper, to develop business process models by refinement. A refinement is a transformation of a source model into a target model expressed in the same modeling language. This transformation should preserve the semantics of the source model to provide a semantically correct target model. Thus, we propose a domain-specific language based on the Business Process Model and Notation (BPMN) language for developing correct-by-construction business process models by refinement. Specifically, we propose (i) a $BPMN_R$ formal syntax through a context-free grammar $G_{BPMN_R}$, (ii) axiomatic semantics to ensure refinement correctness when building business process models, and (iii) operational semantics in terms of a Kripke structure, permitting formal verification of the provided $BPMN_R$ models to check their reliability. The Kripke structure supports the verification of behavioral requirements expressed in Computation Tree Logic (CTL) and verified by the NuSMV model checker. Based on these semantics, we prove the validity ","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143389002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Indispensable Role of Software Ecosystem Services","authors":"Casper van Schothorst, Slinger Jansen, Liza Lausberg","doi":"10.1002/smr.70002","DOIUrl":"https://doi.org/10.1002/smr.70002","url":null,"abstract":"<p>Software ecosystem services are essential for the sustainability and functionality of software ecosystems, but they lack comprehensive categorization, hindering further study. This study explores the concept of software ecosystem services through a systematic literature review and brief survey. Drawing analogies from natural ecosystems, we define software ecosystem services as the conditions and processes through which software ecosystems create, provide, and sustain innovation and value creation via software. Software ecosystem services are categorized into four primary types: provisioning, regulating, cultural, and supporting services.</p><p>Our findings highlight the crucial role of services that do not directly add customer value but are essential for the software ecosystem's functionality, such as authentication and authorization services, collaboration and communication platforms, and app stores. By highlighting these vital yet often overlooked services, the research identifies potential sustainability threats for software ecosystems, such as the dominance of a few major players, which mirrors the risks of monocultures in natural ecosystems. 
This study lays the groundwork for further research aimed at ensuring the long-term sustainability and resilience of software ecosystems.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/smr.70002","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143362482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Influencing Factors' Analysis for the Performance of Parallel Evolutionary Test Case Generation for Web Applications","authors":"Weiwei Wang, Shukai Zhang, Kepeng Qiu, Xuejun Liu, Xiaodan Li, Ruilian Zhao","doi":"10.1002/smr.2751","DOIUrl":"https://doi.org/10.1002/smr.2751","url":null,"abstract":"<div><p>Evolutionary test case generation plays a vital role in ensuring software quality and reliability. Since Web applications involve a large number of interactions between client and server, dynamic evolutionary test case generation is very time-consuming, which makes it difficult to apply in actual projects. Parallelization provides a feasible way to improve the efficiency and effectiveness of evolutionary test generation. In our previous research, the idea of parallelism was introduced into evolutionary test generation for Web applications. However, its performance is affected by many factors, such as migration scale, migration frequency, and the number of browser processes and subpopulations. Analyzing these influencing factors can guide performance improvements in evolutionary test generation. For this reason, this paper analyzes the factors that influence parallel evolutionary algorithms and how they affect the performance of test generation for Web applications. At the same time, different parallel evolutionary test generation methods are designed and implemented. Experiments are conducted on open-source Web applications to generate test cases that meet the server-side sensitive path coverage criterion, providing guidance and suggestions for the parameter setting of parallel evolutionary test case generation for Web applications. The experimental results show that (1) compared with the global parallelization model, the evolutionary algorithm based on the parallel island model yields a greater improvement in test case generation performance. In more detail, when generating test cases with the same server-side sensitive path coverage, the number of iterations required is reduced by 49.6%, and the time cost is reduced by 58.7%; (2) for test case generation based on the parallel island model, if the migration scale is large, appropriately increasing the migration frequency can reduce its time cost; (3) if the number of subpopulations is fixed, appropriately increasing the number of browser processes can reduce the time cost of Web application test case evolution, but the number of browser processes should not be too large; otherwise, it may increase the time cost.</p></div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143248387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}