{"title":"Semantic similarity loss for neural source code summarization","authors":"Chia-Yi Su, Collin McMillan","doi":"10.1002/smr.2706","DOIUrl":"10.1002/smr.2706","url":null,"abstract":"<p>This paper presents a procedure for and evaluation of using a semantic similarity metric as a loss function for neural source code summarization. Code summarization is the task of writing natural language descriptions of source code. Neural code summarization refers to automated techniques for generating these descriptions using neural networks. Almost all current approaches involve neural networks as either standalone models or as part of a pretrained large language models, for example, GPT, Codex, and LLaMA. Yet almost all also use a categorical cross-entropy (CCE) loss function for network optimization. Two problems with CCE are that (1) it computes loss over each word prediction one-at-a-time, rather than evaluating a whole sentence, and (2) it requires a perfect prediction, leaving no room for partial credit for synonyms. In this paper, we extend our previous work on semantic similarity metrics to show a procedure for using semantic similarity as a loss function to alleviate this problem, and we evaluate this procedure in several settings in both metrics-driven and human studies. In essence, we propose to use a semantic similarity metric to calculate loss over the whole output sentence prediction per training batch, rather than just loss for each word. We also propose to combine our loss with CCE for each word, which streamlines the training process compared to baselines. We evaluate our approach over several baselines and report improvement in the vast majority of conditions.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"36 11","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141570375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Object Constraint Language based test case optimization with modified Average Percentage of Fault Detection metric","authors":"Kunxiang Jin, Kevin Lano","doi":"10.1002/smr.2708","DOIUrl":"10.1002/smr.2708","url":null,"abstract":"<p>Testing is one of the most time-consuming and unpredictable processes within the software development life cycle. As a result, many test case optimization (TCO) techniques have been proposed to make this process more scalable. Object Constraint Language (OCL) was initially introduced as a constraint language to provide additional details to Unified Modeling Language models. However, as OCL continues to evolve, an increasing number of systems are being expressed by this language. Despite this growth, a noticeable research gap exists for the testing of systems whose specifications are expressed in OCL. In our previous work, we verified the effectiveness and efficiency of performing the test case prioritization (TCP) process for these systems. In this study, we extend our previous work by integrating the test case minimization (TCM) process to determine whether TCM can also benefit the testing process under the context of OCL. The evaluation of TCO approaches often relies on well-established metrics such as the average percentage of fault detection (APFD). However, the suitability of APFD for model-based testing (MBT) is not ideal. This paper addresses this limitation by proposing a modification to the APFD metric to enhance its viability for MBT scenarios. We conducted four case studies to evaluate the feasibility of integrating the TCM and TCP processes into our proposed approach. In these studies, we applied the multi-objective optimization algorithm NSGA-II and the genetic algorithm independently to the TCM and TCP processes. The objective was to assess the effectiveness and efficiency of combining TCM and TCP in enhancing the testing phase. Through experimental analysis, the results highlight the benefits of integrating TCM and TCP in the context of OCL-based testing, providing valuable insights for practitioners and researchers aiming to optimize their testing efforts. Specifically, the main contributions of this work include the following: (1) we introduce the integration of the TCM process into the TCO process for systems expressed by OCL. This integration benefits the testing process further by reducing redundant test cases while ensuring sufficient coverage. (2) We comprehensively analyze the limitations associated with the commonly used metric, APFD, and then, a modified version of the APFD metric has been proposed to overcome these weaknesses. (3). We systematically evaluate the effectiveness and efficiency of OCL-based TCO processes on four real-world case studies with different complexities.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"36 11","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/smr.2708","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141549430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A rule-based method to effectively adopt robotic process automation","authors":"Maxime Bédard, Abderrahmane Leshob, Imen Benzarti, Hafedh Mili, Raqeebir Rab, Omar Hussain","doi":"10.1002/smr.2709","DOIUrl":"10.1002/smr.2709","url":null,"abstract":"<p>Robotic Process Automation (RPA) is an emerging software technology for automating business processes. RPA uses software robots to perform repetitive and error-prone tasks previously done by human actors quickly and accurately. These robots mimic humans by interacting with existing software applications through user interfaces (UI). The goal of RPA is to relieve employees from repetitive and tedious tasks to increase productivity and to provide better service quality. Yet, despite all the RPA benefits, most organizations fail to adopt RPA. One of the main reasons for the lack of adoption is that organizations are unable to effectively identify the processes that are suitable for RPA. This paper proposes a new method, called Rule-based robotic process analysis (RRPA), that assists process automation practitioners to classify business processes according to their suitability for RPA. The RRPA method computes a suitability score for RPA using a combination of two RPA goals: (i) the RPA feasibility, which assesses the extent to which the process or the activity lends itself to automation with RPA and (ii) the RPA relevance, which assesses whether the RPA automation is worthwhile. We tested the RRPA method on a set of 13 processes. The results showed that the method is effective at 82.05% and efficient at 76.19%.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"36 11","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/smr.2709","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141549431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Promoting social and human factors through a gamified automotive software development environment","authors":"Gloria Piedad Gasca-Hurtado, Mirna Muñoz, Samer Sameh","doi":"10.1002/smr.2704","DOIUrl":"10.1002/smr.2704","url":null,"abstract":"<p>Gamification is an attractive strategy for different contexts, including software process improvement, where it presents positive results associated with increased factors such as motivation and others classified into social and human factors. Such factors are required to improve software processes in the automotive industry due to the influence of changes in the conditions and the behavior of individuals. However, the treatment of gamification strategies requires rigor at a scientific level. Therefore, it is necessary to analyze critical dimensions such as the gamification maturity level, the ability to intervene, and the influence of social and human factors. Such analysis is motivated by the relationship between social and human factors and the success of a process improvement. The above justifies the researchers' interest in this article in analyzing a gamification strategy implemented in the automotive industry from such dimensions. Therefore, this article presents the analysis from the point of view of developing software-controlled systems in automobiles. Besides, it uses a deductive approach to conduct this analysis to abstract all the design aspects of a strategy created and implemented in a software development automotive environment. One of the most representative findings of this study is the strategy's capacity to promote SHF, which identifies motivation, commitment, team cohesion, emotional intelligence, and autonomy.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"36 11","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141549370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the value of instance selection for bug resolution prediction performance","authors":"Chaymae Miloudi, Laila Cheikhi, Ali Idri, Alain Abran","doi":"10.1002/smr.2710","DOIUrl":"10.1002/smr.2710","url":null,"abstract":"<p>Software maintenance is a challenging and laborious software management activity, especially for open-source software. The bugs reports of such software allow tracking maintenance activities and were used in several empirical studies to better predict the bug resolution effort. These reports are known for their large size and contain nonrelevant instances that need to be preprocessed to be suitable for use. To this end, instance selection (IS) has been proposed in the literature as a way to reduce the size of the datasets, while keeping the relevant instances. The objective of this study is to perform an empirical study that investigates the impact of data preprocessing through IS on the performance of bug resolution prediction classifiers. To deal with this, four IS algorithms, namely, edited nearest neighbor (ENN), repeated ENN, all-k nearest neighbors, and model class selection, are applied on five large datasets, together with five machine learning techniques. Overall, 125 experiments were performed and compared. The findings of this study highlight the positive impact of IS in providing better estimates for bug resolution prediction classifiers, in particular using repeated ENN and ENN algorithms.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"36 11","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141552701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Trustworthy and collaborative traceability management: Experts’ feedback on a blockchain-enabled framework","authors":"Selina Demi, Mary Sánchez-Gordón, Monica Kristiansen, Xabier Larrucea","doi":"10.1002/smr.2707","DOIUrl":"10.1002/smr.2707","url":null,"abstract":"<p>Blockchain technology has attracted significant attention in both academia and industry. Recently, the application of blockchain has been advocated in software engineering. The global software engineering paradigm exacerbates trust issues, as distributed and cross-organizational teams need to share software artifacts. In such a context, there is a need for a decentralized yet reliable traceability knowledge base to keep track of what/how/when/by whom software artifacts were created or changed. This study presents a blockchain-enabled framework for trustworthy and collaborative traceability management and identifies benefits, challenges, and potential improvements based on the feedback of software engineering experts. A qualitative approach was followed in this study through semistructured interviews with software engineering (SE) experts. Transcripts were analyzed by applying the content analysis technique. The results indicated the emergence of five categories, further grouped into three main categories: experts' perceptions, blockchain-based software process improvement, and experts' recommendations. In addition, the findings suggested four archetypes of organizations that may be interested in blockchain technology: distributed organizations, organizations with contract-based projects, organizations in regulated domains, and regulators who may push the use of this technology. Further efforts should be devoted to the integration of the proposal with tools used throughout the software development lifecycle and leveraging the potential of smart contracts in validating the implementation of requirements automatically.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"36 11","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/smr.2707","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141514087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Software maintenance practices using agile methods towards cloud environment: A systematic mapping","authors":"Mohammed Almashhadani, Alok Mishra, Ali Yazici","doi":"10.1002/smr.2698","DOIUrl":"10.1002/smr.2698","url":null,"abstract":"<p>Agile methods have emerged to overcome the obstacles of structured methodologies, such as the waterfall, prototype, spiral, and so on. There are studies showing the usefulness of agile approaches in software development. However, studies on Agile maintenance are very limited in number. Regardless of the chosen methodology, software maintenance can be carried out in either a local (on-the-premise) or global (distributed) environment. In a local environment, the software maintenance team is co-located on the same premises, while in a global environment, the team is geographically dispersed from the customer. The main objective of this Systematic Mapping (SM) study is to identify the practices useful for software maintenance using the Agile approaches in the Cloud environment. We have conducted a comprehensive search in well-known digital databases and examined the articles that map to the pre-defined inclusion criteria. The study selected and analyzed 48 articles out of 320 published between 2000 and 2022. The findings of the mapping study reveal that Agile can resolve the major issues faced in traditional software maintenance, making the role of this approach significant in global/distributed software maintenance. Cloud computing plays a vital role in software maintenance. Most of the studies highlight the application of XP- and Scrum-based Agile maintenance models. The study found a need for more Agile maintenance solutions in the cloud, highlighting the importance of agile in software maintenance, both locally and globally. Irrespective of the environment, Cloud computing provides a centralized platform for collaboration and communication, while also offering scalability and flexibility to adapt to diverse infrastructure needs. This allows agile maintenance practices to be implemented across both local and global environments, leveraging the cloud's capabilities to overcome geographical and infrastructural challenges.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"36 11","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/smr.2698","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141514086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a security-optimized approach for the microservice-oriented decomposition","authors":"Xiaodong Liu, Zhikun Chen, Yu Qian, Chenxing Zhong, Huang Huang, Shanshan Li, Dong Shao","doi":"10.1002/smr.2670","DOIUrl":"10.1002/smr.2670","url":null,"abstract":"<p>Microservice architecture (MSA) is a mainstream architectural style due to its high maintainability and scalability. In practice, an appropriate microservice-oriented decomposition is the foundation to make a system enjoy the benefits of MSA. In terms of decomposing monolithic systems into microservices, researchers have been exploring many optimization objectives, of which modularity is a predominantly focused quality attribute. Security is also a critical quality attribute, that measures the extent to which a system protects data from malicious access or use by attackers. Considering security in microservices-oriented decomposition can help avoid the risk of leaking critical data and other unexpected software security issues. However, few researchers consider the security objective during microservice-oriented decomposition, because the measurement of security and the trade-off with other objectives are challenging in reality. To bridge this research gap, we propose a security-optimized approach for microservice-oriented decomposition (So4MoD). In this approach, we adapt five metrics from previous studies for the measurement of the data security of candidate microservices. A multi-objective optimization algorithm based on NSGA-II is designed to search for microservices with optimized security and modularity. To validate the effectiveness of the proposed So4MoD, we perform several experiments on eight open-source projects and compare the decomposition results to other three state-of-the-art approaches, that is, FoSCI, CO-GCN, and MSExtractor. The experiment results show that our approach can achieve at least an 11.5% improvement in terms of security metrics. Moreover, the decomposition results of So4MoD outperform other approaches in four modularity metrics, demonstrating that So4MoD can optimize data security while pursuing a well-modularized MSA.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"36 10","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Crowdsourced bug report severity prediction based on text and image understanding via heterogeneous graph convolutional networks","authors":"Yifan Wu, Chendong Lin, An Liu, Lei Zhao, Xiaofang Zhang","doi":"10.1002/smr.2705","DOIUrl":"10.1002/smr.2705","url":null,"abstract":"<p>In the process of crowdsourced testing, massive bug reports are submitted. Among them, the severity level of the bug report is an important indicator for traigers of crowdsourced platforms to arrange the order of reports efficiently so that developers can prioritize high-severity defects. A lot of work has been devoted to the study of automatically assigning severity levels to a large number of bug reports in crowdsourcing test systems. The research objects of these works are standard bug reports, focusing on the text part of the report, using various feature engineering methods and classification techniques. However, while achieving good performance, these methods still need to overcome two challenges: no consideration of image information in mobile testing and discontinuous semantic information of words in bug reports. In this paper, we propose a new method of severity prediction by using heterogeneous graph convolutional networks with screenshots (SPHGCN-S), which combines text features and screenshots information to understand the report more comprehensively. In addition, our approach applies the heterogeneous graph convolutional network (HGCN) architecture, which can capture the global word information to alleviate the semantic problem of word discontinuity and underlying relations between reports. We conduct a comprehensive study to compare seven commonly adopted bug report severity prediction methods with our approach. The experimental results show that our approach SPHGCN-S can improve severity prediction performance and effectively predict reports with high severity.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"36 11","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141507010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of the fixed-point iteration of minimizing delta debugging","authors":"Dániel Vince, Ákos Kiss","doi":"10.1002/smr.2702","DOIUrl":"10.1002/smr.2702","url":null,"abstract":"<p>The minimizing Delta Debugging (DDMIN) was among the first algorithms designed to automate the task of reducing test cases. Its popularity is based on the characteristics that it works on any kind of input, without knowledge about the input structure. Several studies proved that smaller outputs can be produced faster with more advanced techniques (e.g., building a tree representation of the input and reducing that data structure); however, if the structure is unknown or changing frequently, maintaining the descriptors might not be resource-efficient. Therefore, in this paper, we focus on the evaluation of the novel fixed-point iteration of minimizing Delta Debugging (DDMIN*) on publicly available test suites related to software engineering. Our experiments show that DDMIN* can help reduce inputs further by 48.08% on average compared to DDMIN (using lines as the units of the reduction). Although the effectiveness of the algorithm improved, it comes with the cost of additional testing steps. This study shows how the characteristics of the input affect the results and when it pays off using DDMIN*.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"36 10","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}