Leveraging Levy Flight and Greylag Goose Optimization for Enhanced Cross-Project Defect Prediction in Software Evolution

IF 1.8; Zone 4 (Computer Science); JCR Q3, COMPUTER SCIENCE, SOFTWARE ENGINEERING

Journal of Software-Evolution and Process; Pub Date: 2025-03-24; DOI: 10.1002/smr.70013

Kripa Sekaran, Sherly Puspha Annabel Lawrence

Volume 37, Issue 3. Full text: https://onlinelibrary.wiley.com/doi/10.1002/smr.70013

Citations: 0

Abstract



Cross-project defect prediction (CPDP) is crucial for predicting defects and ensuring software quality. The performance of traditional CPDP models degrades because of class imbalance and differences in data distribution between projects. To overcome these limitations, this paper proposes a novel approach, the Levy flight–enabled greylag goose optimized UniXcoder-based stacked defect predictor (LFGGO-USDP), for predicting cross-project defects in software engineering. Twenty-three software projects are selected from diverse datasets such as PROMISE, ReLink, AEEEM, and NASA and preprocessed to improve reliability and reduce class imbalance. A transformation model maps the source and target projects in the feature space to improve predictive performance. During feature selection, the Levy flight (LF) mechanism is embedded in the greylag goose optimization (GGO) algorithm to localize features in the source code, which increases diversity and reduces the risk of getting trapped in local optima. A UniXcoder-based stacked bidirectional long short-term memory network (U-SBiLSTM) serves as the cross-project defect predictor. The UniXcoder model extracts semantic information from the tokenized source code. Its output is fed to the SBiLSTM, which determines the relationships within the source code. The UniXcoder output (the semantic features) is then concatenated with the SBiLSTM output (the sequential and temporal dependencies), and an attention mechanism selects the relevant information for categorizing the defective and nondefective classes. Experiments analyze the defective and nondefective cases in the software projects, and numerical validation with different evaluation models is used to assess the superiority of the approach. The proposed model achieved the highest defect prediction accuracy, 0.986, among the compared approaches, indicating that it provides better prediction outcomes.
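The abstract does not give the update rule that combines Levy flight with GGO, so the following is only a minimal sketch of the general idea: drawing heavy-tailed Levy steps and using them to perturb a candidate feature-selection position. It assumes Mantegna's algorithm for the step length, which is a common choice but is not confirmed by the source. The function names (`levy_step`, `perturb_position`, `to_feature_mask`), the parameters `beta` and `scale`, and the thresholding rule are all hypothetical and chosen for illustration.

```python
import math
import random


def levy_step(beta=1.5, rng=None):
    """Draw one Levy-distributed step length with Mantegna's algorithm (assumed here)."""
    rng = rng or random.Random()
    # Standard deviation of the numerator Gaussian in Mantegna's formula.
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero in the (very unlikely) exact-zero draw
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def perturb_position(position, best, scale=0.01, beta=1.5, rng=None):
    """Hypothetical update: move each coordinate by a Levy step scaled by its
    distance from the best-known position. Occasional long jumps help the
    search leave local optima."""
    rng = rng or random.Random()
    return [x + scale * levy_step(beta, rng) * (x - b) for x, b in zip(position, best)]


def to_feature_mask(position):
    """Map a continuous position to a binary feature mask.
    Keeping a feature when x > 0 is the same as thresholding sigmoid(x) at 0.5."""
    return [1 if x > 0 else 0 for x in position]


if __name__ == "__main__":
    rng = random.Random(42)
    candidate = [0.8, -0.4, 0.1, -1.2]   # illustrative continuous position
    best_so_far = [0.5, 0.2, -0.3, -1.0]  # illustrative best-known position
    moved = perturb_position(candidate, best_so_far, rng=rng)
    print(to_feature_mask(moved))
```

Most Levy steps are small, so the search mostly refines the current candidate, while the rare large steps provide the diversity that the abstract attributes to the LF mechanism.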


Source journal

Journal of Software-Evolution and Process (COMPUTER SCIENCE, SOFTWARE ENGINEERING)

Self-citation rate: 10.00%

Articles published: 109