A novel defect prediction method based on semantic feature enhancement

IF 1.7 | Zone 4, Computer Science | Q3 | COMPUTER SCIENCE, SOFTWARE ENGINEERING
Chi Zhang, Xiaoli Wang, Jinfu Chen, Saihua Cai, Rexford Nii Ayitey Sosu
Journal of Software-Evolution and Process, vol. 36, no. 9. Journal Article. Published 2024-04-28.
DOI: 10.1002/smr.2674
https://onlinelibrary.wiley.com/doi/10.1002/smr.2674
Citations: 0

Abstract

Cross-project defect prediction (CPDP) techniques that build defect prediction models from traditional hand-crafted features are well developed, but they usually ignore the semantic and structural information inside the program and fail to capture hidden features that are critical for predicting a program's category, which leads to poor defect prediction results. Researchers have proposed using deep learning to automatically extract the semantic features of programs and fuse them with traditional features as training data. In practice, however, two questions remain important: how to represent the semantic features of programs effectively, and in what ratio the two types of features should be fused to maximize the effectiveness of the model. In this paper, we propose a semantic feature enhancement-based defect prediction framework (SFE-DP), which augments the semantic feature set extracted from the program code with data. We also introduce a self-attention layer and a matching layer into the model structure to filter out low-efficiency and non-critical semantic features. Finally, we adopt the idea of a hybrid loss function to iteratively optimize the model parameters. Extensive experiments show that SFE-DP outperforms the baseline approaches on 90 pairs of CPDP tasks formed from 10 open-source projects.
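
The figure of 90 task pairs is consistent with treating every ordered (source, target) pair of distinct projects as one CPDP task, since 10 × 9 = 90. The abstract does not spell out the pairing scheme, so the following is only a minimal sketch under that assumption; the function name `cpdp_pairs` and the project names are hypothetical placeholders, not taken from the paper.

```python
from itertools import permutations


def cpdp_pairs(projects):
    """Return every ordered (source, target) pair of distinct projects.

    Assumption: one CPDP task trains on the source project and tests on
    the target project, so (A, B) and (B, A) count as different tasks.
    """
    return list(permutations(projects, 2))


# Hypothetical placeholder names; the abstract does not list the 10 projects.
projects = [f"project_{i}" for i in range(1, 11)]
pairs = cpdp_pairs(projects)

print(len(pairs))  # 10 * 9 = 90
```

Ordered pairs are used because training on one project and testing on another is not symmetric; counting unordered pairs would give only 45 tasks.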

Source journal: Journal of Software-Evolution and Process (COMPUTER SCIENCE, SOFTWARE ENGINEERING)
Self-citation rate: 10.00%
Articles published: 109