Revisiting ‘revisiting supervised methods for effort-aware cross-project defect prediction’

Impact factor 1.5; Zone 4 (Computer Science); JCR Q3 (COMPUTER SCIENCE, SOFTWARE ENGINEERING)
IET Software Pub Date : 2023-06-27 DOI:10.1049/sfw2.12133
Fuyang Li, Peixin Yang, Jacky Wai Keung, Wenhua Hu, Haoyu Luo, Xiao Yu
IET Software, vol. 17, no. 4, pp. 472-495.
Citations: 3

Abstract

Effort-aware cross-project defect prediction (EACPDP), which uses cross-project software modules to build a model that ranks within-project software modules by defect density, has been suggested as a way to allocate limited testing resources efficiently. Recently, Ni et al. proposed an EACPDP method called EASC, which uses all cross-project modules to train a model without considering the difference in data distribution between cross-project and within-project data. In addition, Ni et al. employed different defect density calculation strategies when comparing EASC with baseline methods. To identify effective defect density calculation strategies and methods for EACPDP, the authors compare four data filtering methods and five transfer learning methods with EASC under four commonly used defect density calculation strategies. The authors use three classification evaluation metrics and seven effort-aware metrics to comprehensively assess the performance of the methods on 11 PROMISE datasets. The results show that (1) the classification before sorting (CBS+) defect density calculation strategy achieves the best overall performance; (2) using balanced distribution adaptation (BDA) and joint distribution adaptation (JDA) with the K-nearest neighbour classifier to build the EACPDP model can find 15% and 14.3% more defective modules and 11.6% and 8.9% more defects, respectively, while achieving acceptable initial false alarms (IFA); (3) better comprehensive classification performance of a method can bring better EACPDP performance to some extent; and (4) flexibly adjusting the defect threshold λ of the CBS+ strategy serves different goals. In summary, the authors recommend that researchers and practitioners use BDA and JDA with the CBS+ strategy to build the EACPDP model.
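The abstract does not spell out how CBS+ or IFA are computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that CBS+ first flags modules whose predicted defect probability reaches the threshold λ, ranks the flagged modules by probability per line of code (an assumed defect density), and places the remaining modules after them in the same order. It also assumes that IFA counts the clean modules inspected before the first truly defective one. The function names and the sample data are hypothetical.

```python
from typing import List, Sequence


def cbs_plus_rank(probs: Sequence[float], locs: Sequence[int],
                  threshold: float = 0.5) -> List[int]:
    """Return module indices in inspection order (hypothetical helper).

    Assumption: modules with predicted probability >= threshold are
    inspected first; within each group, modules are sorted by
    probability / LOC in descending order.
    """
    def density(i: int) -> float:
        # Assumed defect density: predicted probability per line of code.
        return probs[i] / max(locs[i], 1)

    indices = range(len(probs))
    flagged = [i for i in indices if probs[i] >= threshold]
    rest = [i for i in indices if probs[i] < threshold]
    flagged.sort(key=density, reverse=True)
    rest.sort(key=density, reverse=True)
    return flagged + rest


def initial_false_alarms(order: Sequence[int], labels: Sequence[int]) -> int:
    """Count clean modules inspected before the first defective one."""
    count = 0
    for i in order:
        if labels[i]:
            break
        count += 1
    return count


# Made-up data for four modules: probability, size in LOC, true label.
probs = [0.9, 0.6, 0.4, 0.8]
locs = [300, 50, 10, 400]
labels = [1, 0, 1, 0]

order_default = cbs_plus_rank(probs, locs, threshold=0.5)
order_relaxed = cbs_plus_rank(probs, locs, threshold=0.3)
print(order_default, initial_false_alarms(order_default, labels))
print(order_relaxed, initial_false_alarms(order_relaxed, labels))
```

On this toy data, lowering the threshold from 0.5 to 0.3 moves the small defective module to the front of the ranking, which illustrates in a simplified way how adjusting λ can trade off between different goals.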

Source journal
IET Software (Engineering and Technology; Computer Science: Software Engineering)
CiteScore: 4.20
Self-citation rate: 0.00%
Articles published: 27
Review time: 9 months
Journal description: IET Software publishes papers on all aspects of the software lifecycle, including design, development, implementation and maintenance. The focus of the journal is on the methods used to develop and maintain software, and their practical application. Authors are especially encouraged to submit papers on the following topics, although papers on all aspects of software engineering are welcome: software and systems requirements engineering; formal methods, design methods, practice and experience; software architecture, aspect and object orientation, reuse and re-engineering; testing, verification and validation techniques; software dependability and measurement; human systems engineering and human-computer interaction; knowledge engineering, expert and knowledge-based systems, intelligent agents; information systems engineering; application of software engineering in industry and commerce; software engineering technology transfer; management of software development; theoretical aspects of software development; machine learning; big data and big code; cloud computing. Current Special Issue calls for papers: Knowledge Discovery for Software Development - https://digital-library.theiet.org/files/IET_SEN_CFP_KDSD.pdf; Big Data Analytics for Sustainable Software Development - https://digital-library.theiet.org/files/IET_SEN_CFP_BDASSD.pdf