File-Level Defect Prediction: Unsupervised vs. Supervised Models

Meng Yan, Yicheng Fang, D. Lo, Xin Xia, Xiaohong Zhang
{"title":"File-Level Defect Prediction: Unsupervised vs. Supervised Models","authors":"Meng Yan, Yicheng Fang, D. Lo, Xin Xia, Xiaohong Zhang","doi":"10.1109/ESEM.2017.48","DOIUrl":null,"url":null,"abstract":"Background: Software defect models can help software quality assurance teams to allocate testing or code review resources. A variety of techniques have been used to build defect prediction models, including supervised and unsupervised methods. Recently, Yang et al. [1] surprisingly find that unsupervised models can perform statistically significantly better than supervised models in effort-aware change-level defect prediction. However, little is known about relative performance of unsupervised and supervised models for effort-aware file-level defect prediction. Goal: Inspired by their work, we aim to investigate whether a similar finding holds in effort-aware file-level defect prediction. Method: We replicate Yang et al.'s study on PROMISE dataset with totally ten projects. We compare the effectiveness of unsupervised and supervised prediction models for effort-aware file-level defect prediction. Results: We find that the conclusion of Yang et al. [1] does not hold under within-project but holds under cross-project setting for file-level defect prediction. In addition, following the recommendations given by the best unsupervised model, developers needs to inspect statistically significantly more files than that of supervised models considering the same inspection effort (i.e., LOC). Conclusions: (a) Unsupervised models do not perform statistically significantly better than state-of-art supervised model under within-project setting, (b) Unsupervised models can perform statistically significantly better than state-ofart supervised model under cross-project setting, (c) We suggest that not only LOC but also number of files needed to be inspected should be considered when evaluating effort-aware filelevel defect prediction models.","PeriodicalId":213866,"journal":{"name":"2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)","volume":"65 1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"65","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ESEM.2017.48","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 65

Abstract

Background: Software defect models can help software quality assurance teams allocate testing or code review resources. A variety of techniques have been used to build defect prediction models, including supervised and unsupervised methods. Recently, Yang et al. [1] surprisingly find that unsupervised models can perform statistically significantly better than supervised models in effort-aware change-level defect prediction. However, little is known about the relative performance of unsupervised and supervised models for effort-aware file-level defect prediction. Goal: Inspired by their work, we aim to investigate whether a similar finding holds for effort-aware file-level defect prediction. Method: We replicate Yang et al.'s study on the PROMISE dataset with ten projects in total. We compare the effectiveness of unsupervised and supervised prediction models for effort-aware file-level defect prediction. Results: We find that the conclusion of Yang et al. [1] does not hold under the within-project setting but does hold under the cross-project setting for file-level defect prediction. In addition, when following the recommendations of the best unsupervised model, developers need to inspect statistically significantly more files than with supervised models for the same inspection effort (i.e., LOC). Conclusions: (a) Unsupervised models do not perform statistically significantly better than the state-of-the-art supervised model under the within-project setting; (b) unsupervised models can perform statistically significantly better than the state-of-the-art supervised model under the cross-project setting; (c) we suggest that not only LOC but also the number of files that need to be inspected should be considered when evaluating effort-aware file-level defect prediction models.
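The abstract contrasts unsupervised and supervised models under a shared inspection budget expressed in LOC. The Python sketch below illustrates that evaluation setup under stated assumptions; it is not the paper's actual pipeline. The reciprocal-LOC unsupervised ranking, the logistic-regression supervised model, and the 20% LOC budget are illustrative choices, not details taken from the paper.

```python
# A minimal sketch of an effort-aware comparison, NOT the paper's pipeline.
# The reciprocal-LOC unsupervised ranking, the logistic-regression supervised
# model, and the 20% LOC budget are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression


def rank_unsupervised(loc):
    """Rank files by an unsupervised score of 1/LOC (smallest files first)."""
    return np.argsort(loc)


def rank_supervised(train_X, train_y, test_X, test_loc):
    """Train a classifier on labelled history, then rank test files by
    predicted defect probability per line of code (effort-aware score)."""
    clf = LogisticRegression(max_iter=1000).fit(train_X, train_y)
    prob = clf.predict_proba(test_X)[:, 1]
    return np.argsort(-prob / np.maximum(test_loc, 1))


def inspect_under_budget(order, loc, defective, budget_ratio=0.2):
    """Walk the ranked file list until the LOC budget is spent and report
    (defective files found, files inspected) -- the two quantities the
    abstract argues should both be considered."""
    budget = budget_ratio * loc.sum()
    spent = 0.0
    found = inspected = 0
    for idx in order:
        if spent + loc[idx] > budget:
            break
        spent += loc[idx]
        inspected += 1
        found += int(defective[idx])
    return found, inspected
```

Under such a budget, an unsupervised ranking that favours small files can reach high recall while touching many more files; reporting the number of inspected files alongside LOC, as the authors recommend, makes that trade-off visible.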