Code Churn: A Neglected Metric in Effort-Aware Just-in-Time Defect Prediction

2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). Pub Date: 2017-11-09. DOI: 10.1109/ESEM.2017.8

Jinping Liu, Yuming Zhou, Yibiao Yang, Hongmin Lu, Baowen Xu


Citations: 46

Abstract

Background: An increasing amount of research effort has been devoted to just-in-time (JIT) defect prediction. A recent study by Yang et al. at FSE'16 leveraged individual change metrics to build unsupervised JIT defect prediction models. They found that many unsupervised models performed similarly to or better than the state-of-the-art supervised models in effort-aware JIT defect prediction. Goal: In Yang et al.'s study, code churn (i.e., the size of a code change) was neglected when building unsupervised defect prediction models. In this study, we aim to investigate the effectiveness of a code-churn-based unsupervised defect prediction model in effort-aware JIT defect prediction. Methods: Consistent with Yang et al.'s work, we first use code churn to build a code-churn-based unsupervised model (CCUM). Then, we evaluate the prediction performance of CCUM against the state-of-the-art supervised and unsupervised models under three prediction settings: cross-validation, time-wise cross-validation, and cross-project prediction. Results: Based on six open-source projects, our experimental results show that CCUM performs better than all the prior supervised and unsupervised models it was compared against. Conclusions: The results suggest that future JIT defect prediction studies should use CCUM as a baseline model for comparison when a novel model is proposed.
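
The abstract does not spell out how CCUM ranks changes or how effort is measured, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that churn is the number of lines added plus lines deleted, that changes are inspected in ascending order of churn (smaller changes first), and that inspection effort is proportional to churn. The `Change` record, its field names, and the 20% effort budget are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Change:
    """Hypothetical record for one code change (field names are assumptions)."""
    change_id: str
    lines_added: int
    lines_deleted: int
    defective: bool  # ground-truth label, used only for evaluation


def churn(change: Change) -> int:
    """Assumed definition of code churn: lines added plus lines deleted."""
    return change.lines_added + change.lines_deleted


def rank_by_churn(changes: List[Change]) -> List[Change]:
    """Unsupervised ranking: no training data, smallest changes come first."""
    return sorted(changes, key=churn)


def recall_at_effort(ranked: List[Change], effort_fraction: float = 0.2) -> float:
    """Fraction of defective changes found within a budget of total churn."""
    total_effort = sum(churn(c) for c in ranked)
    total_defects = sum(1 for c in ranked if c.defective)
    if total_effort == 0 or total_defects == 0:
        return 0.0
    budget = effort_fraction * total_effort
    spent = 0
    found = 0
    for c in ranked:
        # Stop as soon as the next change would exceed the inspection budget.
        if spent + churn(c) > budget:
            break
        spent += churn(c)
        found += int(c.defective)
    return found / total_defects


# Toy data, invented purely for illustration.
changes = [
    Change("a", 5, 0, True),
    Change("b", 50, 50, False),
    Change("c", 10, 5, True),
    Change("d", 40, 40, True),
]
ranked = rank_by_churn(changes)
print([c.change_id for c in ranked])
print(recall_at_effort(ranked, 0.2))
```

On this toy data the two smallest changes fit inside the 20% budget, which illustrates why a size-based ranking can do well under an effort-aware measure: many small changes can be inspected for the cost of one large one. Whether this matches the evaluation protocol of the paper would need to be checked against the full text.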
