Software change classification using hunk metrics

2009 IEEE International Conference on Software Maintenance. Pub Date: 2009-10-30. DOI: 10.1109/ICSM.2009.5306274

Javed Ferzund, S. Ahsan, F. Wotawa

Citations: 8

Abstract

Change management is a challenging task in software maintenance. Software is changed throughout its lifetime, and some of these changes introduce errors in the code that result in failures. Software changes are composed of small code units called hunks, dispersed across source code files. In this paper we present a technique for classifying software changes based on hunk metrics. We classify individual hunks as buggy or bug-free, which provides bug prediction at the smallest level of granularity. We introduce a set of hunk metrics and build classification models on these metrics using logistic regression and random forests. We evaluated the performance of our approach on 7 open source software projects. On average, our approach classifies hunks as buggy or bug-free with 81 percent accuracy, 77 percent buggy hunk precision, and 67 percent buggy hunk recall. Most of the hunk metrics are significant predictors of bugs, but the set of significant metrics varies among projects.
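To make the three reported figures concrete, the following is a minimal sketch of how accuracy, buggy hunk precision, and buggy hunk recall could be computed from per-hunk labels. It is not the authors' implementation: the names `HunkEvaluation` and `evaluate_hunk_predictions` are hypothetical, and the labels in the usage lines are made-up illustration data, not results from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class HunkEvaluation:
    """Hypothetical container for the three measures named in the abstract."""
    accuracy: float
    buggy_precision: float
    buggy_recall: float


def evaluate_hunk_predictions(
    actual: Sequence[bool], predicted: Sequence[bool]
) -> HunkEvaluation:
    """Compare predicted hunk labels with actual ones (True means buggy)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one hunk is required")

    pairs = list(zip(actual, predicted))
    # Confusion-matrix counts, treating "buggy" as the positive class.
    tp = sum(1 for a, p in pairs if a and p)          # buggy, predicted buggy
    fp = sum(1 for a, p in pairs if not a and p)      # bug-free, predicted buggy
    fn = sum(1 for a, p in pairs if a and not p)      # buggy, predicted bug-free
    tn = sum(1 for a, p in pairs if not a and not p)  # bug-free, predicted bug-free

    accuracy = (tp + tn) / len(pairs)
    # Report 0.0 when a ratio is undefined (no predicted or no actual buggy hunks).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return HunkEvaluation(accuracy, precision, recall)


# Illustration data only: 8 hunks, 3 of them actually buggy.
actual_labels = [True, True, True, False, False, False, False, False]
predicted_labels = [True, True, False, True, False, False, False, False]
result = evaluate_hunk_predictions(actual_labels, predicted_labels)
print(result)
```

Buggy hunk precision is the share of hunks flagged as buggy that really are buggy, while buggy hunk recall is the share of truly buggy hunks that were flagged. Accuracy alone can look high when most hunks are bug-free, which is presumably why the abstract reports all three.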


