Evaluating Performance of Software Defect Prediction Models Using Area Under Precision-Recall Curve (AUC-PR)

2019 2nd International Conference on Advancements in Computational Sciences (ICACS). Pub Date: 2019-02-01. DOI: 10.23919/ICACS.2019.8689135

Shahzad Ali Khan, Z. Rana

Citations: 16

Abstract

Software defect prediction (SDP) models are used to improve effort and testing estimates for software by identifying defective modules beforehand. Precision, recall (true positive rate), and false positive rate have been used to evaluate the performance of these models. In the literature, the area under the receiver operating characteristic curve (AUC-ROC) has been used to evaluate model performance, and the standard learning goal of a defect model is to optimize AUC-ROC. Use of this measure has also been advocated in numerous benchmarking studies. The literature has discussed the performance bar (the so-called ceiling effect) of AUC-ROC-targeted models, and has also indicated the use of the area under the precision-recall curve (AUC-PR) as an evaluation parameter for the models. This study investigates whether AUC-PR gives different information regarding model performance. To this end, the study ranks existing models based on AUC-ROC and AUC-PR and reports the change in ranking of these models. The change in ranking gives an opportunity to study whether the ceiling effect can be managed and whether AUC-PR (instead of AUC-ROC) can be considered as a goal for the prediction models. AUC-PR-based evaluation of the models can help avoid the extra cost, time, and effort spent testing non-defective modules.
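
The ranking comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' experimental setup: the labels, the scores, and the model names `A` and `B` are hypothetical toy data, and average precision is used here as one common step-wise estimate of AUC-PR (the paper may compute the area differently). AUC-ROC is computed as the fraction of (defective, non-defective) pairs that the model orders correctly.

```python
# Toy comparison of model rankings under AUC-ROC and AUC-PR.
# All data below is hypothetical and chosen only for illustration.

def roc_auc(labels, scores):
    """AUC-ROC as the probability that a random defective module
    scores higher than a random non-defective one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Step-wise estimate of AUC-PR: sum of precision at each
    threshold, weighted by the increase in recall."""
    n_pos = sum(labels)
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp, ap, prev_recall, i = 0, 0.0, 0.0, 0
    while i < len(ranked):
        j = i
        # Consume every module sharing this score (one threshold).
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            j += 1
        recall = tp / n_pos
        precision = tp / j  # j modules are flagged at this threshold
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap

# 2 defective modules (label 1) among 10: an imbalanced toy dataset.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
model_scores = {
    # Model A: one defect ranked first, the other ranked seventh.
    "A": [0.95, 0.35, 0.90, 0.80, 0.70, 0.60, 0.50, 0.30, 0.20, 0.10],
    # Model B: both defects ranked third and fourth.
    "B": [0.75, 0.70, 0.90, 0.80, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10],
}

roc = {m: roc_auc(labels, s) for m, s in model_scores.items()}
pr = {m: average_precision(labels, s) for m, s in model_scores.items()}

rank_by_roc = sorted(roc, key=roc.get, reverse=True)
rank_by_pr = sorted(pr, key=pr.get, reverse=True)

print("AUC-ROC:", roc, "ranking:", rank_by_roc)
print("AUC-PR :", pr, "ranking:", rank_by_pr)
```

On this toy data the two measures order the models differently: B is ahead on AUC-ROC, while A is ahead on AUC-PR because it places a defective module at the very top of the list, where precision is highest. This is the kind of ranking change the study sets out to report.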
