Assessing uncertain predictions of software quality

Proceedings Sixth International Software Metrics Symposium (Cat. No.PR00403) Pub Date : 1999-11-04 DOI:10.1109/METRIC.1999.809737

T. Khoshgoftaar, E. B. Allen, Xiaojin Yuan, W. Jones, J. Hudepohl


Citations: 15

Abstract

Many development organizations try to minimize faults in software as a means for improving customer satisfaction. Assuring high software quality often entails time-consuming and costly development processes. A software quality model based on software metrics can be used to guide enhancement efforts by predicting which modules are fault-prone. The paper presents a way to determine which predictions by a classification tree should be considered uncertain. We conducted a case study of a large legacy telecommunications system. One release was the basis for the training data set, and the subsequent release was the basis for the evaluation data set. We built a classification tree using the TREEDISC algorithm, which is based on chi-squared tests of contingency tables. The model predicted whether a module was likely to have faults discovered by customers, or not, based on software product, process, and execution metrics. We simulated practical use of the model by classifying the modules in the evaluation data set. The model achieved useful accuracy, in spite of the very small proportion of fault-prone modules in the system. We assessed whether the classes assigned to the leaves were appropriate by examining the details of the full tree, and found sizable subsets of modules with substantially uncertain classification. Discovering which modules have uncertain classifications allows sophisticated enhancement strategies to resolve uncertainties. Moreover, TREEDISC is especially well suited to identifying uncertain classifications.
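The abstract names two ideas that a small example can make concrete: chi-squared tests on contingency tables, and flagging tree leaves whose assigned class is uncertain. The following is a minimal Python sketch under stated assumptions, not the TREEDISC implementation and not the authors' procedure. The abstract does not give the criterion used to call a leaf uncertain, so the band rule here (`threshold`, `margin`), the function names, and all counts are hypothetical and for illustration only.

```python
from typing import List, Sequence, Tuple


def chi_squared_statistic(table: Sequence[Sequence[float]]) -> float:
    """Pearson chi-squared statistic for an r x c contingency table of counts.

    Rows could be categories of one software metric and columns the two
    classes (fault-prone, not fault-prone). This is the textbook statistic,
    not a reproduction of TREEDISC's splitting logic.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("contingency table has no observations")
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def uncertain_leaves(
    leaves: Sequence[Tuple[str, int, int]],
    threshold: float,
    margin: float,
) -> List[str]:
    """Return names of leaves whose fault-prone proportion lies near the threshold.

    Each leaf is (name, fault_prone_count, not_fault_prone_count).
    ASSUMPTION: a leaf is "uncertain" when its fault-prone proportion is
    within `margin` of the classification `threshold`. This rule is
    hypothetical; the source does not state the criterion it used.
    """
    flagged = []
    for name, fault_prone, not_fault_prone in leaves:
        total = fault_prone + not_fault_prone
        if total == 0:
            continue  # an empty leaf carries no evidence either way
        proportion = fault_prone / total
        if abs(proportion - threshold) <= margin:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    # Hypothetical counts: metric category (rows) by class (columns).
    print(chi_squared_statistic([[10, 20], [20, 10]]))
    # Hypothetical leaves from a fitted tree.
    demo = [("A", 1, 99), ("B", 12, 88), ("C", 40, 60)]
    print(uncertain_leaves(demo, threshold=0.10, margin=0.05))
```

A low threshold is used in the demo because the abstract reports a very small proportion of fault-prone modules; with such imbalance, a leaf can be labeled fault-prone on a modest proportion, and leaves close to that cutoff are natural candidates for closer review.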


