Software quality classification modeling using the SPRINT decision tree algorithm

14th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2002), Proceedings. Pub Date: 2002-11-04. DOI: 10.1109/TAI.2002.1180826

T. Khoshgoftaar, Naeem Seliya


Citations: 84

Abstract

Predicting the quality of system modules prior to software testing and operations can benefit the software development team. Such timely reliability estimates can be used to direct cost-effective quality improvement efforts to the high-risk modules. Tree-based software quality classification models built on software metrics predict whether a software module is fault-prone or not fault-prone. They are white-box quality estimation models with good accuracy, and they are simple and easy to interpret. This paper presents an in-depth study of calibrating classification trees for software quality estimation using the SPRINT decision tree algorithm. Many classification algorithms have memory limitations, including the requirement that data sets be memory resident. SPRINT removes all of these limitations and provides a fast and scalable analysis. It is an extension of a commonly used decision tree algorithm, CART, and provides a unique tree-pruning technique based on the minimum description length (MDL) principle. Combining the MDL pruning technique with the modified classification algorithm, SPRINT yields classification trees with useful prediction accuracy. The case study comprises software metrics and fault data collected over four releases of a very large telecommunications system. Classification trees built by SPRINT are observed to be more balanced and more stable than those built by CART.
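The abstract does not state how candidate splits are scored. As a minimal sketch, assuming the Gini index that CART-style trees commonly use, the code below scans one pre-sorted software metric once and picks the threshold that best separates fault-prone from not fault-prone modules. The metric values, labels, and function names are hypothetical illustrations, not the authors' data or implementation, and MDL pruning is not shown.

```python
from collections import Counter


def gini(counts, total):
    """Gini impurity for a mapping of class -> count."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best test 'value <= threshold'.

    The metric is sorted once; class counts on each side are then updated
    incrementally during a single scan (assumed here, not taken from the paper).
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    below = Counter()          # class counts for records at or below the cut
    above = Counter(labels)    # class counts for records above the cut
    best = (None, float("inf"))
    for i in range(n - 1):
        value, label = pairs[i]
        below[label] += 1
        above[label] -= 1
        next_value = pairs[i + 1][0]
        if value == next_value:
            continue           # cannot cut between equal metric values
        n_below = i + 1
        n_above = n - n_below
        score = (n_below / n) * gini(below, n_below) \
            + (n_above / n) * gini(above, n_above)
        if score < best[1]:
            best = ((value + next_value) / 2, score)
    return best


# Hypothetical module metric (lines of code) and labels (1 = fault-prone).
loc = [120, 340, 90, 800, 450, 60, 1000, 200]
fault_prone = [0, 1, 0, 1, 1, 0, 1, 0]

threshold, score = best_split(loc, fault_prone)
print(threshold, score)
```

A full tree builder would repeat this search over every metric at every node and then prune the grown tree; this fragment only illustrates the per-attribute split search.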

