Software quality classification modeling using the SPRINT decision tree algorithm

T. Khoshgoftaar, Naeem Seliya
{"title":"Software quality classification modeling using the SPRINT decision tree algorithm","authors":"T. Khoshgoftaar, Naeem Seliya","doi":"10.1109/TAI.2002.1180826","DOIUrl":null,"url":null,"abstract":"Predicting the quality of system modules prior to software testing and operations can benefit the software development team. Such a timely reliability estimation can be used to direct cost-effective quality improvement efforts to the high-risk modules. Tree-based software quality classification models based on software metrics are used to predict whether a software module is fault-prone or not fault-prone. They are white box quality estimation models with good accuracy, and are simple and easy to interpret. This paper presents an in-depth study of calibrating classification trees for software quality estimation using the SPRINT decision tree algorithm. Many classification algorithms have memory limitations including the requirement that data sets be memory resident. SPRINT removes all of these limitations and provides a fast and scalable analysis. It is an extension of a commonly used decision tree algorithm, CART, and provides a unique tree-pruning technique based on the minimum description length (MDL) principle. Combining the MDL pruning technique and the modified classification algorithm, SPRINT yields classification trees with useful prediction accuracy. The case study used comprises of software metrics and fault data collected over four releases from a very large telecommunications system. It is observed that classification trees built by SPRINT are more balanced and demonstrate better stability in comparison to those built by CART.","PeriodicalId":197064,"journal":{"name":"14th IEEE International Conference on Tools with Artificial Intelligence, 2002. (ICTAI 2002). Proceedings.","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2002-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"84","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"14th IEEE International Conference on Tools with Artificial Intelligence, 2002. (ICTAI 2002). Proceedings.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TAI.2002.1180826","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 84

Abstract

Predicting the quality of system modules prior to software testing and operations can benefit the software development team. Such a timely reliability estimation can be used to direct cost-effective quality improvement efforts to the high-risk modules. Tree-based software quality classification models based on software metrics are used to predict whether a software module is fault-prone or not fault-prone. They are white box quality estimation models with good accuracy, and are simple and easy to interpret. This paper presents an in-depth study of calibrating classification trees for software quality estimation using the SPRINT decision tree algorithm. Many classification algorithms have memory limitations including the requirement that data sets be memory resident. SPRINT removes all of these limitations and provides a fast and scalable analysis. It is an extension of a commonly used decision tree algorithm, CART, and provides a unique tree-pruning technique based on the minimum description length (MDL) principle. Combining the MDL pruning technique and the modified classification algorithm, SPRINT yields classification trees with useful prediction accuracy. The case study used comprises of software metrics and fault data collected over four releases from a very large telecommunications system. It is observed that classification trees built by SPRINT are more balanced and demonstrate better stability in comparison to those built by CART.
使用SPRINT决策树算法进行软件质量分类建模
在软件测试和操作之前预测系统模块的质量可以使软件开发团队受益。这种及时的可靠性评估可以用于指导对高风险模块进行经济有效的质量改进工作。基于软件度量的树状软件质量分类模型用于预测软件模块是否容易出错。它们是白盒质量的估计模型,具有良好的准确性,并且简单易于解释。本文对基于SPRINT决策树算法的软件质量评估分类树的标定问题进行了深入研究。许多分类算法都有内存限制,包括要求数据集驻留在内存中。SPRINT消除了所有这些限制,并提供了快速和可伸缩的分析。它是一种常用的决策树算法CART的扩展,并提供了一种基于最小描述长度(MDL)原则的独特的树修剪技术。结合MDL剪枝技术和改进的分类算法,SPRINT生成了具有有效预测精度的分类树。所使用的案例研究包括从一个非常大的电信系统的四个版本中收集的软件度量和故障数据。结果表明,与CART构建的分类树相比,SPRINT构建的分类树更加平衡,稳定性更好。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信