Building software quality classification trees: approach, experimentation, evaluation

R. Takahashi, Y. Muraoka, Yukihiro Nakamura
Proceedings The Eighth International Symposium on Software Reliability Engineering, published 1997-11-02. DOI: 10.1109/ISSRE.1997.630869 (https://doi.org/10.1109/ISSRE.1997.630869)
Citations: 30

Abstract

A methodology is proposed for generating an optimum software quality classification tree that uses software complexity metrics to discriminate between high-quality and low-quality modules. Tree generation applies AIC (Akaike Information Criterion) procedures to the binomial distribution. AIC procedures are based on maximum likelihood estimation and on using the least number of complexity metrics. The method improves on the software quality classification tree generation method proposed by Porter and Selby (1990) in that the number of complexity metrics is minimized. Their method has two problems: the software quality prediction model is unstable because it reflects observational errors in real data too strongly, and there is no objective criterion for determining whether a discrimination is appropriate at deep nesting levels of the classification tree, where the number of sample modules becomes small. To solve these problems, a new metric is introduced and its validity is verified theoretically and experimentally. In our examples, complexity metrics of software written in the C language are investigated, such as lines of source code, Halstead's (1977) software science, McCabe's (1976) cyclomatic number, Henry and Kafura's (1981) fan-in/fan-out, and Howatt and Baker's (1989) scope number. Our experiments with a medium-sized piece of software (85 thousand lines of source code; 562 samples) show that the classification tree generated with our new metric identifies the target class of the observed modules more efficiently than the conventional classification tree, using the minimum number of complexity metrics and without any significant decrease in the correct classification ratio (76% -> 72%).
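The abstract does not give the exact form of the new metric, so the following is only a minimal sketch of the general idea of applying AIC to the binomial distribution when deciding whether a tree node should be split. It assumes each node is summarized by a count of low-quality modules and a total module count, and it compares a one-parameter pooled model against a two-parameter split model. The function names (`binom_loglik`, `aic`, `split_is_justified`) and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binom_loglik(low_quality, total):
    """Maximized log-likelihood of a node with `low_quality` of `total` modules.

    The binomial coefficient is omitted: it is identical in the pooled and
    split models, so it cancels when their AIC values are compared.
    """
    if total == 0 or low_quality == 0 or low_quality == total:
        return 0.0  # the MLE (p = 0 or p = 1) fits such a node exactly
    p = low_quality / total  # maximum likelihood estimate of the proportion
    return low_quality * math.log(p) + (total - low_quality) * math.log(1.0 - p)


def aic(loglik, n_params):
    """Akaike Information Criterion: -2 * log-likelihood + 2 * parameters."""
    return -2.0 * loglik + 2.0 * n_params


def split_is_justified(left, right):
    """Return True if splitting into `left` and `right` lowers the AIC.

    `left` and `right` are (low_quality, total) tuples for the child nodes.
    This is an illustrative criterion, not the authors' published metric.
    """
    pooled = (left[0] + right[0], left[1] + right[1])
    aic_pooled = aic(binom_loglik(*pooled), n_params=1)
    aic_split = aic(binom_loglik(*left) + binom_loglik(*right), n_params=2)
    return aic_split < aic_pooled


if __name__ == "__main__":
    # Hypothetical counts: a large node with clearly different child proportions.
    print(split_is_justified((40, 50), (10, 50)))
    # Hypothetical counts: a deep node with very few sample modules.
    print(split_is_justified((2, 3), (1, 3)))
```

Under these assumptions, the first call accepts the split and the second rejects it: with only six modules, the gain in likelihood does not offset the penalty for the extra parameter. This mirrors the concern raised in the abstract about judging discriminations at deep nesting levels, where few sample modules remain.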