Detection of software modules with high debug code churn in a very large legacy system

Proceedings of ISSRE '96: 7th International Symposium on Software Reliability Engineering. Pub Date: 1996-10-30. DOI: 10.1109/ISSRE.1996.558896

T. Khoshgoftaar, E. B. Allen, N. Goel, A. Nandi, John McMullan


Citations: 96

Abstract

Society has become so dependent on reliable telecommunications that failures can risk loss of emergency service, business disruptions, or isolation from friends. Consequently, telecommunications software is required to have high reliability. Many previous studies define the class "fault-prone" in terms of fault counts. This study defines a module as fault-prone when it exceeds a threshold of debug code churn, that is, the number of lines added or changed due to bug fixes. Previous studies have characterized reuse history with simple categories. This study quantifies new functionality in lines of code. The paper analyzes two consecutive releases of a large legacy software system for telecommunications. We applied discriminant analysis to identify fault-prone modules based on 16 static software product metrics and the amount of code changed during development. Modules from one release were used as the fit data set, and modules from the subsequent release were used as the test data set. In contrast, comparable prior studies of legacy systems split the data from a single release to simulate two releases. We validated the model by realistically simulating use of the fitted model on the test data set. Model results could be used to give extra attention to fault-prone modules and thus reduce the risk of unexpected problems.
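The workflow the abstract describes (label modules by a churn threshold, fit on one release, test on the next) can be illustrated with a small sketch. This is not the authors' model: it swaps in a nearest-centroid rule as a very simple stand-in for discriminant analysis, uses two made-up features instead of the 16 product metrics, and assumes a hypothetical threshold value. All names and numbers below are illustrative assumptions.

```python
from statistics import mean

# Hypothetical cutoff, in lines added or changed due to bug fixes.
CHURN_THRESHOLD = 4


def is_fault_prone(debug_churn, threshold=CHURN_THRESHOLD):
    """A module is fault-prone when its debug code churn exceeds the threshold."""
    return debug_churn > threshold


def fit_centroids(features, labels):
    """Compute the mean feature vector of each class (fault-prone or not)."""
    centroids = {}
    for cls in (True, False):
        rows = [row for row, y in zip(features, labels) if y == cls]
        centroids[cls] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, row):
    """Nearest-centroid rule: a simplified stand-in for discriminant analysis."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return sq_dist(centroids[True]) < sq_dist(centroids[False])


def error_rates(centroids, features, labels):
    """Return (false-positive rate, false-negative rate) on a data set."""
    preds = [predict(centroids, row) for row in features]
    false_pos = sum(p and not y for p, y in zip(preds, labels))
    false_neg = sum(y and not p for p, y in zip(preds, labels))
    return false_pos / labels.count(False), false_neg / labels.count(True)


# Made-up toy data: (product metric, code changed during development),
# assumed to be already scaled to comparable ranges.
fit_features = [(1.0, 0.5), (1.5, 1.0), (4.0, 3.5), (5.0, 4.0)]
fit_churn = [0, 2, 9, 15]
test_features = [(1.2, 0.8), (4.8, 3.9), (2.0, 1.0), (4.2, 3.0)]
test_churn = [1, 12, 7, 3]

# Fit on the earlier release, evaluate on the subsequent one.
fit_labels = [is_fault_prone(c) for c in fit_churn]
test_labels = [is_fault_prone(c) for c in test_churn]
model = fit_centroids(fit_features, fit_labels)
fp_rate, fn_rate = error_rates(model, test_features, test_labels)
print(f"false-positive rate: {fp_rate}, false-negative rate: {fn_rate}")
```

Evaluating on a genuinely later release, rather than on a random split of one release, mirrors how such a model would be used in practice: the fitted rule is applied to modules whose debug churn is not yet known.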
