Towards a Theoretical Model for Software Growth

I. Herraiz, Jesus M. Gonzalez-Barahona, G. Robles
Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007), published 2007-05-20
DOI: 10.1109/MSR.2007.31 (https://doi.org/10.1109/MSR.2007.31)
Citations: 81

Abstract

Software growth (and, more broadly, software evolution) is usually considered in terms of the size or complexity of source code. However, different studies tend to use different metrics, which makes it difficult to compare approaches and results. In addition, not all metrics are equally easy to calculate for a given body of source code, which raises the question of which one is the easiest to calculate without losing too much information. To address both issues, this paper presents a comprehensive study based on the analysis of about 700,000 C source code files, calculating several size and complexity metrics for all of them. For this sample, we have found double Pareto statistical distributions for all metrics considered, and a high correlation between any two of them. This would imply that any model addressing software growth should produce these Pareto distributions, and that analysis based on any of the considered metrics should show a similar pattern, provided the sample of files considered is large enough.
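As a rough illustration of what "a high correlation between any two" metrics means in practice, the sketch below computes two simple size metrics per file and their Pearson correlation. It is not the authors' tooling: the function names (`size_metrics`, `pearson`, `log_correlation`), the crude source-line count, and the choice to correlate logarithms (made here because heavy-tailed sizes would otherwise be dominated by a few huge files) are all assumptions for this example.

```python
import math


def size_metrics(source: str) -> dict:
    """Return two simple size metrics for the text of one C source file.

    Hypothetical helper: 'loc' counts all lines; 'sloc' is a crude
    approximation that skips blank lines and lines holding only a // comment.
    """
    lines = source.splitlines()
    sloc = sum(
        1 for ln in lines
        if ln.strip() and not ln.strip().startswith("//")
    )
    return {"loc": len(lines), "sloc": sloc}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log_correlation(sources):
    """Correlate log(loc) with log(sloc) across a collection of files.

    The log transform is an assumption of this sketch, not a step
    stated in the abstract. Files with no source lines are dropped
    because the logarithm is undefined at zero.
    """
    metrics = [size_metrics(s) for s in sources]
    metrics = [m for m in metrics if m["sloc"] > 0]
    xs = [math.log(m["loc"]) for m in metrics]
    ys = [math.log(m["sloc"]) for m in metrics]
    return pearson(xs, ys)
```

If two metrics correlate this strongly over a large sample, the cheaper one can stand in for the other, which is the practical consequence the abstract points to. Real studies would use proper metric tools rather than this line-prefix heuristic, which ignores block comments entirely.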