在一个发布窗口内,源代码单元变更的频率是多少?

Joseph F. Shobe, Md Yasser Karim, Huzefa H. Kagdi
{"title":"在一个发布窗口内,源代码单元变更的频率是多少?","authors":"Joseph F. Shobe, Md Yasser Karim, Huzefa H. Kagdi","doi":"10.1145/2723742.2723759","DOIUrl":null,"url":null,"abstract":"To form a training set for a source-code change prediction model, e.g., using the association rule mining or machine learning techniques, commits from the source code history are needed. The traceability between releases and commits would facilitate a systematic choice of history in units of the project evolution scale (i.e., commits that constitute a software release). For example, the major release 25.0 in Chrome is mapped to the earliest revision 157687 and latest revision 165096 in the trunk. Using this traceability, an empirical study is reported on the frequency distribution of file changes for different release windows. In Chrome, the majority (50%) of the committed files change only once between a pair of consecutive releases. This trend is reversed after expanding the window size to at least 10. That is, the majority (50%) of the files change multiple times when commits constituting 10 or greater releases are considered. These results suggest that a training set of at least 10 releases is needed to provide a prediction coverage for majority of the files.","PeriodicalId":288030,"journal":{"name":"Proceedings of the 8th India Software Engineering Conference","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"How Often does a Source Code Unit Change within a Release Window?\",\"authors\":\"Joseph F. Shobe, Md Yasser Karim, Huzefa H. Kagdi\",\"doi\":\"10.1145/2723742.2723759\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"To form a training set for a source-code change prediction model, e.g., using the association rule mining or machine learning techniques, commits from the source code history are needed. The traceability between releases and commits would facilitate a systematic choice of history in units of the project evolution scale (i.e., commits that constitute a software release). For example, the major release 25.0 in Chrome is mapped to the earliest revision 157687 and latest revision 165096 in the trunk. Using this traceability, an empirical study is reported on the frequency distribution of file changes for different release windows. In Chrome, the majority (50%) of the committed files change only once between a pair of consecutive releases. This trend is reversed after expanding the window size to at least 10. That is, the majority (50%) of the files change multiple times when commits constituting 10 or greater releases are considered. These results suggest that a training set of at least 10 releases is needed to provide a prediction coverage for majority of the files.\",\"PeriodicalId\":288030,\"journal\":{\"name\":\"Proceedings of the 8th India Software Engineering Conference\",\"volume\":\"6 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-02-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 8th India Software Engineering Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2723742.2723759\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 8th India Software Engineering Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2723742.2723759","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

为了形成源代码变化预测模型的训练集,例如,使用关联规则挖掘或机器学习技术,需要从源代码历史中提交。发布和提交之间的可追溯性将有助于在项目演进规模的单元中系统地选择历史(例如,构成软件发布的提交)。例如,Chrome的主版本25.0映射到trunk中最早的版本157687和最新的版本165096。使用这种可追溯性,对不同发布窗口的文件更改的频率分布进行了实证研究。在Chrome中,大多数(50%)提交的文件在一对连续发布之间只更改一次。在将窗口大小扩展到至少10之后,这种趋势就会逆转。也就是说,当考虑包含10个或更多版本的提交时,大多数(50%)文件会更改多次。这些结果表明,至少需要10个版本的训练集才能为大多数文件提供预测覆盖率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
How Often does a Source Code Unit Change within a Release Window?
To form a training set for a source-code change prediction model, e.g., using the association rule mining or machine learning techniques, commits from the source code history are needed. The traceability between releases and commits would facilitate a systematic choice of history in units of the project evolution scale (i.e., commits that constitute a software release). For example, the major release 25.0 in Chrome is mapped to the earliest revision 157687 and latest revision 165096 in the trunk. Using this traceability, an empirical study is reported on the frequency distribution of file changes for different release windows. In Chrome, the majority (50%) of the committed files change only once between a pair of consecutive releases. This trend is reversed after expanding the window size to at least 10. That is, the majority (50%) of the files change multiple times when commits constituting 10 or greater releases are considered. These results suggest that a training set of at least 10 releases is needed to provide a prediction coverage for majority of the files.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信