冷启动软件分析

Jin Guo, Mona Rahimi, J. Cleland-Huang, A. Rasin, J. Hayes, Michael Vierhauser
{"title":"冷启动软件分析","authors":"Jin Guo, Mona Rahimi, J. Cleland-Huang, A. Rasin, J. Hayes, Michael Vierhauser","doi":"10.1145/2901739.2901740","DOIUrl":null,"url":null,"abstract":"Software project artifacts such as source code, requirements, and change logs represent a gold-mine of actionable information. As a result, software analytic solutions have been developed to mine repositories and answer questions such as \"who is the expert?,'' \"which classes are fault prone?,'' or even \"who are the domain experts for these fault-prone classes?'' Analytics often require training and configuring in order to maximize performance within the context of each project. A cold-start problem exists when a function is applied within a project context without first configuring the analytic functions on project-specific data. This scenario exists because of the non-trivial effort necessary to instrument a project environment with candidate tools and algorithms and to empirically evaluate alternate configurations. We address the cold-start problem by comparatively evaluating `best-of-breed' and `profile-driven' solutions, both of which reuse known configurations in new project contexts. We describe and evaluate our approach against 20 project datasets for the three analytic areas of artifact connectivity, fault-prediction, and finding the expert, and show that the best-of-breed approach outperformed the profile-driven approach in all three areas; however, while it delivered acceptable results for artifact connectivity and find the expert, both techniques underperformed for cold-start fault prediction.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"8 1","pages":"142-153"},"PeriodicalIF":0.0000,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":"{\"title\":\"Cold-Start Software Analytics\",\"authors\":\"Jin Guo, Mona Rahimi, J. Cleland-Huang, A. Rasin, J. Hayes, Michael Vierhauser\",\"doi\":\"10.1145/2901739.2901740\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Software project artifacts such as source code, requirements, and change logs represent a gold-mine of actionable information. As a result, software analytic solutions have been developed to mine repositories and answer questions such as \\\"who is the expert?,'' \\\"which classes are fault prone?,'' or even \\\"who are the domain experts for these fault-prone classes?'' Analytics often require training and configuring in order to maximize performance within the context of each project. A cold-start problem exists when a function is applied within a project context without first configuring the analytic functions on project-specific data. This scenario exists because of the non-trivial effort necessary to instrument a project environment with candidate tools and algorithms and to empirically evaluate alternate configurations. We address the cold-start problem by comparatively evaluating `best-of-breed' and `profile-driven' solutions, both of which reuse known configurations in new project contexts. We describe and evaluate our approach against 20 project datasets for the three analytic areas of artifact connectivity, fault-prediction, and finding the expert, and show that the best-of-breed approach outperformed the profile-driven approach in all three areas; however, while it delivered acceptable results for artifact connectivity and find the expert, both techniques underperformed for cold-start fault prediction.\",\"PeriodicalId\":6621,\"journal\":{\"name\":\"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)\",\"volume\":\"8 1\",\"pages\":\"142-153\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-05-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"25\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2901739.2901740\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2901739.2901740","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 25

摘要

软件项目工件,如源代码、需求和变更日志,代表了可操作信息的金矿。因此,开发了软件分析解决方案来挖掘存储库并回答诸如“谁是专家?”,“哪些类别容易发生故障?”,甚至“谁是这些容易出错类的领域专家?”“分析通常需要培训和配置,以便在每个项目的上下文中最大化性能。当在项目上下文中应用函数而没有首先在项目特定数据上配置分析函数时,就会存在冷启动问题。这种情况之所以存在,是因为使用候选工具和算法对项目环境进行仪表化以及经验地评估备选配置所必需的重要工作。我们通过比较评估“同类最佳”和“配置文件驱动”的解决方案来解决冷启动问题,这两种解决方案都在新的项目环境中重用已知的配置。我们针对工件连接性、故障预测和寻找专家这三个分析领域的20个项目数据集描述和评估了我们的方法,并表明在所有三个领域中,最佳的方法都优于概要驱动的方法;然而,尽管它为工件连接性和寻找专家提供了可接受的结果,但这两种技术在冷启动故障预测方面表现不佳。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Cold-Start Software Analytics
Software project artifacts such as source code, requirements, and change logs represent a gold-mine of actionable information. As a result, software analytic solutions have been developed to mine repositories and answer questions such as "who is the expert?,'' "which classes are fault prone?,'' or even "who are the domain experts for these fault-prone classes?'' Analytics often require training and configuring in order to maximize performance within the context of each project. A cold-start problem exists when a function is applied within a project context without first configuring the analytic functions on project-specific data. This scenario exists because of the non-trivial effort necessary to instrument a project environment with candidate tools and algorithms and to empirically evaluate alternate configurations. We address the cold-start problem by comparatively evaluating `best-of-breed' and `profile-driven' solutions, both of which reuse known configurations in new project contexts. We describe and evaluate our approach against 20 project datasets for the three analytic areas of artifact connectivity, fault-prediction, and finding the expert, and show that the best-of-breed approach outperformed the profile-driven approach in all three areas; however, while it delivered acceptable results for artifact connectivity and find the expert, both techniques underperformed for cold-start fault prediction.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信