Mining Experts from Source Code Analysis: An Empirical Evaluation

Johnatan Oliveira, Markos Viggiato, Denis Pinheiro, E. Morales
Journal article, J. Softw. Eng. Res. Dev. Published 2021-02-08. DOI: 10.5753/jserd.2021.548 (https://doi.org/10.5753/jserd.2021.548). Citations: 1

Abstract

Modern software development increasingly depends on third-party libraries to boost productivity and quality. This development is complex and requires specialists with knowledge of several technologies, such as today's libraries. Such complexity makes it extremely challenging to deliver quality software under pressure. It is therefore necessary to identify and hire qualified developers to build a good team, in both open source and proprietary systems. For these reasons, enterprise and open source projects try to build teams composed of developers who are highly skilled in specific libraries. However, identifying these developers may not be trivial, and we still lack procedures to assess developers' skills in widely popular libraries. In this paper, we first argue that source code activities can identify software developers' hard skills, such as library expertise. We then evaluate a mining-based strategy to reduce the search space for identifying library experts. To achieve our goal, we selected the 9 most popular Java libraries and 6 libraries for microservices (i.e., 15 libraries in total). We assessed the skills of more than 1.5 million developers in these libraries by analyzing their commits in more than 17K Java projects on GitHub. We evaluated the results through two surveys with 158 developers in total. First, with 137 library expert candidates, we observed 63% precision for the strategy applied to the popular Java libraries. Second, we observed a precision of at least 71% for 21 library experts in microservices. These low precision values suggest room for further improvement in the evaluated strategy.
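The abstract does not spell out the metrics behind the mining strategy, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical Commit record, ranks authors by how many of their commits add an import of a given library, and computes precision as the share of ranked candidates that a survey confirms as experts. All names and the toy data are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Commit(NamedTuple):
    """Hypothetical, simplified commit record (an assumption for this sketch)."""
    author: str
    added_lines: list[str]  # source lines added by the commit


def rank_candidates(commits: Iterable[Commit], import_prefix: str, top_n: int) -> list[str]:
    """Rank authors by the number of commits that add an import of the library."""
    counts: Counter[str] = Counter()
    for commit in commits:
        # A commit counts once if any added line imports from the library package.
        if any(line.strip().startswith(f"import {import_prefix}")
               for line in commit.added_lines):
            counts[commit.author] += 1
    return [author for author, _ in counts.most_common(top_n)]


def precision(candidates: list[str], confirmed: set[str]) -> float:
    """Fraction of selected candidates confirmed as experts (e.g., by a survey)."""
    if not candidates:
        return 0.0
    return sum(1 for c in candidates if c in confirmed) / len(candidates)


# Toy data, invented for illustration only.
commits = [
    Commit("alice", ["import org.mockito.Mockito;"]),
    Commit("alice", ["import org.mockito.ArgumentCaptor;", "int x = 1;"]),
    Commit("bob", ["import org.mockito.Mockito;"]),
    Commit("carol", ["import java.util.List;"]),
]

candidates = rank_candidates(commits, "org.mockito", top_n=2)
survey_confirmed = {"alice"}  # hypothetical survey outcome
score = precision(candidates, survey_confirmed)
```

Counting import-adding commits is a deliberately crude proxy; a real strategy would likely combine several activity signals, and the survey step is what turns the ranked list into a precision figure.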