Mining Experts from Source Code Analysis: An Empirical Evaluation

J. Softw. Eng. Res. Dev. Pub Date : 2021-02-08 DOI:10.5753/jserd.2021.548

Johnatan Oliveira, Markos Viggiato, Denis Pinheiro, E. Morales


Citations: 1

Abstract

Modern software development increasingly depends on third-party libraries to boost productivity and quality. Such development is complex and requires specialists with knowledge of several technologies, including current libraries. This complexity makes it extremely challenging to deliver quality software under pressure. It is therefore necessary to identify and hire qualified developers to build a good team, in both open source and proprietary systems. For these reasons, enterprise and open source projects try to build teams composed of developers who are highly skilled in specific libraries. However, identifying such developers may not be trivial, and we still lack procedures to assess developers’ skills in widely popular libraries. In this paper, we first argue that source code activities can identify software developers’ hard skills, such as library expertise. We then evaluate a mining-based strategy that reduces the search space for identifying library experts. To this end, we selected the 9 most popular Java libraries and 6 libraries for microservices (i.e., 15 libraries in total). We assessed the skills of more than 1.5 million developers in these libraries by analyzing their commits in more than 17K Java projects on GitHub. We evaluated the results through two surveys with 158 developers in total. First, in a survey of 137 library expert candidates, we observed 63% precision for the strategy applied to the popular Java libraries. Second, we observed a precision of at least 71% for 21 library experts in microservices. These low precision values suggest room for further improvement in the evaluated strategy.
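As a minimal illustration of how a precision figure like those reported above could be computed, the sketch below assumes that precision means the fraction of surveyed candidates who confirmed their own expertise. The abstract does not spell out the exact definition, and the function name and the survey responses here are hypothetical, not data from the study.

```python
from typing import Iterable


def precision(confirmed_flags: Iterable[bool]) -> float:
    """Fraction of surveyed candidates who confirmed they are experts.

    Assumption: each flag is one survey answer, True when the candidate
    selected by the mining strategy confirmed expertise in the library.
    """
    flags = list(confirmed_flags)
    if not flags:
        raise ValueError("no survey responses to evaluate")
    return sum(flags) / len(flags)


# Hypothetical responses from 8 candidates (not data from the paper):
# 5 confirm their expertise, 3 do not.
responses = [True, True, False, True, False, True, True, False]
print(f"precision = {precision(responses):.1%}")  # prints "precision = 62.5%"
```

Under this reading, a higher value means that fewer of the candidates surfaced by the mining strategy turn out not to be experts when asked.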
