Two sides of the same coin: A study on developers' perception of defects

IF 1.7 Zone 4 Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Journal of Software-Evolution and Process Pub Date : 2024-06-18 DOI:10.1002/smr.2699

Geanderson Santos, Igor Muzetti, Eduardo Figueiredo

Journal of Software-Evolution and Process, Volume 36, Issue 10. https://onlinelibrary.wiley.com/doi/10.1002/smr.2699

Citations: 0

Abstract

Software defect prediction is a field of study at the intersection of software engineering and machine learning. The literature has proposed numerous machine learning models to predict software defects from software data, such as commits and code metrics. More recent work employs explainability techniques to understand why machine learning models make such predictions (i.e., predicting the likelihood of a defect). As a result, developers are expected to reason about the software features that may relate to defects in the source code. However, little is known about developers' perception of these machine learning models and their explanations. To explore this issue, we surveyed experienced developers to understand how they evaluate each quality attribute for defect prediction. We selected developers based on their contributions on GitHub: each had contributed to at least 10 repositories in the past 2 years. The results show that developers tend to rate code complexity as the most important quality attribute for avoiding defects, compared with the other target attributes such as source code size, coupling, and documentation. Finally, a thematic analysis reveals that developers consider testing the code a relevant aspect not covered by the static software features. We conclude that, qualitatively, there is a misalignment between developers' perceptions and the outputs of machine learning models. For instance, while machine learning models assign high importance to documentation, developers often overlook documentation and prioritize assessing the complexity of the code instead.




Source journal: Journal of Software-Evolution and Process (COMPUTER SCIENCE, SOFTWARE ENGINEERING)

Self-citation rate: 10.00%

Articles published: 109