{"title":"Studying the impact of risk assessment analytics on risk awareness and code review performance","authors":"","doi":"10.1007/s10664-024-10443-x","DOIUrl":null,"url":null,"abstract":"<h3>Abstract</h3> <p>While code review is a critical component of modern software quality assurance, defects can still slip through the review process undetected. Previous research suggests that the main reason for this is a lack of reviewer awareness about the likelihood of defects in proposed changes; even experienced developers may struggle to evaluate the potential risks. If a change’s riskiness is underestimated, it may not receive adequate attention during review, potentially leading to defects being introduced into the codebase. In this paper, we investigate how risk assessment analytics can influence the level of awareness among developers regarding the potential risks associated with code changes; we also study how effective and efficient reviewers are at detecting defects during code review with the use of such analytics. We conduct a controlled experiment using <span>Gherald</span>, a risk assessment prototype tool that analyzes the riskiness of change sets based on historical data. Following a between-subjects experimental design, we assign participants to the treatment (i.e., with access to <span>Gherald</span>) or control group. All participants are asked to perform risk assessment and code review tasks. Through our experiment with 48 participants, we find that the use of <span>Gherald</span> is associated with statistically significant improvements (one-tailed, unpaired Mann-Whitney U test, <span> <span>\\(\\alpha \\)</span> </span> = 0.05) in developer awareness of riskiness of code changes and code review effectiveness. Moreover, participants in the treatment group tend to identify the known defects more quickly than those in the control group; however, the difference between the two groups is not statistically significant. Our results lead us to conclude that the adoption of a risk assessment tool has a positive impact on code review practices, which provides valuable insights for practitioners seeking to enhance their code review process and highlights the importance for further research to explore more effective and practical risk assessment approaches.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"33 1","pages":""},"PeriodicalIF":3.5000,"publicationDate":"2024-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Empirical Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10664-024-10443-x","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Citations: 0
Abstract
While code review is a critical component of modern software quality assurance, defects can still slip through the review process undetected. Previous research suggests that the main reason for this is a lack of reviewer awareness of the likelihood of defects in proposed changes; even experienced developers may struggle to evaluate the potential risks. If a change's riskiness is underestimated, it may not receive adequate attention during review, potentially allowing defects to be introduced into the codebase. In this paper, we investigate how risk assessment analytics influence developers' awareness of the potential risks associated with code changes; we also study how effective and efficient reviewers are at detecting defects during code review when using such analytics. We conduct a controlled experiment using Gherald, a prototype risk assessment tool that analyzes the riskiness of change sets based on historical data. Following a between-subjects experimental design, we assign participants to either the treatment group (with access to Gherald) or the control group. All participants are asked to perform risk assessment and code review tasks. Through our experiment with 48 participants, we find that the use of Gherald is associated with statistically significant improvements (one-tailed, unpaired Mann-Whitney U test, α = 0.05) in developer awareness of the riskiness of code changes and in code review effectiveness. Moreover, participants in the treatment group tend to identify the known defects more quickly than those in the control group, although this difference is not statistically significant. We conclude that adopting a risk assessment tool has a positive impact on code review practices. These findings offer valuable insights for practitioners seeking to enhance their code review process and highlight the need for further research into more effective and practical risk assessment approaches.
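To make the reported analysis concrete, here is a minimal Python sketch of such a comparison using scipy.stats.mannwhitneyu. The group scores below are hypothetical placeholders rather than the study's data, and defect counts are only one plausible measure of review effectiveness.

# Minimal sketch of the statistical comparison reported above: a one-tailed,
# unpaired Mann-Whitney U test at alpha = 0.05. All scores are hypothetical
# placeholders, not results from the paper.
from scipy.stats import mannwhitneyu

# Hypothetical per-participant review-effectiveness scores,
# e.g., the number of known defects each reviewer found.
treatment = [5, 4, 6, 5, 3, 5, 4, 6]  # reviewers with access to Gherald
control = [3, 2, 4, 3, 2, 4, 3, 2]    # reviewers without Gherald

ALPHA = 0.05
# alternative="greater" tests whether treatment scores tend to be
# larger than control scores (one-tailed, unpaired samples).
stat, p_value = mannwhitneyu(treatment, control, alternative="greater")

print(f"U = {stat}, p = {p_value:.4f}")
if p_value < ALPHA:
    print("Significant at alpha = 0.05: treatment scores are higher.")
else:
    print("Not significant at alpha = 0.05.")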
Journal description:
Empirical Software Engineering provides a forum for applied software engineering research with a strong empirical component, and a venue for publishing empirical results relevant to both researchers and practitioners. Empirical studies presented here usually involve the collection and analysis of data and experience that can be used to characterize, evaluate and reveal relationships between software development deliverables, practices, and technologies. Over time, it is expected that such empirical results will form a body of knowledge leading to widely accepted and well-formed theories.
The journal also offers industrial experience reports detailing the application of software technologies - processes, methods, or tools - and their effectiveness in industrial settings.
Empirical Software Engineering promotes the publication of industry-relevant research to address the significant gap between research and practice.