Prioritizing Test Gaps by Risk in Industrial Practice: An Automated Approach and Multimethod Study

IF 6.5 · CAS Tier 1, Computer Science · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING
Roman Haas;Michael Sailer;Mitchell Joblin;Elmar Juergens;Sven Apel
DOI: 10.1109/TSE.2025.3556248
Journal: IEEE Transactions on Software Engineering, vol. 51, no. 5, pp. 1554–1568
Published: 2025-03-28 (Journal Article)
Publisher link: https://ieeexplore.ieee.org/document/10945563/
Open access: no
Citations: 0

Abstract

Context. Untested code changes, called test gaps, pose a significant risk to software projects. Since test gaps increase the probability of defects, managing test gaps and their individual risk is important, especially for rapidly changing software systems. Objective. This study aims to gain an understanding of test gaps in industrial practice and to establish criteria for precisely prioritizing test gaps by their risk, informing practitioners who need to manage, review, and act on larger sets of test gaps. Method. We propose an automated approach for prioritizing test gaps based on key risk criteria. By analyzing 31 historical test gap reviews from 8 industrial software systems of our industrial partners Munich Re and LV 1871, and by conducting semi-structured interviews with the 6 quality engineers who authored the historical test gap reviews, we validate the transferability of the identified risk criteria, such as code criticality and complexity metrics. Results. Our automated approach exhibits ranking performance equivalent to expert assessments, in that test gaps labelled as risky in historical test gap reviews are prioritized correctly, on average, at the 30th percentile. In some scenarios, our automated ranking even outpaces expert assessments, especially for test gaps in central code, a code property that is opaque to non-developers. Conclusion. This research underscores the industrial need for test gap risk estimation techniques to assist test management and quality assurance teams in identifying and addressing critical test gaps. Our multimethod study shows that even a lightweight prioritization approach helps practitioners identify high-risk test gaps efficiently and filter out low-risk ones.
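The abstract describes ranking test gaps by risk criteria such as code criticality, complexity, and centrality. A minimal sketch of what such a lightweight prioritization could look like is given below; the field names, the 0–1 normalization, and the weights are illustrative assumptions, not the paper's actual model.

```python
from dataclasses import dataclass

@dataclass
class TestGap:
    """An untested code change (hypothetical fields for illustration)."""
    name: str
    criticality: float  # 0..1, how business-critical the changed code is
    complexity: float   # 0..1, e.g. normalized cyclomatic complexity
    centrality: float   # 0..1, how central the code is in the dependency graph

def risk_score(gap: TestGap, weights=(0.4, 0.3, 0.3)) -> float:
    """Weighted sum of the risk criteria; the weights are made up for this sketch."""
    w_crit, w_cplx, w_cent = weights
    return w_crit * gap.criticality + w_cplx * gap.complexity + w_cent * gap.centrality

def prioritize(gaps):
    """Return test gaps sorted from highest to lowest estimated risk."""
    return sorted(gaps, key=risk_score, reverse=True)

gaps = [
    TestGap("PaymentService.charge", 0.9, 0.7, 0.8),
    TestGap("ReportFormatter.pad",   0.2, 0.3, 0.1),
    TestGap("AuthMiddleware.check",  0.8, 0.5, 0.9),
]
for g in prioritize(gaps):
    print(f"{g.name}: {risk_score(g):.2f}")
```

Reviewers would then work through the ranked list top-down and could cut it off once scores drop below a tolerated risk level, which matches the paper's goal of filtering out low-risk test gaps.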
Source Journal

IEEE Transactions on Software Engineering (Engineering Technology – Engineering: Electrical & Electronic)
CiteScore: 9.70
Self-citation rate: 10.80%
Articles per year: 724
Review time: 6 months
Journal description: IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:
a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.