Reliability of systematic literature reviews on test-driven development

IF 4.3 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information and Software Technology Pub Date : 2025-05-02 DOI:10.1016/j.infsof.2025.107762

Fernando Uyaguari, Silvia T. Acuña, John W. Castro, Oscar Dieste, Natalia Juristo

Information and Software Technology, Volume 184, Article 107762. Full text: https://www.sciencedirect.com/science/article/pii/S0950584925001016

Citations: 0

Abstract


Reliability of systematic literature reviews on test-driven development

Context

Test-driven development (TDD) is a software development technique studied empirically over the last few decades. There are several systematic literature reviews (SLRs) on TDD. The reliability of these studies should not be taken for granted because SLRs are highly dependent on the context and researcher decision-making.

Objective

This study determines, analyses and synthesizes the limited overlap between SLRs on TDD and its influence on the conclusions and results with respect to the code quality and developer productivity response variables.

Method

A tertiary study was conducted to source SLRs on TDD from the scientific literature, and the primary studies referenced in each SLR were analysed. We compared SLRs with similar objectives, SLRs with similar response variables, and all SLRs. We analysed the differences between the selected primary studies and their impact on the conclusions and results.

Results

The overlap between SLRs with similar response variables (54 %) is greater than between SLRs with similar objectives (36 %). Only three per cent of the primary studies are included in all eight analysed SLRs. Conclusions regarding external quality and productivity may vary across the SLRs on TDD. While we found that SLR results are similar, these results may differ when authors classify primary studies by experiments and case studies.
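The overlap figures above can be illustrated with a minimal sketch. The abstract does not define how overlap was measured, so the two measures below (the share of all distinct primary studies that every review includes, and the pairwise Jaccard index) are assumptions chosen for illustration, not the authors' method. The review names and study IDs are invented toy data.

```python
from typing import Dict, Set


def shared_fraction(reviews: Dict[str, Set[str]]) -> float:
    """Fraction of all distinct primary studies that every review includes.

    Assumed measure: |intersection of all sets| / |union of all sets|.
    """
    if not reviews:
        return 0.0
    study_sets = list(reviews.values())
    union = set().union(*study_sets)
    if not union:
        return 0.0
    common = set.intersection(*study_sets)
    return len(common) / len(union)


def pairwise_jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two reviews' primary-study sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy data: names and IDs are invented, not taken from the paper.
toy_reviews = {
    "SLR-A": {"P1", "P2", "P3", "P4"},
    "SLR-B": {"P2", "P3", "P5"},
    "SLR-C": {"P3", "P4", "P5", "P6"},
}

print(f"Included in all reviews: {shared_fraction(toy_reviews):.0%}")
print(f"Jaccard(SLR-A, SLR-B): "
      f"{pairwise_jaccard(toy_reviews['SLR-A'], toy_reviews['SLR-B']):.0%}")
```

In this toy data only P3 appears in all three reviews, so the shared fraction is one study out of six. A low value of this kind is what the "three per cent" figure describes for the eight analysed SLRs, under whatever definition the paper actually uses.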

Conclusion

SLRs with similar response variables tend to be more repeatable than SLRs with similar objectives and SLRs addressing the same topic. The SLR authors’ criteria with respect to the consistency of evidence may influence the conclusions of SLRs on TDD. The results of SLRs where all primary studies count equally appear to be consistent. The SLR authors’ criteria for selecting primary studies may influence the results classified by case studies and experiments.


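Source journal: Information and Software Technology (Engineering & Technology, Computer Science: Software Engineering)

CiteScore: 9.10

Self-citation rate: 7.70%

Articles published: 164

Review time: 9.6 weeks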

About the journal: Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:

• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development

Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "negative" results and much more. Read the Guide for authors for more information.

The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.