Fernando Uyaguari, Silvia T. Acuña, John W. Castro, Oscar Dieste, Natalia Juristo
Information and Software Technology, vol. 184, Article 107762. Published 2025-05-02. DOI: 10.1016/j.infsof.2025.107762. https://www.sciencedirect.com/science/article/pii/S0950584925001016
Reliability of systematic literature reviews on test-driven development
Context
Test-driven development (TDD) is a software development technique studied empirically over the last few decades. There are several systematic literature reviews (SLRs) on TDD. The reliability of these studies should not be taken for granted because SLRs are highly dependent on the context and researcher decision-making.
Objective
This study determines, analyses and synthesizes the limited overlap between SLRs on TDD and its influence on the conclusions and results with respect to the code quality and developer productivity response variables.
Method
A tertiary study was conducted to source SLRs on TDD from the scientific literature, and the primary studies referenced in each SLR were analysed. We compared SLRs with similar objectives, SLRs with similar response variables, and all SLRs. We analysed the differences between the selected primary studies and their impact on the conclusions and results.
Results
The overlap between SLRs with similar response variables (54 %) is greater than between SLRs with similar objectives (36 %). Only three per cent of the primary studies are included in all eight analysed SLRs. Conclusions regarding external quality and productivity may vary across the SLRs on TDD. While we found that SLR results are similar, these results may differ when authors classify primary studies by experiments and case studies.
Conclusion
SLRs with similar response variables tend to be more repeatable than SLRs with similar objectives and SLRs addressing the same topic. The SLR authors’ criteria with respect to the consistency of evidence may influence the conclusions of SLRs on TDD. The results of SLRs where all primary studies count equally appear to be consistent. The SLR authors’ criteria for selecting primary studies may influence the results classified by case studies and experiments.
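The overlap figures reported above come from comparing the sets of primary studies that each SLR includes. The abstract does not state the exact overlap formula, so the following is only a minimal sketch under stated assumptions: it takes overlap to be the share of distinct primary studies that every SLR in a group includes, plus a pairwise Jaccard index. The function names, SLR labels and study identifiers are hypothetical toy values, not data from the paper.

```python
from itertools import combinations


def overlap_all(slrs):
    """Share of distinct primary studies that every SLR in the group includes.

    Assumption: overlap = |intersection| / |union|; the paper may define it differently.
    """
    sets = [set(studies) for studies in slrs.values()]
    if not sets:
        return 0.0
    union = set().union(*sets)
    common = set.intersection(*sets)
    return len(common) / len(union) if union else 0.0


def pairwise_overlap(slrs):
    """Jaccard index for each pair of SLRs, keyed by the sorted pair of names."""
    result = {}
    for a, b in combinations(sorted(slrs), 2):
        union = set(slrs[a]) | set(slrs[b])
        if union:
            result[(a, b)] = len(set(slrs[a]) & set(slrs[b])) / len(union)
    return result


# Hypothetical toy data: three SLRs and the primary studies each one cites.
slrs = {
    "SLR-A": {"P1", "P2", "P3", "P4"},
    "SLR-B": {"P2", "P3", "P5"},
    "SLR-C": {"P3", "P4", "P5", "P6"},
}

print(f"included in all SLRs: {overlap_all(slrs):.0%}")
for pair, value in pairwise_overlap(slrs).items():
    print(pair, f"{value:.0%}")
```

In this toy data only one of the six distinct studies appears in all three SLRs, which illustrates how the share common to a whole group can be much lower than the overlap between any two of its members.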
About the journal:
Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development
Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "Negative" results and much more. Read the Guide for authors for more information.
The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.