TestLoter: A logic-driven framework for automated unit test generation and error repair using large language models

Impact factor: 1.8 · Zone 3 (Computer Science) · JCR Q3: COMPUTER SCIENCE, SOFTWARE ENGINEERING

Journal of Computer Languages · Pub Date: 2025-07-23 · DOI: 10.1016/j.cola.2025.101348

Ruofan Yang, Xianghua Xu, Ran Wang

Journal of Computer Languages, volume 84, Article 101348. Publisher page: https://www.sciencedirect.com/science/article/pii/S2590118425000346

Citations: 0

Abstract

Automated unit test generation is a critical technique for improving software quality and development efficiency. However, traditional methods often produce test cases with poor business consistency, while large language model (LLM) based approaches face two major challenges: a high error rate in the generated tests and insufficient code coverage. To address these issues, this paper proposes TestLoter, a logic-driven test generation framework. TestLoter makes two core contributions. First, by integrating the structured analysis capabilities of white-box testing with the functional validation characteristics of black-box testing, the authors design a logic-driven test generation chain-of-thought that enables deep semantic analysis of code. Second, they establish a hierarchical repair mechanism that systematically corrects errors in generated test cases, significantly enhancing the correctness of the test code. Experimental results on nine open-source projects covering various domains, such as data processing and utility libraries, show that TestLoter achieves 83.6% line coverage and 78% branch coverage. The approach outperforms both LLM-based methods and traditional search-based software testing techniques in terms of coverage, while also reducing the number of errors in the generated unit test code.
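The line and branch coverage figures quoted in the abstract are ratios of covered items to total items. The sketch below only illustrates that arithmetic. The `CoverageCounts` structure, the function names, and the counts are all hypothetical and are not taken from the paper or from TestLoter's implementation.

```python
from dataclasses import dataclass


@dataclass
class CoverageCounts:
    """Raw counts as a coverage tool might report them (hypothetical structure)."""
    lines_covered: int
    lines_total: int
    branches_covered: int
    branches_total: int


def percent(covered: int, total: int) -> float:
    """Return covered/total as a percentage; an empty denominator counts as 0%."""
    if total == 0:
        return 0.0
    return 100.0 * covered / total


def summarize(counts: CoverageCounts) -> dict[str, float]:
    """Compute line and branch coverage percentages, rounded to one decimal."""
    return {
        "line": round(percent(counts.lines_covered, counts.lines_total), 1),
        "branch": round(percent(counts.branches_covered, counts.branches_total), 1),
    }


# Invented counts, chosen only to show the calculation (not the paper's data).
example = CoverageCounts(
    lines_covered=150, lines_total=200,
    branches_covered=30, branches_total=48,
)
print(summarize(example))
```

Branch coverage is usually the stricter of the two metrics, because a line can execute without every outcome of its conditions being exercised.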


Source journal

Journal of Computer Languages (Computer Science: Computer Networks and Communications)

CiteScore

5.00

Self-citation rate

13.60%

Publication count