NexuSym: Marrying symbolic path finders with large language models

IF 3.1 · Zone 2, Computer Science · Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Automated Software Engineering Pub Date : 2025-06-07 DOI:10.1007/s10515-025-00529-1

Jiayi Wang, Ping Yu, Yi Qin, Yanyan Jiang, Yuan Yao, Xiaoxing Ma


Abstract

Symbolic execution is a powerful technique for automated test case generation, ensuring comprehensive coverage of potential scenarios. However, it often struggles with complex, deep paths due to path explosion. Conversely, large language models (LLMs) utilize vast training data to generate test cases that can uncover intricate program behaviors that symbolic execution might miss. Despite their complementary strengths, integrating the systematic nature of symbolic execution with the creative capabilities of LLMs presents a significant challenge. We introduce NexuSym, an innovative tool that integrates symbolic execution with LLMs to facilitate the automatic generation of test cases. To effectively bridge the gap between these two approaches, we have developed a test case reducer, which normalizes the LLM-generated test cases to make them compatible with symbolic execution. Additionally, we propose a search space summarizer, which abstracts and condenses the search space explored by symbolic execution, enabling the LLM to focus on the most promising areas for further exploration. We instantiated NexuSym on KLEE and ChatGPT. Our evaluation of NexuSym involved 99 coreutils programs and 9 large GNU programs. The experimental results demonstrate that NexuSym significantly enhances program test coverage, with improvements of up to 20% in certain cases. Furthermore, we conducted an analysis of the monetary costs associated with using the LLM API, revealing that NexuSym is a highly cost-effective solution.
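The abstract names two bridging components, a search space summarizer and a test case reducer, without giving their interfaces. As a rough illustration only, the toy sketch below shows how such a loop might be wired together. Every function name, the branch labels, the prompt format, and the argument bounds are assumptions made for this example and are not taken from NexuSym, KLEE, or any LLM API. The reducer here simply filters out suggestions that do not fit the assumed bounds, which is a simplification of the normalization the abstract describes.

```python
def summarize_search_space(all_branches, covered_branches, limit=3):
    """Return up to `limit` uncovered branch labels, in sorted order."""
    uncovered = sorted(set(all_branches) - set(covered_branches))
    return uncovered[:limit]


def build_prompt(program, uncovered):
    """Turn the summary into a short LLM prompt (hypothetical format)."""
    targets = ", ".join(uncovered) if uncovered else "none"
    return (
        f"Program: {program}\n"
        f"Branches not yet covered: {targets}\n"
        "Suggest command-line arguments that may reach them."
    )


def reduce_test_case(raw_args, max_args=2, max_len=8):
    """Keep a suggestion only if it fits the assumed input bounds, else None."""
    args = raw_args.split()
    if len(args) > max_args or any(len(a) > max_len for a in args):
        return None
    return args


def stub_llm(prompt):
    """Stand-in for a real LLM call; returns fixed, made-up suggestions."""
    return ["-n 5", "--a-very-long-option value extra"]


if __name__ == "__main__":
    # Hypothetical branch labels; a real tool would derive them from the engine.
    uncovered = summarize_search_space(["b1", "b2", "b3", "b4"], ["b2"], limit=2)
    prompt = build_prompt("toy_head", uncovered)
    # Only suggestions that survive reduction are handed back as seeds.
    seeds = [t for t in map(reduce_test_case, stub_llm(prompt)) if t is not None]
    print(prompt)
    print(seeds)
```

In this sketch the summarizer keeps the prompt short by capping the number of reported branches, and the reducer guards the symbolic engine against inputs larger than it was configured to handle; both choices are illustrative rather than a description of the published design.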




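Source journal

Automated Software Engineering (Engineering & Technology, Computer Science: Software Engineering)

CiteScore

4.80

Self-citation rate

11.80%

Publication volume

Review time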

>12 weeks

About the journal: This journal details research, tutorial papers, surveys and accounts of significant industrial experience in the foundations, techniques, tools and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes. Coverage in Automated Software Engineering examines both automatic systems and collaborative systems as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences and workshops.