Yinghao Su , Dapeng Xiong , Kechang Qian , Yu Wang , Qingyao Zeng
Journal of Systems and Software, Volume 230, Article 112557 (Journal Article, published 2025-07-14)
DOI: 10.1016/j.jss.2025.112557
URL: https://www.sciencedirect.com/science/article/pii/S0164121225002262
FieldsFuzz: Implement efficient fuzzing based on grammar-aware mutation strategy
A comprehensive understanding of the input format consumed by the program under test is essential for generating valid inputs and improving the effectiveness of fuzz testing. Nevertheless, current format-aware fuzzing tools predominantly focus on recognizing the various functional segments of binary input files, and usually overlook the structural details and dependencies inherent in those files. Furthermore, existing format-aware methods that are based on comparison and taint analysis are limited in how accurately they identify file fields and their types. To mitigate these challenges, this article introduces a novel format-aware fuzzing tool, termed FieldsFuzz. First, FieldsFuzz performs byte-level taint analysis on significant seed inputs during program execution to derive the set of instructions that operate on each input byte, thereby identifying the input file's structure and dependencies and constructing a file format tree. During the mutation phase, FieldsFuzz traverses the file format tree to determine field dependencies, performs field- and dependency-based mutations to improve the rate at which effective seeds are generated, and introduces random modifications to the file structure to uncover previously unknown vulnerabilities. An evaluation of FieldsFuzz on twelve programs that consume distinct input formats shows that it surpasses leading fuzzing tools (including AFL, AFL++, WEIZZ, ProFuzzer, and NestFuzz) in format recognition accuracy, code coverage, and the detection of security vulnerabilities.
Journal description:
The Journal of Systems and Software publishes papers covering all aspects of software engineering and related hardware-software-systems issues. All articles should include a validation of the idea presented, e.g. through case studies, experiments, or systematic comparisons with other approaches already in practice. Topics of interest include, but are not limited to:
• Methods and tools for, and empirical studies on, software requirements, design, architecture, verification and validation, maintenance and evolution
• Agile, model-driven, service-oriented, open source and global software development
• Approaches for mobile, multiprocessing, real-time, distributed, cloud-based, dependable and virtualized systems
• Human factors and management concerns of software development
• Data management and big data issues of software systems
• Metrics and evaluation, data mining of software development resources
• Business and economic aspects of software development processes
The journal welcomes state-of-the-art surveys and reports of practical experience for all of these topics.