BaSFuzz: Fuzz testing based on difference analysis for seed bytes
Wenwei Lan, Chen Huang, Tingting Yu, Li Li, Zhanqi Cui
Journal of Systems and Software, Volume 222, Article 112340 (2025). doi:10.1016/j.jss.2025.112340
Citations: 0
Abstract
Coverage-guided Greybox Fuzzing (CGF) is one of the most effective dynamic software testing techniques, focusing on improving code coverage. It automatically generates offspring test cases by mutating existing test cases and analyzing the program's execution, preserving interesting test cases as seeds for subsequent mutation. However, existing CGF tools often neglect the similarity between seeds. Mutating similar seeds can yield a multitude of similar offspring test cases, which then exercise similar code segments of the program under test. This hinders coverage growth and thereby reduces the efficiency of fuzz testing.
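To make the CGF loop described above concrete, here is a minimal sketch of a coverage-guided fuzzing loop. It is an illustration only, not BaSFuzz itself; `run_target` is a hypothetical callback that executes the program under test and returns the set of covered edges, and the single byte-flip mutator stands in for a real tool's full mutation stack.

```python
import random

def mutate(data: bytes) -> bytes:
    """Flip one random byte -- a single, very simple mutation operator."""
    if not data:
        return data
    buf = bytearray(data)
    i = random.randrange(len(buf))
    buf[i] ^= random.randrange(1, 256)    # XOR with non-zero: the byte changes
    return bytes(buf)

def fuzz_loop(seed_queue, run_target, max_iters=100_000):
    """Core CGF loop: mutate a seed, run the target, and keep any input
    that covers new edges as a fresh seed for later mutation."""
    covered = set()                       # edges observed so far
    for _ in range(max_iters):
        parent = random.choice(seed_queue)
        child = mutate(parent)
        edges = run_target(child)         # hypothetical: returns the edge set hit
        if edges - covered:               # "interesting": new coverage reached
            covered |= edges
            seed_queue.append(child)
```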
To address this issue, this paper proposes BaSFuzz, a fuzz testing method based on difference analysis of seed bytes. The method leverages both byte similarity and structure similarity to analyze the differences between seed bytes. It then computes a similarity score for each seed and reorders the seed queue in ascending order of similarity score.
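The abstract does not spell out the exact similarity metrics, so the following sketch only illustrates the queue-reordering idea: `difflib.SequenceMatcher` is an assumed stand-in for the paper's byte-difference analysis, and structure similarity is omitted. Seeds whose mean similarity to the rest of the queue is lowest are scheduled first, so mutation effort goes to the most dissimilar inputs.

```python
from difflib import SequenceMatcher

def byte_similarity(a: bytes, b: bytes) -> float:
    """Byte-level similarity in [0, 1]; SequenceMatcher is only a
    stand-in for the paper's actual byte-difference analysis."""
    return SequenceMatcher(None, a, b).ratio()

def reorder_queue(seed_queue: list[bytes]) -> None:
    """Score each seed by its mean similarity to every other seed,
    then sort the queue ascending so the least-similar seeds come first."""
    def score(seed: bytes) -> float:
        others = [s for s in seed_queue if s is not seed]
        if not others:
            return 0.0
        return sum(byte_similarity(seed, o) for o in others) / len(others)
    seed_queue.sort(key=score)            # ascending similarity score
```

Note the O(n^2) pairwise comparison; a real tool would need a cheaper approximation for large queues, which is presumably part of what the paper's byte-level difference analysis addresses.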
Based on this method, a prototype tool was developed and compared with AFL, AFLFast, MOpt, AFL++-Hier, and HTFuzz on 12 target programs. The experimental results indicate that BaSFuzz achieved 190.14%, 143.9%, 10.93%, 374.85%, and 11.79% more edge coverage than the five tools, respectively. Additionally, BaSFuzz triggered 3.57 times, 1.46 times, 42.38%, 2.85 times, and 33.44% more unique crashes than the five tools, respectively.
About the journal
The Journal of Systems and Software publishes papers covering all aspects of software engineering and related hardware-software-systems issues. All articles should include a validation of the idea presented, e.g. through case studies, experiments, or systematic comparisons with other approaches already in practice. Topics of interest include, but are not limited to:
•Methods and tools for, and empirical studies on, software requirements, design, architecture, verification and validation, maintenance and evolution
•Agile, model-driven, service-oriented, open source and global software development
•Approaches for mobile, multiprocessing, real-time, distributed, cloud-based, dependable and virtualized systems
•Human factors and management concerns of software development
•Data management and big data issues of software systems
•Metrics and evaluation, data mining of software development resources
•Business and economic aspects of software development processes
The journal welcomes state-of-the-art surveys and reports of practical experience for all of these topics.