BaSFuzz: Fuzz testing based on difference analysis for seed bytes
Wenwei Lan, Chen Huang, Tingting Yu, Li Li, Zhanqi Cui
Journal of Systems and Software, Volume 222, Article 112340 (2025). doi:10.1016/j.jss.2025.112340
Citations: 0
Abstract
Coverage-guided Greybox Fuzzing (CGF) is one of the most effective dynamic software testing techniques, focusing on improving code coverage. It automatically generates offspring test cases by mutating existing test cases and analyzing the program's execution, preserving interesting test cases as seeds for subsequent mutation. However, existing CGF tools often neglect the similarity between seeds. Mutating similar seeds can yield a multitude of similar offspring test cases, which then exercise similar code segments of the program under test. This hinders coverage growth and thereby reduces the efficiency of fuzz testing.
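To make the CGF loop described above concrete, here is a minimal sketch of a coverage-guided fuzzing loop. It is an illustration only, not BaSFuzz itself; `run_target` is a hypothetical callback that executes the program under test and returns the set of covered edges, and the single byte-flip mutator stands in for a real tool's full mutation stack.

```python
import random

def mutate(data: bytes) -> bytes:
    """Flip one random byte -- a single, very simple mutation operator."""
    if not data:
        return data
    buf = bytearray(data)
    i = random.randrange(len(buf))
    buf[i] ^= random.randrange(1, 256)    # XOR with non-zero: the byte changes
    return bytes(buf)

def fuzz_loop(seed_queue, run_target, max_iters=100_000):
    """Core CGF loop: mutate a seed, run the target, and keep any input
    that covers new edges as a fresh seed for later mutation."""
    covered = set()                       # edges observed so far
    for _ in range(max_iters):
        parent = random.choice(seed_queue)
        child = mutate(parent)
        edges = run_target(child)         # hypothetical: returns the edge set hit
        if edges - covered:               # "interesting": new coverage reached
            covered |= edges
            seed_queue.append(child)
```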
To address this issue, this paper proposes BaSFuzz, a fuzz testing method based on difference analysis of seed bytes. The method leverages both byte similarity and structure similarity to analyze the differences between seed bytes. It then computes a similarity score for each seed and reorders the seed queue in ascending order of similarity score.
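The abstract does not spell out the exact similarity metrics, so the following sketch only illustrates the queue-reordering idea: `difflib.SequenceMatcher` is an assumed stand-in for the paper's byte-difference analysis, and structure similarity is omitted. Seeds whose mean similarity to the rest of the queue is lowest are scheduled first, so mutation effort goes to the most dissimilar inputs.

```python
from difflib import SequenceMatcher

def byte_similarity(a: bytes, b: bytes) -> float:
    """Byte-level similarity in [0, 1]; SequenceMatcher is only a
    stand-in for the paper's actual byte-difference analysis."""
    return SequenceMatcher(None, a, b).ratio()

def reorder_queue(seed_queue: list[bytes]) -> None:
    """Score each seed by its mean similarity to every other seed,
    then sort the queue ascending so the least-similar seeds come first."""
    def score(seed: bytes) -> float:
        others = [s for s in seed_queue if s is not seed]
        if not others:
            return 0.0
        return sum(byte_similarity(seed, o) for o in others) / len(others)
    seed_queue.sort(key=score)            # ascending similarity score
```

Note the O(n^2) pairwise comparison; a real tool would need a cheaper approximation for large queues, which is presumably part of what the paper's byte-level difference analysis addresses.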
Based on this method, a prototype tool was developed and compared with AFL, AFLFast, MOpt, AFL++-Hier, and HTFuzz on 12 target programs. The experimental results indicate that BaSFuzz achieved 190.14%, 143.9%, 10.93%, 374.85%, and 11.79% more edge coverage than the five tools, respectively. Additionally, BaSFuzz triggered 3.57 times, 1.46 times, 42.38%, 2.85 times, and 33.44% more unique crashes than the five tools, respectively.
About the journal
The Journal of Systems and Software publishes papers covering all aspects of software engineering and related hardware-software-systems issues. All articles should include a validation of the idea presented, e.g. through case studies, experiments, or systematic comparisons with other approaches already in practice. Topics of interest include, but are not limited to:
•Methods and tools for, and empirical studies on, software requirements, design, architecture, verification and validation, maintenance and evolution
•Agile, model-driven, service-oriented, open source and global software development
•Approaches for mobile, multiprocessing, real-time, distributed, cloud-based, dependable and virtualized systems
•Human factors and management concerns of software development
•Data management and big data issues of software systems
•Metrics and evaluation, data mining of software development resources
•Business and economic aspects of software development processes
The journal welcomes state-of-the-art surveys and reports of practical experience for all of these topics.