基于神经程序平滑的模糊化评价与改进

2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE) Pub Date : 2022-05-01 DOI:10.1145/3510003.3510089

Mingyuan Wu, Lingixao Jiang, Jiahong Xiang, Yuqun Zhang, Guowei Yang, Huixin Ma, Sen Nie, Shi Wu, Heming Cui, Lingming Zhang

{"title":"基于神经程序平滑的模糊化评价与改进","authors":"Mingyuan Wu, Lingixao Jiang, Jiahong Xiang, Yuqun Zhang, Guowei Yang, Huixin Ma, Sen Nie, Shi Wu, Heming Cui, Lingming Zhang","doi":"10.1145/3510003.3510089","DOIUrl":null,"url":null,"abstract":"Fuzzing nowadays has been commonly modeled as an optimization problem, e.g., maximizing code coverage under a given time budget via typical search-based solutions such as evolutionary algorithms. However, such solutions are widely argued to cause inefficient computing resource usage, i.e., inefficient mutations. To address this issue, two neural program-smoothing-based fuzzers, Neuzz and MTFuzz, have been recently proposed to approximate program branching behaviors via neural network models, which input byte sequences of a seed and output vectors representing program branching behaviors. Moreover, assuming that mutating the bytes with larger gradients can better explore branching behaviors, they develop strategies to mutate such bytes for generating new seeds as test cases. Meanwhile, although they have been shown to be effective in the original papers, they were only evaluated upon a limited dataset. In addition, it is still unclear how their key technical components and whether other factors can impact fuzzing performance. To further investigate neural program-smoothing-based fuzzing, we first construct a large-scale benchmark suite with a total of 28 popular open-source projects. Then, we extensively evaluate Neuzz and MTFuzz on such benchmarks. The evaluation results suggest that their edge coverage performance can be unstable. Moreover, neither neural network models nor mutation strategies can be consistently effective, and the power of their gradient-guidance mechanisms have been compromised. Inspired by such findings, we propose a simplistic technique, PreFuzz, which improves neural program-smoothing-based fuzzers with a resource-efficient edge selection mechanism to enhance their gradient guidance and a probabilistic byte selection mechanism to further boost mutation effectiveness. Our evaluation results indicate that PreFuzz can significantly increase the edge coverage of Neuzz/MTFuzz, and also reveal multiple practical guidelines to advance future research on neural program-smoothing-based fuzzing.","PeriodicalId":202896,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"Evaluating and Improving Neural Program-Smoothing-based Fuzzing\",\"authors\":\"Mingyuan Wu, Lingixao Jiang, Jiahong Xiang, Yuqun Zhang, Guowei Yang, Huixin Ma, Sen Nie, Shi Wu, Heming Cui, Lingming Zhang\",\"doi\":\"10.1145/3510003.3510089\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Fuzzing nowadays has been commonly modeled as an optimization problem, e.g., maximizing code coverage under a given time budget via typical search-based solutions such as evolutionary algorithms. However, such solutions are widely argued to cause inefficient computing resource usage, i.e., inefficient mutations. To address this issue, two neural program-smoothing-based fuzzers, Neuzz and MTFuzz, have been recently proposed to approximate program branching behaviors via neural network models, which input byte sequences of a seed and output vectors representing program branching behaviors. Moreover, assuming that mutating the bytes with larger gradients can better explore branching behaviors, they develop strategies to mutate such bytes for generating new seeds as test cases. Meanwhile, although they have been shown to be effective in the original papers, they were only evaluated upon a limited dataset. In addition, it is still unclear how their key technical components and whether other factors can impact fuzzing performance. To further investigate neural program-smoothing-based fuzzing, we first construct a large-scale benchmark suite with a total of 28 popular open-source projects. Then, we extensively evaluate Neuzz and MTFuzz on such benchmarks. The evaluation results suggest that their edge coverage performance can be unstable. Moreover, neither neural network models nor mutation strategies can be consistently effective, and the power of their gradient-guidance mechanisms have been compromised. Inspired by such findings, we propose a simplistic technique, PreFuzz, which improves neural program-smoothing-based fuzzers with a resource-efficient edge selection mechanism to enhance their gradient guidance and a probabilistic byte selection mechanism to further boost mutation effectiveness. Our evaluation results indicate that PreFuzz can significantly increase the edge coverage of Neuzz/MTFuzz, and also reveal multiple practical guidelines to advance future research on neural program-smoothing-based fuzzing.\",\"PeriodicalId\":202896,\"journal\":{\"name\":\"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)\",\"volume\":\"46 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3510003.3510089\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3510003.3510089","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 15

摘要

如今，模糊测试通常被建模为一个优化问题，例如，在给定的时间预算下，通过典型的基于搜索的解决方案(如进化算法)最大化代码覆盖率。然而，人们普遍认为这种解决方案会导致低效的计算资源使用，即低效的突变。为了解决这个问题，最近提出了两个基于神经程序平滑的模糊器，Neuzz和MTFuzz，通过神经网络模型来近似程序分支行为，这些模型输入种子的字节序列和输出代表程序分支行为的向量。此外，假设改变具有较大梯度的字节可以更好地探索分支行为，他们开发了策略来改变这些字节以产生新的种子作为测试用例。同时，尽管它们在原始论文中被证明是有效的，但它们仅在有限的数据集上进行了评估。此外，目前尚不清楚它们的关键技术组件如何以及其他因素是否会影响模糊测试性能。为了进一步研究基于神经程序平滑的模糊，我们首先构建了一个包含28个流行开源项目的大规模基准测试套件。然后，我们在这样的基准上广泛评估Neuzz和MTFuzz。评估结果表明，它们的边缘覆盖性能可能不稳定。此外，无论是神经网络模型还是突变策略都不能始终有效，并且它们的梯度引导机制的能力已经受到损害。受这些发现的启发，我们提出了一种简单的技术PreFuzz，它改进了基于神经程序平滑的模糊器，使用资源高效的边缘选择机制来增强其梯度引导和概率字节选择机制来进一步提高突变有效性。我们的评估结果表明，PreFuzz可以显著增加Neuzz/MTFuzz的边缘覆盖率，并为未来基于神经程序平滑的模糊研究提供了多种实用指南。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Evaluating and Improving Neural Program-Smoothing-based Fuzzing

Fuzzing nowadays has been commonly modeled as an optimization problem, e.g., maximizing code coverage under a given time budget via typical search-based solutions such as evolutionary algorithms. However, such solutions are widely argued to cause inefficient computing resource usage, i.e., inefficient mutations. To address this issue, two neural program-smoothing-based fuzzers, Neuzz and MTFuzz, have been recently proposed to approximate program branching behaviors via neural network models, which input byte sequences of a seed and output vectors representing program branching behaviors. Moreover, assuming that mutating the bytes with larger gradients can better explore branching behaviors, they develop strategies to mutate such bytes for generating new seeds as test cases. Meanwhile, although they have been shown to be effective in the original papers, they were only evaluated upon a limited dataset. In addition, it is still unclear how their key technical components and whether other factors can impact fuzzing performance. To further investigate neural program-smoothing-based fuzzing, we first construct a large-scale benchmark suite with a total of 28 popular open-source projects. Then, we extensively evaluate Neuzz and MTFuzz on such benchmarks. The evaluation results suggest that their edge coverage performance can be unstable. Moreover, neither neural network models nor mutation strategies can be consistently effective, and the power of their gradient-guidance mechanisms have been compromised. Inspired by such findings, we propose a simplistic technique, PreFuzz, which improves neural program-smoothing-based fuzzers with a resource-efficient edge selection mechanism to enhance their gradient guidance and a probabilistic byte selection mechanism to further boost mutation effectiveness. Our evaluation results indicate that PreFuzz can significantly increase the edge coverage of Neuzz/MTFuzz, and also reveal multiple practical guidelines to advance future research on neural program-smoothing-based fuzzing.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)

自引率

0.00%

发文量