Reachable Coverage: Estimating Saturation in Fuzzing

2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE). Pub Date: 2023-05-01. DOI: 10.1109/ICSE48619.2023.00042

D. Liyanage, Marcel Böhme, C. Tantithamthavorn, Stephan Lipp


Citations: 3

Abstract

Reachable coverage is the number of code elements in the search space of a fuzzer (i.e., an automatic software testing tool). A fuzzer cannot find bugs in code that is unreachable. Hence, reachable coverage quantifies fuzzer effectiveness. Using static program analysis, we can compute an upper bound on the number of reachable coverage elements, e.g., by extracting the call graph. However, we cannot decide whether a coverage element is reachable in general. If we could precisely determine reachable coverage efficiently, we would have solved the software verification problem. Unfortunately, we cannot approach a given degree of accuracy for the static approximation, either. In this paper, we advocate a statistical perspective on the approximation of the number of elements in the fuzzer's search space, where accuracy does improve as a function of the analysis runtime. In applied statistics, corresponding estimators have been developed and well established for more than a quarter century. These estimators hold an exciting promise to finally tackle the long-standing challenge of counting reachability. In this paper, we explore the utility of these estimators in the context of fuzzing. Estimates of reachable coverage can be used to measure (a) the amount of untested code, (b) the effectiveness of the testing technique, and (c) the completeness of the ongoing fuzzing campaign (w.r.t. the asymptotic max. achievable coverage). We make all data and our analysis publicly available.
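To make the statistical perspective concrete, here is a minimal sketch of one classical estimator of this kind, the bias-corrected Chao1 estimator from the species-richness literature. The abstract does not name the estimators the authors evaluate, so the choice of Chao1 is an assumption made for illustration. The names `chao1_estimate`, `saturation`, and the `hits` data are also hypothetical. The sketch treats each coverage element as a "species" and counts how many generated inputs exercised it. Elements hit exactly once (`f1`) or exactly twice (`f2`) carry the information about how many elements have not been seen yet.

```python
from collections import Counter


def chao1_estimate(hit_counts):
    """Bias-corrected Chao1 estimate of the total number of reachable elements.

    hit_counts maps a coverage element id to the number of inputs that hit it.
    This is an illustrative estimator, not necessarily the one used in the paper.
    """
    observed = [c for c in hit_counts.values() if c > 0]
    s_obs = len(observed)          # elements covered so far
    freq = Counter(observed)
    f1 = freq[1]                   # singletons: elements hit by exactly one input
    f2 = freq[2]                   # doubletons: elements hit by exactly two inputs
    # Bias-corrected form; stays defined even when f2 == 0.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def saturation(hit_counts):
    """Fraction of the estimated reachable coverage already achieved."""
    s_obs = sum(1 for c in hit_counts.values() if c > 0)
    estimate = chao1_estimate(hit_counts)
    return s_obs / estimate if estimate > 0 else 1.0


# Hypothetical hit counts for six observed coverage elements.
hits = {"b1": 10, "b2": 1, "b3": 1, "b4": 2, "b5": 1, "b6": 7}
print(chao1_estimate(hits))  # estimated reachable coverage
print(saturation(hits))      # estimated completeness of the campaign
```

Under these assumptions, the difference between the estimate and the observed coverage approximates the amount of untested but reachable code, and the ratio approximates campaign completeness. As the campaign runs longer and singletons become rarer, the estimate moves toward the observed coverage.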



