Fuzz Testing the Compiled Code in R Packages

2021 IEEE 32nd International Symposium on Software Reliability Engineering (ISSRE). Pub Date: 2021-10-01. DOI: 10.1109/ISSRE52982.2021.00040

Akhila Chowdary Kolla, Alex Groce, T. Hocking



Abstract

R packages written in the widely used Rcpp framework are typically tested using expected input/output pairs that are manually coded by package developers. These manually written tests are validated under various CRAN checks, using both static and dynamic analysis. Such manually written tests let subtle bugs go undetected, since they do not anticipate all possible inputs and can miss important code paths. Fuzzers pass random, unexpected, potentially invalid inputs to a function in order to identify bugs missed by manually written tests. This paper presents RcppDeepState, an R package that uses the DeepState framework to provide automatic fuzzing and symbolic execution for R packages written using the Rcpp framework. Using RcppDeepState, a package developer can systematically fuzz test their Rcpp functions without having to manually write any inputs or expected outputs. Randomly generated inputs are passed to each Rcpp function, and Valgrind is used to check for various memory access violations and memory leaks. In our system, a test harness can be used to fuzz test an Rcpp function using different backend fuzzers including afl, libFuzzer, and HonggFuzz. For even more flexibility, R package developers can write their own random generation functions and assertions. We implemented random generation functions for 8 of the most common Rcpp data types, then used these functions to fuzz test 1,185 Rcpp packages. Valgrind reported issues for more than 2,000 functions (across nearly 500 packages) that were not detected using standard CRAN checks on manually specified test/example inputs. For several of these issues, developers confirmed that the problem was reproducible and represented missing or flawed code. These results suggest that RcppDeepState is useful for finding subtle flaws in Rcpp packages.
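The contrast the abstract draws between hand-written input/output tests and random-input fuzzing can be illustrated with a small, self-contained sketch. This is not RcppDeepState or the DeepState API. It is a Python stand-in built on assumptions: `window_mean`, `random_inputs`, and `fuzz` are hypothetical names invented for the illustration, and Python exceptions play the role that Valgrind's memory-error reports play for compiled Rcpp code.

```python
import random


def window_mean(values, start, width):
    """Hypothetical function under test: mean of values[start:start+width].

    Deliberately flawed: it never checks that the window fits inside
    `values`, and it never checks that `width` is positive.
    """
    total = 0.0
    for i in range(start, start + width):
        total += values[i]      # raises IndexError if the window overruns
    return total / width        # raises ZeroDivisionError if width == 0


def random_inputs(rng, max_len=8):
    """Hypothetical random generator for the argument types of window_mean."""
    n = rng.randint(0, max_len)
    values = [rng.uniform(-1e3, 1e3) for _ in range(n)]
    start = rng.randint(0, max_len)
    width = rng.randint(0, max_len)
    return values, start, width


def fuzz(target, generator, trials=1000, seed=0):
    """Call `target` on randomly generated inputs and record every failure.

    Returns a list of (arguments, exception type name) pairs. A fixed seed
    keeps each run reproducible, so a reported failure can be replayed.
    """
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        args = generator(rng)
        try:
            target(*args)
        except Exception as exc:  # stand-in for a memory-checker report
            failures.append((args, type(exc).__name__))
    return failures


if __name__ == "__main__":
    # A typical hand-written test: one valid input, one expected output.
    assert window_mean([1.0, 2.0, 3.0], 0, 2) == 1.5

    # The fuzz loop explores inputs the hand-written test never tried.
    found = fuzz(window_mean, random_inputs)
    kinds = sorted({kind for _, kind in found})
    print(f"{len(found)} failing inputs, failure kinds: {kinds}")
```

The hand-written test passes because it only exercises a valid window, while the random generator also produces empty vectors, zero widths, and windows that run past the end of the data. Supplying a different generator function to `fuzz` loosely mirrors the abstract's point that developers can write their own random generation functions.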


