DocTer: documentation-guided fuzzing for testing deep learning API functions

Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis. Pub Date: 2021-09-02. DOI: 10.1145/3533767.3534220

Danning Xie, Yitong Li, Mijung Kim, H. Pham, Lin Tan, X. Zhang, Michael W. Godfrey


Citations: 33

Abstract

Input constraints are useful for many software development tasks. For example, input constraints of a function enable the generation of valid inputs, i.e., inputs that follow these constraints, to test the function deeper. API functions of deep learning (DL) libraries have DL-specific input constraints, which are described informally in the free-form API documentation. Existing constraint-extraction techniques are ineffective for extracting DL-specific input constraints. To fill this gap, we design and implement a new technique—DocTer—to analyze API documentation to extract DL-specific input constraints for DL API functions. DocTer features a novel algorithm that automatically constructs rules to extract API parameter constraints from syntactic patterns in the form of dependency parse trees of API descriptions. These rules are then applied to a large volume of API documents in popular DL libraries to extract their input parameter constraints. To demonstrate the effectiveness of the extracted constraints, DocTer uses the constraints to enable the automatic generation of valid and invalid inputs to test DL API functions. Our evaluation on three popular DL libraries (TensorFlow, PyTorch, and MXNet) shows that DocTer’s precision in extracting input constraints is 85.4%. DocTer detects 94 bugs from 174 API functions, including one previously unknown security vulnerability that is now documented in the CVE database, while a baseline technique without input constraints detects only 59 bugs. Most (63) of the 94 bugs are previously unknown, 54 of which have been fixed or confirmed by developers after we report them. In addition, DocTer detects 43 inconsistencies in documents, 39 of which are fixed or confirmed.
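The two stages the abstract describes, extracting parameter constraints from a documentation sentence and then generating inputs that follow or violate them, can be illustrated with a small sketch. This is not DocTer's implementation: the regular expressions below are a deliberately simple stand-in for the paper's rules over dependency parse trees, nested Python lists stand in for tensors, and every name (`Constraint`, `extract_constraint`, `generate`, `conforms`) is hypothetical.

```python
import math
import random
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Constraint:
    """Hypothetical constraint record for one API parameter."""
    dtype: Optional[str] = None   # "int" or "float"
    ndim: Optional[int] = None    # required number of dimensions
    low: Optional[float] = None   # inclusive lower bound on every element


def extract_constraint(description: str) -> Constraint:
    """Toy regex rules (an assumption; the paper uses parse-tree rules)."""
    text = description.lower()
    c = Constraint()
    m = re.search(r"\b(int|float)\b", text)
    if m:
        c.dtype = m.group(1)
    m = re.search(r"(\d+)-?d\b", text)      # matches "2-D", "3d", ...
    if m:
        c.ndim = int(m.group(1))
    if "non-negative" in text:
        c.low = 0.0
    return c


def build(shape, leaf):
    """Build a nested list with the given shape, calling leaf() per element."""
    if not shape:
        return leaf()
    return [build(shape[1:], leaf) for _ in range(shape[0])]


def generate(c: Constraint, rng: random.Random, conform: bool = True):
    """Return an input that follows c, or one that breaks one set constraint."""
    dtype = c.dtype or rng.choice(["int", "float"])
    ndim = c.ndim if c.ndim is not None else rng.randint(0, 3)
    low = c.low if c.low is not None else -10.0
    high = low + 10.0
    if not conform:
        set_fields = [f for f in ("dtype", "ndim", "low") if getattr(c, f) is not None]
        if set_fields:                       # nothing to violate otherwise
            broken = rng.choice(set_fields)
            if broken == "dtype":
                dtype = "str"
            elif broken == "ndim":
                ndim += 1
            else:
                low, high = low - 10.0, low - 1.0   # strictly below the bound

    def leaf():
        if dtype == "str":
            return "x"
        if dtype == "int":
            return rng.randint(math.ceil(low), math.floor(high))
        return rng.uniform(low, high)

    shape = [rng.randint(1, 3) for _ in range(ndim)]
    return build(shape, leaf)


def depth(x):
    return 1 + depth(x[0]) if isinstance(x, list) else 0


def leaves(x):
    return [v for item in x for v in leaves(item)] if isinstance(x, list) else [x]


def conforms(value, c: Constraint) -> bool:
    """Check a generated value against every constraint that is set."""
    vals = leaves(value)
    if c.ndim is not None and depth(value) != c.ndim:
        return False
    if c.dtype is not None and any(type(v).__name__ != c.dtype for v in vals):
        return False
    if c.low is not None and any(
        not isinstance(v, (int, float)) or v < c.low for v in vals
    ):
        return False
    return True


if __name__ == "__main__":
    rng = random.Random(0)
    c = extract_constraint("A 2-D float tensor of non-negative values.")
    print(c)
    print("valid:  ", generate(c, rng, conform=True))
    print("invalid:", generate(c, rng, conform=False))
```

In a fuzzing loop of this kind, conforming inputs would exercise code past the API's argument checks, while violating inputs would probe whether those checks reject bad arguments cleanly instead of crashing.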
