Automated Customized Bug-Benchmark Generation

2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM). Pub Date: 2019-01-09. DOI: 10.1109/SCAM.2019.00020

Vineeth Kashyap, Jason Ruchti, Lucja Kot, Emma Turetsky, R. Swords, David Melski, Eric Schulte

Citations: 14

Abstract

We introduce Bug-Injector, a system that automatically creates benchmarks for customized evaluation of static analysis tools. We share a benchmark generated using Bug-Injector and illustrate its efficacy by using it to evaluate the recall of two leading open-source static analysis tools: Clang Static Analyzer and Infer. Bug-Injector works by inserting bugs based on bug templates into real-world host programs. It runs tests on the host program to collect dynamic traces, searches the traces for a point where the state satisfies the preconditions for some bug template, then modifies the host program to "inject" a bug based on that template. Injected bugs are used as test cases in a static analysis tool evaluation benchmark. Every test case is accompanied by a program input that exercises the injected bug. We have identified a broad range of requirements and desiderata for bug benchmarks; our approach generates on-demand test benchmarks that meet these requirements. It also allows us to create customized benchmarks suitable for evaluating tools for a specific use case (e.g., a given codebase and set of bug types). Our experimental evaluation demonstrates the suitability of our generated benchmark for evaluating static bug-detection tools and for comparing the performance of different tools.
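
The injection step described in the abstract (collect traces, find a point whose state satisfies a template's preconditions, then patch the host program) can be illustrated with a toy sketch. This is not Bug-Injector's implementation: the names `TracePoint`, `BugTemplate`, `inject_bug`, the hand-written trace, and the single out-of-bounds-read template are all assumptions made for illustration, and the "host program" is just a list of C-like source lines.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class TracePoint:
    """Hypothetical trace record: state observed after a host line ran."""
    line: int                 # 1-based line number in the host program
    state: Dict[str, object]  # variable name -> observed value
    test_input: str           # the test input that produced this trace


@dataclass
class BugTemplate:
    """Hypothetical template: a precondition on state plus code to insert."""
    name: str
    # Returns a binding of template holes to host variables, or None.
    precondition: Callable[[Dict[str, object]], Optional[Dict[str, str]]]
    render: Callable[[Dict[str, str]], str]


def oob_precondition(state: Dict[str, object]) -> Optional[Dict[str, str]]:
    """Match when some integer in scope is >= the length of some buffer."""
    for buf_name, buf in state.items():
        if not isinstance(buf, list):
            continue
        for idx_name, idx in state.items():
            is_int = isinstance(idx, int) and not isinstance(idx, bool)
            if is_int and idx >= len(buf):
                return {"BUF": buf_name, "IDX": idx_name}
    return None


def oob_render(binding: Dict[str, str]) -> str:
    return (f"    char injected = {binding['BUF']}[{binding['IDX']}];"
            " /* injected out-of-bounds read */")


OOB_READ = BugTemplate("out-of-bounds-read", oob_precondition, oob_render)


def inject_bug(
    host: List[str], trace: List[TracePoint], templates: List[BugTemplate]
) -> Optional[Tuple[List[str], str, int, str]]:
    """Return (patched host, template name, line, exercising input) or None."""
    for point in trace:
        for template in templates:
            binding = template.precondition(point.state)
            if binding is not None:
                patched = list(host)
                # Index point.line places the new code right after that line.
                patched.insert(point.line, template.render(binding))
                return patched, template.name, point.line, point.test_input
    return None


HOST = [
    "void handle(void) {",
    "    char buf[8];",
    "    int n = read_len();",
    "    process(buf, n);",
    "}",
]

# Assumed to come from running one test on an instrumented host program.
TRACE = [
    TracePoint(line=2, state={"buf": [0] * 8}, test_input="len=12"),
    TracePoint(line=3, state={"buf": [0] * 8, "n": 12}, test_input="len=12"),
]

result = inject_bug(HOST, TRACE, [OOB_READ])
if result is not None:
    print("\n".join(result[0]))
```

In this sketch the returned `test_input` plays the role of the program input that accompanies each test case: because the precondition was observed to hold on a real execution, that input is expected to reach and exercise the injected line.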
