An empirical study on the effectiveness of static C code analyzers for vulnerability detection

Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis
Pub Date: 2022-07-18
DOI: 10.1145/3533767.3534380

Stephan Lipp, Sebastian Banescu, A. Pretschner

Citations: 23

Abstract

Static code analysis is often used to scan source code for security vulnerabilities. Given the wide range of existing solutions implementing different analysis techniques, it is very challenging to compare static analysis tools objectively and determine which ones are most effective at detecting vulnerabilities. Existing studies are limited in that (1) they use synthetic datasets, whose vulnerabilities do not reflect the complexity of security bugs found in practice, and/or (2) they do not provide differentiated analyses with respect to the types of vulnerabilities output by the static analyzers. Hence, their conclusions about an analyzer's capability to detect vulnerabilities may not generalize to real-world programs. In this paper, we propose a methodology for automatically evaluating the effectiveness of static code analyzers based on CVE reports. We evaluate five free and open-source static C code analyzers and one commercial analyzer against 27 software projects containing a total of 1.15 million lines of code and 192 vulnerabilities (ground truth). While static C analyzers have been shown to perform well in benchmarks with synthetic bugs, our results indicate that state-of-the-art tools miss between 47% and 80% of the vulnerabilities in a benchmark set of real-world programs. Moreover, our study finds that this false negative rate can be reduced to between 30% and 69% when combining the results of static analyzers, at the cost of 15 percentage points more functions flagged. Many vulnerabilities hence remain undetected, especially those beyond the classical memory-related security bugs.
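The trade-off described in the abstract (fewer missed vulnerabilities when analyzer results are combined, but more functions flagged) can be illustrated with a small sketch. This is not the authors' evaluation pipeline: matching at function granularity is an assumption drawn from the abstract's phrase "functions flagged", and the tool names, function names, and numbers below are hypothetical toy data, not results from the study.

```python
def miss_rate(flagged, vulnerable):
    """Fraction of ground-truth vulnerable functions that were NOT flagged."""
    if not vulnerable:
        raise ValueError("ground truth must not be empty")
    return len(vulnerable - flagged) / len(vulnerable)


def flagged_share(flagged, all_functions):
    """Fraction of all functions in the benchmark that were flagged."""
    if not all_functions:
        raise ValueError("benchmark must contain at least one function")
    return len(flagged) / len(all_functions)


# Hypothetical toy benchmark: 20 functions, 5 of them vulnerable.
all_functions = {f"f{i}" for i in range(20)}
vulnerable = {"f0", "f1", "f2", "f3", "f4"}

# Hypothetical analyzer reports: the set of functions each tool flags.
reports = {
    "tool_a": {"f0", "f1", "f10"},
    "tool_b": {"f1", "f2", "f11", "f12"},
}

# Per-tool miss rate and share of functions flagged.
for name, flagged in sorted(reports.items()):
    print(name, miss_rate(flagged, vulnerable),
          flagged_share(flagged, all_functions))

# Combining tools = taking the union of everything any tool flagged.
combined = set().union(*reports.values())
print("combined", miss_rate(combined, vulnerable),
      flagged_share(combined, all_functions))
```

In this toy data each tool alone misses 3 of the 5 vulnerable functions, while the union misses only 2, but the union also flags 6 of 20 functions instead of 3 or 4. The same direction of effect is what the abstract reports, at a different scale.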

