机器学习引导的选择性不健全静态分析

2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE) Pub Date : 2017-05-20 DOI:10.1109/ICSE.2017.54

K. Heo, Hakjoo Oh, K. Yi

{"title":"机器学习引导的选择性不健全静态分析","authors":"K. Heo, Hakjoo Oh, K. Yi","doi":"10.1109/ICSE.2017.54","DOIUrl":null,"url":null,"abstract":"We present a machine-learning-based technique for selectively applying unsoundness in static analysis. Existing bug-finding static analyzers are unsound in order to be precise and scalable in practice. However, they are uniformly unsound and hence at the risk of missing a large amount of real bugs. By being sound, we can improve the detectability of the analyzer but it often suffers from a large number of false alarms. Our approach aims to strike a balance between these two approaches by selectively allowing unsoundness only when it is likely to reduce false alarms, while retaining true alarms. We use an anomaly-detection technique to learn such harmless unsoundness. We implemented our technique in two static analyzers for full C. One is for a taint analysis for detecting format-string vulnerabilities, and the other is for an interval analysis for buffer-overflow detection. The experimental results show that our approach significantly improves the recall of the original unsound analysis without sacrificing the precision.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"383 1","pages":"519-529"},"PeriodicalIF":0.0000,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"51","resultStr":"{\"title\":\"Machine-Learning-Guided Selectively Unsound Static Analysis\",\"authors\":\"K. Heo, Hakjoo Oh, K. Yi\",\"doi\":\"10.1109/ICSE.2017.54\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present a machine-learning-based technique for selectively applying unsoundness in static analysis. Existing bug-finding static analyzers are unsound in order to be precise and scalable in practice. However, they are uniformly unsound and hence at the risk of missing a large amount of real bugs. By being sound, we can improve the detectability of the analyzer but it often suffers from a large number of false alarms. Our approach aims to strike a balance between these two approaches by selectively allowing unsoundness only when it is likely to reduce false alarms, while retaining true alarms. We use an anomaly-detection technique to learn such harmless unsoundness. We implemented our technique in two static analyzers for full C. One is for a taint analysis for detecting format-string vulnerabilities, and the other is for an interval analysis for buffer-overflow detection. The experimental results show that our approach significantly improves the recall of the original unsound analysis without sacrificing the precision.\",\"PeriodicalId\":6505,\"journal\":{\"name\":\"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)\",\"volume\":\"383 1\",\"pages\":\"519-529\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-05-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"51\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSE.2017.54\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE.2017.54","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 51

摘要

我们提出了一种基于机器学习的技术，用于选择性地在静态分析中应用不稳健性。为了在实践中精确和可扩展，现有的bug查找静态分析器是不健全的。然而，它们都是不可靠的，因此有可能遗漏大量真正的bug。通过合理的检测，可以提高分析仪的可检测性，但往往存在大量的误报。我们的方法旨在在这两种方法之间取得平衡，只有在可能减少假警报的情况下才有选择地允许不健全，同时保留真警报。我们使用异常检测技术来学习这种无害的不健全。我们在两个静态分析器中实现了我们的技术。一个是用于检测格式字符串漏洞的污染分析，另一个是用于缓冲区溢出检测的间隔分析。实验结果表明，该方法在不牺牲精度的前提下，显著提高了原始不健全分析的召回率。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Machine-Learning-Guided Selectively Unsound Static Analysis

We present a machine-learning-based technique for selectively applying unsoundness in static analysis. Existing bug-finding static analyzers are unsound in order to be precise and scalable in practice. However, they are uniformly unsound and hence at the risk of missing a large amount of real bugs. By being sound, we can improve the detectability of the analyzer but it often suffers from a large number of false alarms. Our approach aims to strike a balance between these two approaches by selectively allowing unsoundness only when it is likely to reduce false alarms, while retaining true alarms. We use an anomaly-detection technique to learn such harmless unsoundness. We implemented our technique in two static analyzers for full C. One is for a taint analysis for detecting format-string vulnerabilities, and the other is for an interval analysis for buffer-overflow detection. The experimental results show that our approach significantly improves the recall of the original unsound analysis without sacrificing the precision.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)

自引率

0.00%

发文量