Finding patterns in static analysis alerts: improving actionable alert ranking

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR). Pub Date: 2014-05-31. DOI: 10.1145/2597073.2597100

Quinn Hanam, Lin Tan, Reid Holmes, Patrick Lam


Citations: 68

Abstract

Static analysis (SA) tools that find bugs by inferring programmer beliefs (e.g., FindBugs) are commonplace in today's software industry. While they find a large number of actual defects, they are often plagued by high rates of alerts that a developer would not act on (unactionable alerts) because they are incorrect, do not significantly affect program execution, etc. High rates of unactionable alerts decrease the utility of static analysis tools in practice.

We present a method for differentiating actionable and unactionable alerts by finding alerts with similar code patterns. To do so, we create a feature vector based on code characteristics at the site of each SA alert. With these feature vectors, we use machine learning techniques to build an actionable alert prediction model that is able to classify new SA alerts.

We evaluate our technique on three subject programs using the FindBugs static analysis tool and the Faultbench benchmark methodology. For a developer inspecting the top 5% of all alerts for three sample projects, our approach is able to identify 57 of 211 actionable alerts, which is 38 more than the FindBugs priority measure. Combined with previous actionable alert identification techniques, our method finds 75 actionable alerts in the top 5%, which is four more actionable alerts (a 6% improvement) than previous actionable alert identification techniques.
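The pipeline described in the abstract (feature vector per alert, a learned model, then inspection of the top-ranked alerts) can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the features or the learning algorithm, so the four binary features, the alert names, and the k-nearest-neighbour scorer with cosine similarity are hypothetical stand-ins, not the authors' method.

```python
from math import sqrt

# Hypothetical binary features observed at each alert site (illustrative only):
# [in_catch_block, null_check_nearby, in_test_code, calls_close]

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def actionable_score(vec, labeled, k=3):
    """Fraction of the k most similar labeled alerts that were actionable."""
    nearest = sorted(labeled, key=lambda item: cosine(vec, item[0]), reverse=True)[:k]
    return sum(1 for _, actionable in nearest if actionable) / len(nearest)

def rank_alerts(new_alerts, labeled, k=3):
    """Order new alerts from most to least likely actionable."""
    scored = [(actionable_score(vec, labeled, k), name) for name, vec in new_alerts]
    return [name for _, name in sorted(scored, key=lambda s: -s[0])]

def actionable_in_top(ranked, truth, fraction=0.05):
    """Count actionable alerts a developer sees when inspecting the top fraction."""
    cutoff = max(1, int(len(ranked) * fraction))
    return sum(1 for name in ranked[:cutoff] if truth[name])

# Previously triaged alerts: (feature vector, was the alert actionable?)
labeled = [
    ([1, 0, 0, 1], True),
    ([1, 0, 0, 0], True),
    ([0, 1, 1, 0], False),
    ([0, 0, 1, 0], False),
    ([1, 1, 0, 1], True),
    ([0, 1, 1, 1], False),
]

# New, unlabeled alerts to rank (names are made up).
new_alerts = [("A", [1, 0, 1, 0]), ("B", [1, 0, 0, 1]), ("C", [0, 1, 1, 0])]
truth = {"A": False, "B": True, "C": False}

ranked = rank_alerts(new_alerts, labeled)
hits = actionable_in_top(ranked, truth)
print(ranked, hits)
```

The last helper mirrors the evaluation metric quoted above: how many actionable alerts land in the inspected top fraction. For reference, the abstract's own figures imply that the FindBugs priority ranking surfaces 57 - 38 = 19 actionable alerts in the top 5%, and that the previous techniques alone surface 75 - 4 = 71, so the combined gain is 4/71, roughly 5.6%, which the abstract rounds to 6%.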
