{"title":"On establishing a benchmark for evaluating static analysis alert prioritization and classification techniques","authors":"S. Heckman, L. Williams","doi":"10.1145/1414004.1414013","DOIUrl":null,"url":null,"abstract":"Benchmarks provide an experimental basis for evaluating software engineering processes or techniques in an objective and repeatable manner. We present the FAULTBENCH v0.1 benchmark, as a contribution to current benchmark materials, for evaluation and comparison of techniques that prioritize and classify alerts generated by static analysis tools. Static analysis tools may generate an overwhelming number of alerts, the majority of which are likely to be false positives (FP). Two FP mitigation techniques, alert prioritization and classification, provide an ordering or classification of alerts, identifying those likely to be anomalies. We evaluate FAULTBENCH using three versions of a FP mitigation technique within the AWARE adaptive prioritization model. Individual FAULTBENCH subjects vary in their optimal FP mitigation techniques. Together, FAULTBENCH subjects provide a precise and general evaluation of FP mitigation techniques.","PeriodicalId":124452,"journal":{"name":"International Symposium on Empirical Software Engineering and Measurement","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"110","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Symposium on Empirical Software Engineering and Measurement","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1414004.1414013","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 110
Abstract
Benchmarks provide an experimental basis for evaluating software engineering processes or techniques in an objective and repeatable manner. We present the FAULTBENCH v0.1 benchmark, as a contribution to current benchmark materials, for evaluation and comparison of techniques that prioritize and classify alerts generated by static analysis tools. Static analysis tools may generate an overwhelming number of alerts, the majority of which are likely to be false positives (FP). Two FP mitigation techniques, alert prioritization and classification, provide an ordering or classification of alerts, identifying those likely to be anomalies. We evaluate FAULTBENCH using three versions of a FP mitigation technique within the AWARE adaptive prioritization model. Individual FAULTBENCH subjects vary in their optimal FP mitigation techniques. Together, FAULTBENCH subjects provide a precise and general evaluation of FP mitigation techniques.