Kien-Tuan Ngo, Dinh-Truong Do, Thu-Trang Nguyen, H. Vo
{"title":"使用表示学习对静态分析工具的警告进行排序","authors":"Kien-Tuan Ngo, Dinh-Truong Do, Thu-Trang Nguyen, H. Vo","doi":"10.1109/APSEC53868.2021.00040","DOIUrl":null,"url":null,"abstract":"Static analysis tools are frequently used to detect potential vulnerabilities in software systems. However, an inevitable problem of these tools is their large number of warnings with a high false positive rate, which consumes time and effort for investigating. In this paper, we present DEFP, a novel method for ranking static analysis warnings. Based on the intuition that warnings which have similar contexts tend to have similar labels (true positive or false positive), DEFP is built with two BiLSTM models to capture the patterns associated with the contexts of labeled warnings. After that, for a set of new warnings, DEFP can calculate and rank them according to their likelihoods to be true positives (i.e., actual vulnerabilities). Our experimental results on a dataset of 10 real-world projects show that using DEFP, by investigating only 60% of the warnings, developers can find +90% of actual vulnerabilities. Moreover, DEFP improves the state-of-the-art approach 30% in both Precision and Recall.","PeriodicalId":143800,"journal":{"name":"2021 28th Asia-Pacific Software Engineering Conference (APSEC)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2021-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Ranking Warnings of Static Analysis Tools Using Representation Learning\",\"authors\":\"Kien-Tuan Ngo, Dinh-Truong Do, Thu-Trang Nguyen, H. Vo\",\"doi\":\"10.1109/APSEC53868.2021.00040\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Static analysis tools are frequently used to detect potential vulnerabilities in software systems. However, an inevitable problem of these tools is their large number of warnings with a high false positive rate, which consumes time and effort for investigating. In this paper, we present DEFP, a novel method for ranking static analysis warnings. Based on the intuition that warnings which have similar contexts tend to have similar labels (true positive or false positive), DEFP is built with two BiLSTM models to capture the patterns associated with the contexts of labeled warnings. After that, for a set of new warnings, DEFP can calculate and rank them according to their likelihoods to be true positives (i.e., actual vulnerabilities). Our experimental results on a dataset of 10 real-world projects show that using DEFP, by investigating only 60% of the warnings, developers can find +90% of actual vulnerabilities. Moreover, DEFP improves the state-of-the-art approach 30% in both Precision and Recall.\",\"PeriodicalId\":143800,\"journal\":{\"name\":\"2021 28th Asia-Pacific Software Engineering Conference (APSEC)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 28th Asia-Pacific Software Engineering Conference (APSEC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/APSEC53868.2021.00040\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 28th Asia-Pacific Software Engineering Conference (APSEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/APSEC53868.2021.00040","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Ranking Warnings of Static Analysis Tools Using Representation Learning
Static analysis tools are frequently used to detect potential vulnerabilities in software systems. However, an inevitable problem of these tools is their large number of warnings with a high false positive rate, which consumes time and effort for investigating. In this paper, we present DEFP, a novel method for ranking static analysis warnings. Based on the intuition that warnings which have similar contexts tend to have similar labels (true positive or false positive), DEFP is built with two BiLSTM models to capture the patterns associated with the contexts of labeled warnings. After that, for a set of new warnings, DEFP can calculate and rank them according to their likelihoods to be true positives (i.e., actual vulnerabilities). Our experimental results on a dataset of 10 real-world projects show that using DEFP, by investigating only 60% of the warnings, developers can find +90% of actual vulnerabilities. Moreover, DEFP improves the state-of-the-art approach 30% in both Precision and Recall.