{"title":"为支持可解释的静态应用程序安全测试工具的评估构建基准","authors":"Gaojian Hao, Feng Li, Wei Huo, Qing Sun, Wei Wang, Xinhua Li, Wei Zou","doi":"10.1109/TASE.2019.00-18","DOIUrl":null,"url":null,"abstract":"When evaluating Static Application Security Testing (SAST) tools, benchmarks based on real-world softwares are considered more representative than synthetic micro benchmarks. Generated from real-world software, the test cases in such kind of benchmarks usually contain multiple syntactic features which affect the vulnerability detection results reflecting SAST tools' capabilities in real-world settings. However, most existing benchmarks based on real-world software pay little attention to these syntactic features so that only limited information about the capabilities of SAST tools can be obtained from the evaluation results. In this paper, we provide a method of constructing benchmarks and evaluating SAST tools, which leverages the syntactic features to support the evaluation to be more explainable. To demonstrate the effectiveness, we applied our method to the benchmark built by Misha Zitser et al., generated 10 groups of test cases, and evaluated 2 SAST tools with them. The result shows that, with the benchmark constructed by our method, the evaluation could be more explainable which helps us to gain more information about the SAST tools' capabilities of vulnerability detection.","PeriodicalId":183749,"journal":{"name":"2019 International Symposium on Theoretical Aspects of Software Engineering (TASE)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Constructing Benchmarks for Supporting Explainable Evaluations of Static Application Security Testing Tools\",\"authors\":\"Gaojian Hao, Feng Li, Wei Huo, Qing Sun, Wei Wang, Xinhua Li, Wei Zou\",\"doi\":\"10.1109/TASE.2019.00-18\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"When evaluating Static Application Security Testing (SAST) tools, benchmarks based on real-world softwares are considered more representative than synthetic micro benchmarks. Generated from real-world software, the test cases in such kind of benchmarks usually contain multiple syntactic features which affect the vulnerability detection results reflecting SAST tools' capabilities in real-world settings. However, most existing benchmarks based on real-world software pay little attention to these syntactic features so that only limited information about the capabilities of SAST tools can be obtained from the evaluation results. In this paper, we provide a method of constructing benchmarks and evaluating SAST tools, which leverages the syntactic features to support the evaluation to be more explainable. To demonstrate the effectiveness, we applied our method to the benchmark built by Misha Zitser et al., generated 10 groups of test cases, and evaluated 2 SAST tools with them. 
The result shows that, with the benchmark constructed by our method, the evaluation could be more explainable which helps us to gain more information about the SAST tools' capabilities of vulnerability detection.\",\"PeriodicalId\":183749,\"journal\":{\"name\":\"2019 International Symposium on Theoretical Aspects of Software Engineering (TASE)\",\"volume\":\"44 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Symposium on Theoretical Aspects of Software Engineering (TASE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/TASE.2019.00-18\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Symposium on Theoretical Aspects of Software Engineering (TASE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TASE.2019.00-18","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
When evaluating Static Application Security Testing (SAST) tools, benchmarks built from real-world software are considered more representative than synthetic micro-benchmarks. Because their test cases are derived from real-world code, they usually contain multiple syntactic features that affect vulnerability detection results and thus reflect a SAST tool's capabilities in realistic settings. However, most existing benchmarks based on real-world software pay little attention to these syntactic features, so the evaluation results reveal only limited information about a tool's capabilities. In this paper, we present a method for constructing benchmarks and evaluating SAST tools that leverages syntactic features to make the evaluation more explainable. To demonstrate its effectiveness, we applied the method to the benchmark built by Misha Zitser et al., generated 10 groups of test cases, and used them to evaluate two SAST tools. The results show that, with a benchmark constructed by our method, the evaluation is more explainable, yielding more information about the tools' vulnerability detection capabilities.
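To make the idea concrete, below is a minimal hypothetical C sketch (it is not drawn from the paper's actual Zitser-based benchmark; the function names and the chosen feature are illustrative assumptions). Two test cases share the same stack-buffer-overflow defect but differ in exactly one syntactic feature: whether the tainted index reaches the buffer write through a function call. If a SAST tool flags case A but misses case B, a feature-aware evaluation can attribute the miss to inter-procedural data flow rather than to the defect class itself.

```c
/* Hypothetical pair of test cases isolating one syntactic feature
 * (inter-procedural data flow). Both contain the same stack buffer
 * overflow; only the path from the tainted index to the write differs. */
#include <stdio.h>
#include <stdlib.h>

#define BUF_SIZE 16

/* Case A: intra-procedural flow from the tainted index to the write. */
static void case_a(int idx) {
    char buf[BUF_SIZE];
    buf[idx] = 'X';            /* overflows when idx >= BUF_SIZE */
    printf("A wrote at %d\n", idx);
}

/* Case B: the same defect, but the write sits behind a call boundary. */
static void write_at(char *buf, int idx) {
    buf[idx] = 'X';            /* overflows when idx >= BUF_SIZE */
}

static void case_b(int idx) {
    char buf[BUF_SIZE];
    write_at(buf, idx);        /* the added inter-procedural feature */
    printf("B wrote at %d\n", idx);
}

int main(int argc, char **argv) {
    /* Attacker-controlled index, e.g. `./testcase 100`. */
    int idx = (argc > 1) ? atoi(argv[1]) : 0;
    case_a(idx);
    case_b(idx);
    return 0;
}
```

Grouping many such pairs by the feature they vary is one way to read the abstract's "10 groups of test cases": each group's detection rate then speaks to one syntactic feature, which is what makes the resulting evaluation explainable.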