Vulmg: A Static Detection Solution For Source Code Vulnerabilities Based On Code Property Graph and Graph Attention Network
Authors: Zhang Haojie, Liao Yujun, Liu Yiwei, Zhou Nanxin
Published in: 2021 18th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), 2021-12-17
DOI: https://doi.org/10.1109/ICCWAMTIP53232.2021.9674145
Citations: 1
Abstract
As the number of vulnerabilities continues to rise, security incidents triggered by them keep occurring. Current vulnerability detection methods still have several problems: they analyze only a single function at a time, rely heavily on expert knowledge, and cannot be automated. From our observation of the Juliet dataset, we find that vulnerabilities exist not only within a single function but also across the boundary between a calling function and the function it calls. We also find that vulnerable and non-vulnerable functions differ in their code property graphs. This article therefore proposes a vulnerability detection solution named VULMG, which recasts vulnerability detection as a graph classification problem. VULMG consists of a vectorization component named VecG and a deep learning classification model named MGGAT. Working from the code property graph, VecG extracts the lexical, grammatical, and semantic information of the source code as a feature matrix, and extracts information such as structure, control, and dependence as three adjacency matrices. MGGAT is a graph classification model built on the graph attention network. In addition, VULMG uses the function call graph (FCG) to link each calling function with the function it calls, so that it can detect cross-function vulnerabilities. We selected CWE-369 and CWE-476 from the Juliet dataset for testing, and the F1 scores were 94.43% and 96.3%, respectively. The evaluation results indicate that VULMG outperforms Flawfinder, RATS, BiLSTM, SVM, and GCN, which supports the effectiveness of the proposed solution.
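The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the edge-type names, the functions `build_adjacency`, `link_call`, `attention_layer`, and `graph_embedding`, the choice to treat a call edge as a control edge, and the mean-pooling readout are all assumptions made for illustration. The sketch shows one feature matrix paired with three adjacency matrices, a single-head graph attention pass per matrix, and a caller graph joined to a callee graph through one call edge.

```python
import math

# Hypothetical edge-type names standing in for the structure, control,
# and dependence relations mentioned in the abstract.
EDGE_TYPES = ("structure", "control", "dependence")


def build_adjacency(num_nodes, edges):
    """Return one num_nodes x num_nodes 0/1 matrix per edge type."""
    mats = {t: [[0] * num_nodes for _ in range(num_nodes)] for t in EDGE_TYPES}
    for src, dst, kind in edges:
        mats[kind][src][dst] = 1
    return mats


def link_call(caller_nodes, caller_edges, callee_edges, call_site, callee_entry):
    """Append a callee graph to a caller graph and add one call edge.

    Assumption: the call edge is stored as a "control" edge.
    """
    shifted = [(s + caller_nodes, d + caller_nodes, k) for s, d, k in callee_edges]
    call_edge = (call_site, callee_entry + caller_nodes, "control")
    return caller_edges + shifted + [call_edge]


def attention_layer(features, adjacency, weight, attn):
    """Single-head graph attention over one adjacency matrix."""
    out_dim = len(weight[0])
    # Linear projection of every node feature vector (weight: in_dim x out_dim).
    proj = [[sum(x[k] * weight[k][j] for k in range(len(x))) for j in range(out_dim)]
            for x in features]
    result = []
    for i, h_i in enumerate(proj):
        # Node i attends to itself and to every node j with an edge i -> j.
        nbrs = [j for j in range(len(proj)) if adjacency[i][j] or j == i]
        scores = []
        for j in nbrs:
            e = sum(a * v for a, v in zip(attn, h_i + proj[j]))
            scores.append(e if e > 0 else 0.2 * e)  # LeakyReLU, slope 0.2
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        alphas = [v / total for v in exps]
        result.append([sum(a * proj[j][d] for a, j in zip(alphas, nbrs))
                       for d in range(out_dim)])
    return result


def graph_embedding(features, mats, weight, attn):
    """Mean-pool node embeddings and concatenate them across edge types."""
    pooled = []
    for kind in EDGE_TYPES:
        nodes = attention_layer(features, mats[kind], weight, attn)
        pooled += [sum(n[d] for n in nodes) / len(nodes) for d in range(len(nodes[0]))]
    return pooled


# Toy caller (2 nodes) and callee (2 nodes) joined through one call edge.
toy_edges = link_call(2, [(0, 1, "structure")], [(0, 1, "dependence")],
                      call_site=1, callee_entry=0)
toy_mats = build_adjacency(4, toy_edges)
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
toy_weight = [[1.0, 0.0], [0.0, 1.0]]   # identity projection
toy_attn = [0.1, -0.2, 0.3, 0.05]       # arbitrary attention vector
print(graph_embedding(toy_features, toy_mats, toy_weight, toy_attn))
```

A classifier head (for example, a logistic layer over the pooled vector) would turn this embedding into a vulnerable/non-vulnerable label; how MGGAT actually combines the three adjacency matrices is not stated in the abstract, so the concatenation used here is only one plausible choice.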