DC-GAR: detecting vulnerabilities by utilizing graph properties and random walks to uncover richer features
Meng Wang, Xiao Han, Hong Zhang, Yiran Guo, Jiangfan Guo
Automated Software Engineering, volume 32, issue 2. Published 2025-06-14. DOI: 10.1007/s10515-025-00532-6
https://link.springer.com/article/10.1007/s10515-025-00532-6
Citation count: 0
Abstract
Deep learning has become prominent in source code vulnerability detection because it automatically extracts complex feature representations from code, eliminating the need for manually defined rules or patterns. Some methods treat code as text sequences; however, they often overlook its inherent structural information. In contrast, graph-based approaches effectively capture structural relationships, but the sparseness and inconsistency of the structures may lead to uneven feature vector extraction, so the model may fail to adequately characterize important nodes or paths. To address this issue, we propose an approach called Dual-channel Graph Neural Network combining Graph properties and Random walks (DC-GAR). This approach integrates graph properties and random walks within a dual-channel graph neural network framework to enhance vulnerability detection. Specifically, graph properties capture global semantic features, while random walks provide context-dependent node structure information. The dual-channel graph neural network then combines these features for detection and classification. We have implemented DC-GAR and evaluated it on a dataset of 29,514 functions. Experimental results demonstrate that DC-GAR surpasses state-of-the-art vulnerability detectors, including FlawFinder, SySeVR, Devign, VulCNN, AMPLE, HardVD, CodeBERT, and GraphCodeBERT, in terms of accuracy and F1-Score. Moreover, DC-GAR has proven effective and practical in real-world open-source projects.
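The two kinds of features named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which graph properties or which random-walk scheme DC-GAR uses, so the example assumes degree centrality as a stand-in graph property and a uniform random walk, and it runs on a hypothetical five-node toy graph. All function and node names are illustrative.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency list from (u, v) edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Sorted neighbor lists keep the walks reproducible for a fixed seed.
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def degree_centrality(adj):
    """Graph property (assumed stand-in): fraction of other nodes each node touches."""
    n = len(adj)
    if n <= 1:
        return {node: 0.0 for node in adj}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def random_walk(adj, start, length, rng):
    """Uniform random walk of up to `length` nodes, giving local structural context."""
    walk = [start]
    while len(walk) < length:
        nbrs = adj[walk[-1]]
        if not nbrs:
            break  # dead end: stop early
        walk.append(rng.choice(nbrs))
    return walk


# Hypothetical toy graph standing in for a function's code graph.
EDGES = [
    ("entry", "alloc"),
    ("alloc", "check"),
    ("check", "copy"),
    ("check", "exit"),
    ("copy", "exit"),
]

adj = build_adjacency(EDGES)
centrality = degree_centrality(adj)          # one global score per node
rng = random.Random(0)                       # fixed seed for repeatability
walks = {node: random_walk(adj, node, 4, rng) for node in adj}

for node in adj:
    print(node, round(centrality[node], 2), walks[node])
```

In this toy graph the `check` node has the highest degree centrality (0.75), while each walk records which nodes are reachable around its start node. A model in the spirit of the abstract would feed both kinds of features to separate channels; how DC-GAR actually encodes and combines them is not specified in the abstract.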
About the journal:
This journal publishes research, tutorial papers, surveys, and accounts of significant industrial experience in the foundations, techniques, tools, and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes.
Coverage in Automated Software Engineering examines both automatic systems and collaborative systems as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences and workshops.