Miaogui Ling, Mingwei Tang, Deng Bian, Shixuan Lv, Qi Tang
{"title":"A dual graph neural networks model using sequence embedding as graph nodes for vulnerability detection","authors":"Miaogui Ling, Mingwei Tang, Deng Bian, Shixuan Lv, Qi Tang","doi":"10.1016/j.infsof.2024.107581","DOIUrl":null,"url":null,"abstract":"<div><h3>Context:</h3><p>Detecting critical to ensure software system security. The traditional static vulnerability detection methods are limited by staff expertise and perform poorly with today’s increasingly complex software systems. Researchers have successfully applied the techniques used in NLP to vulnerability detection as deep learning has developed. The existing deep learning-based vulnerability detection models can be divided into sequence-based and graph-based categories. Sequence-based embedding models cannot use structured information embedded in the code, and graph-based embedding models lack effective node representations.</p></div><div><h3>Objective:</h3><p>To solve these problems, we propose a deep learning-based method, DGVD (Double Graph Neural Network for Vulnerability Detection).</p></div><div><h3>Methods:</h3><p>We use the sequential neural network approach to extract local semantic features of the code as nodes embedded in the control flow graph. First, we propose a dual graph neural network module (DualGNN) that consists of GCN and GAT. The altered module utilizes two different graph neural networks to obtain the global structural information of the control flow and the relationship between the nodes and fuses the two. Second, we propose a convolution-based feature enhancement module (TC-FE) that uses different convolution kernels of different sizes to capture information at different scales so that subsequent readout layers can better aggregate node information.</p></div><div><h3>Results:</h3><p>Experiments demonstrate that DGVD outperforms existing models, obtaining 64.23% vulnerability detection accuracy on CodeXGLUE’s real benchmark dataset.</p></div><div><h3>Conclusion:</h3><p>The proposed DGVD achieves better performance than the state-of-the-art DGVD has a more effective source code feature extraction capability on real-world datasets.</p></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"177 ","pages":"Article 107581"},"PeriodicalIF":3.8000,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0950584924001861/pdfft?md5=450f5d915db5cb174d591dea662c75cd&pid=1-s2.0-S0950584924001861-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information and Software Technology","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0950584924001861","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Context:
Detecting critical to ensure software system security. The traditional static vulnerability detection methods are limited by staff expertise and perform poorly with today’s increasingly complex software systems. Researchers have successfully applied the techniques used in NLP to vulnerability detection as deep learning has developed. The existing deep learning-based vulnerability detection models can be divided into sequence-based and graph-based categories. Sequence-based embedding models cannot use structured information embedded in the code, and graph-based embedding models lack effective node representations.
Objective:
To solve these problems, we propose a deep learning-based method, DGVD (Double Graph Neural Network for Vulnerability Detection).
Methods:
We use the sequential neural network approach to extract local semantic features of the code as nodes embedded in the control flow graph. First, we propose a dual graph neural network module (DualGNN) that consists of GCN and GAT. The altered module utilizes two different graph neural networks to obtain the global structural information of the control flow and the relationship between the nodes and fuses the two. Second, we propose a convolution-based feature enhancement module (TC-FE) that uses different convolution kernels of different sizes to capture information at different scales so that subsequent readout layers can better aggregate node information.
Results:
Experiments demonstrate that DGVD outperforms existing models, obtaining 64.23% vulnerability detection accuracy on CodeXGLUE’s real benchmark dataset.
Conclusion:
The proposed DGVD achieves better performance than the state-of-the-art DGVD has a more effective source code feature extraction capability on real-world datasets.
期刊介绍:
Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal''s scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:
• Software management, quality and metrics,
• Software processes,
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development
Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "Negative" results and much more. Read the Guide for authors for more information.
The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premiere outlet for systematic literature studies in software engineering.