{"title":"MLAF-VD: A vulnerability detection model based on multi-level abstract features","authors":"Qinghao Li, Wei Liu, Yisen Wang, Weiyu Dong","doi":"10.1016/j.jisa.2025.104189","DOIUrl":null,"url":null,"abstract":"<div><div>As key factors that threaten software security, software vulnerabilities need to be effectively detected. In recent years, with the prosperity of deep learning technology, the academic community has witnessed the emergence of numerous software vulnerability detection methods based on deep learning. These methods usually use different-level abstract features such as code snippets, AST, or CFG as feature representations of vulnerability samples, and then feed them into neural networks to learn patterns of the vulnerabilities. However, these abstract features lack direct relevance to vulnerability detection (i.e., they are not specifically designed for vulnerability detection), which makes it difficult for these abstract features to represent the vulnerability semantics accurately. In addition, single-level abstract features face challenges in comprehensively reflecting code information. In this paper, we propose a semantic-level danger structure graph (DSG), which aims to represent the semantic part of the code that is related to the vulnerability. A graph neural network with global attention, Global-GAT, is also proposed to capture the global dependencies of the graph representation. Based on DSG and Global-GAT, we propose a vulnerability detection model based on multi-level abstract features, named MLAF-VD. MLAF-VD learns the sequence-level, structure-level, and semantic-level abstract features of the code with multiple attention mechanisms, and alleviates the influence of noise information through a denoising module. We evaluate MLAF-VD on 3 representative public datasets, and the results show that MLAF-VD outperforms the best baseline methods by 4.88%, 7.40%, and 12.60% in terms of F1-Score, respectively. In practical applications, MLAF-VD detects 20 N-Day vulnerabilities from 6 open-source projects, demonstrating its effectiveness in detecting software vulnerabilities.</div></div>","PeriodicalId":48638,"journal":{"name":"Journal of Information Security and Applications","volume":"93 ","pages":"Article 104189"},"PeriodicalIF":3.7000,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Information Security and Applications","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2214212625002261","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
As key factors that threaten software security, software vulnerabilities need to be effectively detected. In recent years, with the prosperity of deep learning technology, the academic community has witnessed the emergence of numerous software vulnerability detection methods based on deep learning. These methods usually use different-level abstract features such as code snippets, AST, or CFG as feature representations of vulnerability samples, and then feed them into neural networks to learn patterns of the vulnerabilities. However, these abstract features lack direct relevance to vulnerability detection (i.e., they are not specifically designed for vulnerability detection), which makes it difficult for these abstract features to represent the vulnerability semantics accurately. In addition, single-level abstract features face challenges in comprehensively reflecting code information. In this paper, we propose a semantic-level danger structure graph (DSG), which aims to represent the semantic part of the code that is related to the vulnerability. A graph neural network with global attention, Global-GAT, is also proposed to capture the global dependencies of the graph representation. Based on DSG and Global-GAT, we propose a vulnerability detection model based on multi-level abstract features, named MLAF-VD. MLAF-VD learns the sequence-level, structure-level, and semantic-level abstract features of the code with multiple attention mechanisms, and alleviates the influence of noise information through a denoising module. We evaluate MLAF-VD on 3 representative public datasets, and the results show that MLAF-VD outperforms the best baseline methods by 4.88%, 7.40%, and 12.60% in terms of F1-Score, respectively. In practical applications, MLAF-VD detects 20 N-Day vulnerabilities from 6 open-source projects, demonstrating its effectiveness in detecting software vulnerabilities.
期刊介绍:
Journal of Information Security and Applications (JISA) focuses on the original research and practice-driven applications with relevance to information security and applications. JISA provides a common linkage between a vibrant scientific and research community and industry professionals by offering a clear view on modern problems and challenges in information security, as well as identifying promising scientific and "best-practice" solutions. JISA issues offer a balance between original research work and innovative industrial approaches by internationally renowned information security experts and researchers.