MLAF-VD: A vulnerability detection model based on multi-level abstract features

IF 3.7, Zone 2 (Computer Science), JCR Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Information Security and Applications Pub Date : 2025-08-13 DOI:10.1016/j.jisa.2025.104189

Qinghao Li, Wei Liu, Yisen Wang, Weiyu Dong

Volume 93, Article 104189. Publication type: Journal Article. Open access: no. https://www.sciencedirect.com/science/article/pii/S2214212625002261

Citations: 0

Abstract

Software vulnerabilities are a key threat to software security and need to be detected effectively. In recent years, as deep learning has advanced, numerous deep-learning-based software vulnerability detection methods have emerged. These methods usually take abstract features at different levels, such as code snippets, abstract syntax trees (ASTs), or control flow graphs (CFGs), as the feature representation of vulnerability samples, and then feed them into neural networks to learn vulnerability patterns. However, these abstract features are not specifically designed for vulnerability detection, which makes it difficult for them to represent vulnerability semantics accurately. In addition, single-level abstract features struggle to reflect code information comprehensively. In this paper, we propose a semantic-level danger structure graph (DSG), which aims to represent the semantic part of the code that is related to the vulnerability. We also propose Global-GAT, a graph neural network with global attention, to capture the global dependencies of the graph representation. Based on DSG and Global-GAT, we propose MLAF-VD, a vulnerability detection model based on multi-level abstract features. MLAF-VD learns the sequence-level, structure-level, and semantic-level abstract features of the code with multiple attention mechanisms, and alleviates the influence of noise information through a denoising module. We evaluate MLAF-VD on 3 representative public datasets, and the results show that it outperforms the best baseline methods by 4.88%, 7.40%, and 12.60% in terms of F1-Score, respectively. In practical applications, MLAF-VD detects 20 N-Day vulnerabilities in 6 open-source projects, demonstrating its effectiveness in detecting software vulnerabilities.
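The abstract does not spell out how Global-GAT computes its attention, so the following is only a minimal illustrative sketch of the general idea of local versus global attention over graph nodes, not the authors' architecture. The function name `attend`, the unparameterized dot-product scoring, and the toy graph are all assumptions made for illustration. With an adjacency matrix, each node aggregates only itself and its direct neighbours (as neighbourhood-restricted graph attention does); without one, every node attends to every other node, which is one way to capture global dependencies.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, adjacency=None):
    """Scaled dot-product attention over node feature vectors.

    adjacency=None  -> global: each node attends to all nodes.
    adjacency given -> local: node i attends to itself and to
                       nodes j with adjacency[i][j] truthy.
    Hypothetical helper; no learned weights are involved.
    """
    n = len(features)
    dim = len(features[0])
    output = []
    for i in range(n):
        allowed = [
            j for j in range(n)
            if adjacency is None or j == i or adjacency[i][j]
        ]
        # Similarity of node i to each node it may attend to.
        scores = [
            sum(a * b for a, b in zip(features[i], features[j])) / math.sqrt(dim)
            for j in allowed
        ]
        weights = softmax(scores)
        # Attention-weighted average of the attended nodes' features.
        output.append([
            sum(w * features[j][k] for w, j in zip(weights, allowed))
            for k in range(dim)
        ])
    return output


# Toy path graph 0 - 1 - 2 (assumed example data).
node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
path_adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

local_out = attend(node_features, path_adjacency)
global_out = attend(node_features)
```

In this toy graph, node 0 is not adjacent to node 2, so `local_out[0]` ignores node 2's features while `global_out[0]` mixes them in. A real model would add learned projections and multiple heads on top of this.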

Source journal

Journal of Information Security and Applications (Computer Science: Computer Networks and Communications)

CiteScore: 10.90

Self-citation rate: 5.40%

Articles published: 206

Review time: 56 days

About the journal: Journal of Information Security and Applications (JISA) focuses on original research and practice-driven applications with relevance to information security and applications. JISA provides a common linkage between a vibrant scientific and research community and industry professionals by offering a clear view on modern problems and challenges in information security, as well as identifying promising scientific and "best-practice" solutions. JISA issues offer a balance between original research work and innovative industrial approaches by internationally renowned information security experts and researchers.