HGAN4VD: Leveraging Heterogeneous Graph Attention Networks for enhanced vulnerability detection

IF 4.8, Zone 2 (Computer Science), JCR Q1: COMPUTER SCIENCE, INFORMATION SYSTEMS

Computers & Security, Pub Date: 2025-06-14, DOI: 10.1016/j.cose.2025.104548

Yucheng Zhang, Xiaolin Ju, Xiang Chen, Amin Misbahul, Zilong Ren

Computers & Security, Volume 157, Article 104548. https://www.sciencedirect.com/science/article/pii/S0167404825002378

Citations: 0

Abstract

Detecting vulnerabilities is crucial for mitigating risks in software systems. Recent years have seen a marked increase in vulnerability detection approaches, many of which build on deep learning. These methods offer notable advantages, including automated feature extraction and the ability to train models autonomously, which improve the efficiency and accuracy of detection. However, existing methods face two significant limitations. First, code analysis is coarse-grained and does not fully exploit the semantic and syntactic information in code structures, resulting in suboptimal performance. Second, approaches based on Graph Neural Networks (GNNs) inherently struggle to capture long-distance relationships between nodes in code structures. In this paper, we propose HGAN4VD, a novel vulnerability detection method that uses heterogeneous intermediate representations of source code to address these limitations. HGAN4VD comprises two components: a heterogeneous code representation graph, constructed by creating diverse code representations and simplifying the graph to reduce node distances, and a Heterogeneous Graph Attention Network, which uses two attention layers to compute node-level and semantic-level attention. Experiments on three widely used datasets show that HGAN4VD outperforms state-of-the-art methods by 1.5% to 7.7% in accuracy and 3.8% to 12.2% in F1 score, affirming its effectiveness in learning global information from code graphs for vulnerability detection. Furthermore, we demonstrate the generalization capability of our method on Java and Python datasets, suggesting its potential for broader applicability.
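To make the idea of two attention layers concrete, the following is a minimal pure-Python sketch of node-level and semantic-level attention on a toy heterogeneous graph. It follows a common formulation of heterogeneous graph attention and is not the authors' implementation. The edge types (`control_flow`, `data_flow`), the random features and attention vectors, and all function names are assumptions made for illustration. The learned projections a real model would train are omitted.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def node_level_attention(h, neighbors, a):
    """Aggregate neighbour features under ONE edge type.

    h: list of d-dimensional feature vectors, one per node.
    neighbors: dict mapping node id -> list of neighbour ids.
    a: attention vector of length 2*d (random here, learned in practice).
    """
    out = []
    for i, h_i in enumerate(h):
        # Always include a self-loop so every node has at least one neighbour.
        nbrs = sorted(set(neighbors.get(i, [])) | {i})
        # Score each neighbour from the concatenation [h_i || h_j].
        scores = [leaky_relu(sum(w * x for w, x in zip(a, h_i + h[j])))
                  for j in nbrs]
        alpha = softmax(scores)
        # Weighted sum of neighbour features.
        z = [sum(alpha[k] * h[j][c] for k, j in enumerate(nbrs))
             for c in range(len(h_i))]
        out.append(z)
    return out


def semantic_level_attention(z_per_type, q):
    """Fuse per-edge-type embeddings using one weight per edge type.

    Simplification (assumption): tanh is applied directly to the embeddings,
    without the learned linear projection a trained model would use.
    """
    types = list(z_per_type)
    scores = []
    for t in types:
        z = z_per_type[t]
        per_node = [sum(qc * math.tanh(zc) for qc, zc in zip(q, z_i))
                    for z_i in z]
        scores.append(sum(per_node) / len(per_node))
    beta = dict(zip(types, softmax(scores)))
    n, d = len(z_per_type[types[0]]), len(q)
    fused = [[sum(beta[t] * z_per_type[t][i][c] for t in types)
              for c in range(d)]
             for i in range(n)]
    return fused, beta


# Toy graph: 4 statement nodes with 3-dimensional features (hypothetical data).
random.seed(0)
d = 3
h = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(4)]
edges = {
    "control_flow": {0: [1], 1: [2], 2: [3]},
    "data_flow": {0: [3], 3: [0]},  # a long-range link between nodes 0 and 3
}
a = {t: [random.uniform(-1, 1) for _ in range(2 * d)] for t in edges}
q = [random.uniform(-1, 1) for _ in range(d)]

z_per_type = {t: node_level_attention(h, nbrs, a[t]) for t, nbrs in edges.items()}
fused, beta = semantic_level_attention(z_per_type, q)
print(beta)  # one weight per edge type; the weights sum to 1
```

In this sketch, node-level attention decides how much each neighbour contributes within a single edge type, while semantic-level attention decides how much each edge type contributes to the final node embedding. The `data_flow` edge between nodes 0 and 3 illustrates how an additional edge type can connect nodes that are three hops apart along `control_flow`.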




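Source journal

Computers & Security (Engineering & Technology, Computer Science: Information Systems)

CiteScore: 12.40

Self-citation rate: 7.10%

Articles published: 365

Review time: 10.7 months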

About the journal: Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world. Computers & Security provides you with a unique blend of leading-edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors: industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise, it is your first step to fully secure systems.