HGAN4VD: Leveraging Heterogeneous Graph Attention Networks for enhanced vulnerability detection
Authors: Yucheng Zhang, Xiaolin Ju, Xiang Chen, Amin Misbahul, Zilong Ren
Journal: Computers & Security, volume 157, Article 104548 (impact factor 4.8; JCR Q1, Computer Science, Information Systems)
Publication date: 2025-06-14
Publication type: Journal Article
DOI: 10.1016/j.cose.2025.104548
URL: https://www.sciencedirect.com/science/article/pii/S0167404825002378
Citation count: 0
Abstract
Detecting vulnerabilities is crucial for mitigating inherent risks in software systems. In recent years, there has been a significant increase in developing effective vulnerability detection approaches, many of which leverage deep learning technologies. These methods provide notable advantages, including automated feature extraction and the ability to train models autonomously, thereby improving the efficiency and accuracy of the detection process. However, existing methods encounter two significant limitations. Firstly, code analysis lacks granularity and does not fully leverage semantic and syntactic information within code structures, resulting in suboptimal performance. Secondly, approaches based on Graph Neural Networks (GNNs) inherently struggle to capture long-distance relationships between nodes in code structures. In this paper, we propose HGAN4VD, a novel vulnerability detection method that utilizes heterogeneous intermediate source code representations to address these limitations. HGAN4VD comprises two components: a heterogeneous code representation graph, which is constructed by creating diverse code representations and simplifying the graph to reduce node distances, and a Heterogeneous Graph Attention Network, which incorporates two attention layers to calculate node-level and semantic-level attention. Experiments on three widely used datasets demonstrate that HGAN4VD outperforms state-of-the-art methods by 1.5% to 7.7% in accuracy and 3.8% to 12.2% in F1 score metrics, affirming its effectiveness in learning global information for code graphs used in vulnerability detection. Furthermore, we demonstrate the generalization capability of our method on Java and Python datasets, suggesting its potential for broader applicability.
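The two attention levels mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a generic hierarchical attention scheme in which each edge type first aggregates neighbors with softmax weights (node level) and the per-type results are then fused with a second set of softmax weights (semantic level). The edge types `ast` and `cfg`, the feature vectors, and the scoring vectors `node_attn` and `query` are all hypothetical, and the learned projections and nonlinearities of a real model are omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def node_level_attention(features, neighbors, attn_vec):
    """Aggregate each node's neighbors (for ONE edge type) with attention weights."""
    out = {}
    for node, nbrs in neighbors.items():
        # Score a neighbor from the concatenation [h_node || h_neighbor].
        scores = [dot(attn_vec, features[node] + features[n]) for n in nbrs]
        weights = softmax(scores)
        dim = len(features[node])
        out[node] = [
            sum(w * features[n][d] for w, n in zip(weights, nbrs))
            for d in range(dim)
        ]
    return out

def semantic_level_attention(per_type, query_vec):
    """Weight each edge type, then fuse the per-type node embeddings."""
    types = sorted(per_type)
    scores = []
    for t in types:
        embs = per_type[t].values()
        # Importance of an edge type: mean score of its node embeddings.
        scores.append(
            sum(dot(query_vec, [math.tanh(x) for x in h]) for h in embs) / len(embs)
        )
    betas = dict(zip(types, softmax(scores)))
    nodes = per_type[types[0]].keys()
    dim = len(next(iter(per_type[types[0]].values())))
    fused = {
        n: [sum(betas[t] * per_type[t][n][d] for t in types) for d in range(dim)]
        for n in nodes
    }
    return betas, fused

# Hypothetical toy graph: three code nodes, two edge types, self-loops included.
features = {"n0": [1.0, 0.0], "n1": [0.0, 1.0], "n2": [1.0, 1.0]}
graph = {
    "ast": {"n0": ["n0", "n1"], "n1": ["n1", "n2"], "n2": ["n2"]},
    "cfg": {"n0": ["n0", "n2"], "n1": ["n1"], "n2": ["n2", "n0"]},
}
node_attn = {"ast": [0.1, 0.2, 0.3, 0.4], "cfg": [0.4, 0.3, 0.2, 0.1]}
query = [0.5, -0.5]

per_type = {t: node_level_attention(features, graph[t], node_attn[t]) for t in graph}
betas, fused = semantic_level_attention(per_type, query)
print(betas)
print(fused)
```

Because both levels use softmax weights, every fused embedding is a convex combination of the input features, which makes the toy output easy to sanity-check by hand.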
About the journal:
Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world.
Computers & Security provides you with a unique blend of leading edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors: industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise, it is your first step to fully secure systems.