Exploitability prediction of vulnerabilities based on heterogeneous graphs
Authors: Guo Xu, Xin Chen, Xinxin Cai, Dongjin Yu
Journal: Knowledge-Based Systems, Volume 330, Article 114517 (impact factor 7.6; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-09-22
DOI: 10.1016/j.knosys.2025.114517
URL: https://www.sciencedirect.com/science/article/pii/S0950705125015564
Vulnerability exploitability prediction is the process of assessing known software vulnerabilities to predict how likely they are to be exploited in real attacks. Many methods have been proposed for exploitability prediction, but they generally suffer from two problems. First, they extract features only from a single vulnerability, ignoring the impact of associated vulnerabilities. Second, they usually adopt simple methods (such as concatenation) to aggregate different information, which may overlook important relationships between features. In this paper, we propose a novel exploitability prediction method based on heterogeneous graphs, called ExPreHet. First, ExPreHet defines nodes and edges to construct a heterogeneous graph. After a series of preprocessing steps, ExPreHet generates multiple attribute vectors for each node. Using a random walk with restart strategy, ExPreHet ensures that each node can sample neighboring nodes of every category, and it groups the sampled neighbors by node category. Then, ExPreHet aggregates all the attributes of each node to generate a content vector, and aggregates each category of the node's neighbors to generate one category vector per category. After that, the content vector and all the category vectors are aggregated to generate the final representation of the node. Finally, these final representations are fed into a random forest (RF) to train the classifier. To evaluate ExPreHet, we conduct experiments on a dataset of 66,877 vulnerabilities. The experimental results show that ExPreHet achieves 83.24%, 83.22%, 83.28%, 83.25%, and 83.24% in terms of accuracy, precision, recall, F1-score, and area under the curve (AUC), respectively. ExPreHet performs significantly better than the baseline methods.
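The neighbor sampling step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the node names, the node types ("vulnerability", "product", "weakness"), the restart probability, the step budget, and the rule of keeping the most frequently visited nodes per type are all assumptions made for illustration, since the abstract does not specify them.

```python
import random
from collections import Counter, defaultdict


def sample_neighbors_by_type(adj, node_type, start, restart_prob=0.5,
                             per_type=2, max_steps=10_000, seed=0):
    """Random walk with restart from `start`, grouping visited nodes by type.

    All parameters are hypothetical choices for this sketch; the abstract
    does not state the values or the selection rule ExPreHet uses.
    """
    rng = random.Random(seed)
    visits = defaultdict(Counter)  # node type -> visit counts per node
    current = start
    for _ in range(max_steps):
        # With probability `restart_prob` (or at a dead end), jump back to start.
        if rng.random() < restart_prob or not adj[current]:
            current = start
            continue
        # Otherwise move to a uniformly chosen neighbor and record the visit.
        current = rng.choice(adj[current])
        if current != start:
            visits[node_type[current]][current] += 1
    # Keep the most frequently visited nodes of each type.
    return {
        t: [n for n, _ in visits[t].most_common(per_type)]
        for t in sorted(set(node_type.values()))
    }


# Toy heterogeneous graph (hypothetical nodes and types).
adj = {
    "cve_a": ["product_x", "cwe_1"],
    "cve_b": ["product_x"],
    "product_x": ["cve_a", "cve_b"],
    "cwe_1": ["cve_a"],
}
node_type = {
    "cve_a": "vulnerability",
    "cve_b": "vulnerability",
    "product_x": "product",
    "cwe_1": "weakness",
}

groups = sample_neighbors_by_type(adj, node_type, start="cve_a")
print(groups)
```

Because the walk keeps returning to the start node, nodes that are close to it are visited more often, while nodes of other types that are several hops away (here, the second vulnerability reached through a shared product) can still be sampled. The grouped lists are what a per-category aggregation step would then consume.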
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems built on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.