Exploitation of Vulnerabilities: A Topic-Based Machine Learning Framework for Explaining and Predicting Exploitation

Inf. Comput. | Pub Date: 2023-07-14 | DOI: 10.3390/info14070403

Konstantinos Charmanas, N. Mittas, L. Angelis


Citations: 1

Abstract

Security vulnerabilities constitute one of the most important weaknesses of hardware and software security that can cause severe damage to systems, applications, and users. As a result, software vendors should prioritize the most dangerous and impactful security vulnerabilities by developing appropriate countermeasures. As we acknowledge the importance of vulnerability prioritization, in the present study, we propose a framework that maps newly disclosed vulnerabilities with topic distributions, via word clustering, and further predicts whether this new entry will be associated with a potential exploit Proof Of Concept (POC). We also provide insights on the current most exploitable weaknesses and products through a Generalized Linear Model (GLM) that links the topic memberships of vulnerabilities with exploit indicators, thus distinguishing five topics that are associated with relatively frequent recent exploits. Our experiments show that the proposed framework can outperform two baseline topic modeling algorithms in terms of topic coherence by improving LDA models by up to 55%. In terms of classification performance, the conducted experiments—on a quite balanced dataset (57% negative observations, 43% positive observations)—indicate that the vulnerability descriptions can be used as exclusive features in assessing the exploitability of vulnerabilities, as the “best” model achieves accuracy close to 87%. Overall, our study contributes to enabling the prioritization of vulnerabilities by providing guidelines on the relations between the textual details of a weakness and the potential application/system exploits.
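The pipeline outlined in the abstract (map a description to topic memberships, then relate those memberships to an exploit indicator) can be illustrated with a small, hedged sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the hand-written word clusters in `TOPICS` stand in for topics that the framework would learn via word clustering, the toy descriptions and labels in `TRAIN` are made up, and the classifier is a plain logistic regression (a binomial GLM with a logit link) fitted by gradient descent, which may differ from the models the authors evaluated.

```python
import math
import re

# Hypothetical word clusters standing in for learned topics (not from the paper).
TOPICS = {
    "memory": {"buffer", "overflow", "heap", "stack", "memory"},
    "web": {"xss", "sql", "injection", "script", "php"},
    "auth": {"password", "authentication", "privilege", "bypass"},
}
TOPIC_NAMES = sorted(TOPICS)  # fixed feature order: auth, memory, web


def topic_distribution(description):
    """Share of matched tokens in each word cluster (uniform if none match)."""
    tokens = re.findall(r"[a-z]+", description.lower())
    counts = [sum(t in TOPICS[name] for t in tokens) for name in TOPIC_NAMES]
    total = sum(counts)
    if total == 0:
        return [1.0 / len(TOPIC_NAMES)] * len(TOPIC_NAMES)
    return [c / total for c in counts]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Binomial GLM with logit link, fitted by batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def predict_proba(description, w, b):
    """Estimated probability that a description is associated with an exploit PoC."""
    x = topic_distribution(description)
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Made-up toy data: (description, 1 if an exploit PoC exists, else 0).
TRAIN = [
    ("Heap buffer overflow in image parser", 1),
    ("Stack overflow corrupts memory", 1),
    ("SQL injection in login php script", 1),
    ("Weak password policy", 0),
    ("Authentication bypass requires local privilege", 0),
    ("XSS via script tag in php page", 1),
]
X = [topic_distribution(d) for d, _ in TRAIN]
y = [label for _, label in TRAIN]
weights, bias = fit_logistic(X, y)
```

In this sketch the fitted coefficients play the explanatory role the abstract assigns to the GLM: a positive weight on a topic means that membership in it raises the estimated odds of an exploit. Because the topic shares sum to one, the coefficients are identified only relative to each other, so a real analysis would drop one topic as a reference level or add regularization.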
