检测和增加漏洞描述中缺失的关键方面

ACM Transactions on Software Engineering and Methodology (TOSEM) Pub Date : 2022-01-31 DOI:10.1145/3498537

Hao Guo, Sen Chen, Zhenchang Xing, Xiaohong Li, Yude Bai, Jiamou Sun

{"title":"检测和增加漏洞描述中缺失的关键方面","authors":"Hao Guo, Sen Chen, Zhenchang Xing, Xiaohong Li, Yude Bai, Jiamou Sun","doi":"10.1145/3498537","DOIUrl":null,"url":null,"abstract":"Security vulnerabilities have been continually disclosed and documented. For the effective understanding, management, and mitigation of the fast-growing number of vulnerabilities, an important practice in documenting vulnerabilities is to describe the key vulnerability aspects, such as vulnerability type, root cause, affected product, impact, attacker type, and attack vector. In this article, we first investigate 133,639 vulnerability reports in the Common Vulnerabilities and Exposures (CVE) database over the past 20 years. We find that 56%, 85%, 38%, and 28% of CVEs miss vulnerability type, root cause, attack vector, and attacker type, respectively. By comparing the differences of the latest updated CVE reports across different databases, we observe that 1,476 missing key aspects in 1,320 CVE descriptions were augmented manually in the National Vulnerability Database (NVD), which indicates that the vulnerability database maintainers try to complete the vulnerability descriptions in practice to mitigate such a problem. To help complete the missing information of key vulnerability aspects and reduce human efforts, we propose a neural-network-based approach called PMA to predict the missing key aspects of a vulnerability based on its known aspects. We systematically explore the design space of the neural network models and empirically identify the most effective model design in the scenario. Our ablation study reveals the prominent correlations among vulnerability aspects when predicting. Trained with historical CVEs, our model achieves 88%, 71%, 61%, and 81% in F1 for predicting the missing vulnerability type, root cause, attacker type, and attack vector of 8,623 “future” CVEs across 3 years, respectively. Furthermore, we validate the predicting performance of key aspect augmentation of CVEs based on the manually augmented CVE data collected from NVD, which confirms the practicality of our approach. We finally highlight that PMA has the ability to reduce human efforts by recommending and augmenting missing key aspects for vulnerability databases, and to facilitate other research works such as severity level prediction of CVEs based on the vulnerability descriptions.","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"28 1","pages":"1 - 27"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Detecting and Augmenting Missing Key Aspects in Vulnerability Descriptions\",\"authors\":\"Hao Guo, Sen Chen, Zhenchang Xing, Xiaohong Li, Yude Bai, Jiamou Sun\",\"doi\":\"10.1145/3498537\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Security vulnerabilities have been continually disclosed and documented. For the effective understanding, management, and mitigation of the fast-growing number of vulnerabilities, an important practice in documenting vulnerabilities is to describe the key vulnerability aspects, such as vulnerability type, root cause, affected product, impact, attacker type, and attack vector. In this article, we first investigate 133,639 vulnerability reports in the Common Vulnerabilities and Exposures (CVE) database over the past 20 years. We find that 56%, 85%, 38%, and 28% of CVEs miss vulnerability type, root cause, attack vector, and attacker type, respectively. By comparing the differences of the latest updated CVE reports across different databases, we observe that 1,476 missing key aspects in 1,320 CVE descriptions were augmented manually in the National Vulnerability Database (NVD), which indicates that the vulnerability database maintainers try to complete the vulnerability descriptions in practice to mitigate such a problem. To help complete the missing information of key vulnerability aspects and reduce human efforts, we propose a neural-network-based approach called PMA to predict the missing key aspects of a vulnerability based on its known aspects. We systematically explore the design space of the neural network models and empirically identify the most effective model design in the scenario. Our ablation study reveals the prominent correlations among vulnerability aspects when predicting. Trained with historical CVEs, our model achieves 88%, 71%, 61%, and 81% in F1 for predicting the missing vulnerability type, root cause, attacker type, and attack vector of 8,623 “future” CVEs across 3 years, respectively. Furthermore, we validate the predicting performance of key aspect augmentation of CVEs based on the manually augmented CVE data collected from NVD, which confirms the practicality of our approach. We finally highlight that PMA has the ability to reduce human efforts by recommending and augmenting missing key aspects for vulnerability databases, and to facilitate other research works such as severity level prediction of CVEs based on the vulnerability descriptions.\",\"PeriodicalId\":7398,\"journal\":{\"name\":\"ACM Transactions on Software Engineering and Methodology (TOSEM)\",\"volume\":\"28 1\",\"pages\":\"1 - 27\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-01-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Software Engineering and Methodology (TOSEM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3498537\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Software Engineering and Methodology (TOSEM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3498537","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 10

摘要

安全漏洞不断被披露和记录。为了有效地理解、管理和缓解快速增长的漏洞数量，记录漏洞的一个重要实践是描述关键的漏洞方面，例如漏洞类型、根本原因、受影响的产品、影响、攻击者类型和攻击向量。在本文中，我们首先调查了过去20年来CVE数据库中的133,639个漏洞报告。我们发现56%、85%、38%和28%的cve分别遗漏了漏洞类型、根本原因、攻击媒介和攻击者类型。通过比较不同数据库间最新更新的CVE报告的差异，我们发现在国家漏洞数据库(NVD)中，1320个CVE描述中有1476个缺失的关键方面被人工补充，这表明漏洞数据库维护者在实践中努力完善漏洞描述以缓解这一问题。为了帮助补全关键漏洞方面的缺失信息，减少人工工作量，我们提出了一种基于神经网络的PMA方法，该方法基于已知的漏洞方面来预测缺失的关键漏洞方面。我们系统地探索了神经网络模型的设计空间，并经验地确定了场景中最有效的模型设计。我们的消融研究揭示了脆弱性方面在预测时的显著相关性。通过对历史cve的训练，我们的模型在3年内预测8623个“未来”cve的缺失漏洞类型、根本原因、攻击者类型和攻击向量的F1分别达到88%、71%、61%和81%。此外，基于NVD采集的人工增强CVE数据，验证了CVE关键面向增强的预测性能，验证了该方法的实用性。最后，我们强调PMA能够通过推荐和增加漏洞数据库中缺失的关键方面来减少人工工作量，并促进其他研究工作，如基于漏洞描述的cve严重级别预测。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Detecting and Augmenting Missing Key Aspects in Vulnerability Descriptions

Security vulnerabilities have been continually disclosed and documented. For the effective understanding, management, and mitigation of the fast-growing number of vulnerabilities, an important practice in documenting vulnerabilities is to describe the key vulnerability aspects, such as vulnerability type, root cause, affected product, impact, attacker type, and attack vector. In this article, we first investigate 133,639 vulnerability reports in the Common Vulnerabilities and Exposures (CVE) database over the past 20 years. We find that 56%, 85%, 38%, and 28% of CVEs miss vulnerability type, root cause, attack vector, and attacker type, respectively. By comparing the differences of the latest updated CVE reports across different databases, we observe that 1,476 missing key aspects in 1,320 CVE descriptions were augmented manually in the National Vulnerability Database (NVD), which indicates that the vulnerability database maintainers try to complete the vulnerability descriptions in practice to mitigate such a problem. To help complete the missing information of key vulnerability aspects and reduce human efforts, we propose a neural-network-based approach called PMA to predict the missing key aspects of a vulnerability based on its known aspects. We systematically explore the design space of the neural network models and empirically identify the most effective model design in the scenario. Our ablation study reveals the prominent correlations among vulnerability aspects when predicting. Trained with historical CVEs, our model achieves 88%, 71%, 61%, and 81% in F1 for predicting the missing vulnerability type, root cause, attacker type, and attack vector of 8,623 “future” CVEs across 3 years, respectively. Furthermore, we validate the predicting performance of key aspect augmentation of CVEs based on the manually augmented CVE data collected from NVD, which confirms the practicality of our approach. We finally highlight that PMA has the ability to reduce human efforts by recommending and augmenting missing key aspects for vulnerability databases, and to facilitate other research works such as severity level prediction of CVEs based on the vulnerability descriptions.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM Transactions on Software Engineering and Methodology (TOSEM)

自引率

0.00%

发文量