From attack descriptions to vulnerabilities: A sentence transformer-based approach

IF 4.1 | Zone 2, Computer Science | JCR Q1, Computer Science, Software Engineering
Refat Othman, Diaeddin Rimawi, Bruno Rossi, Barbara Russo
Journal of Systems and Software, volume 231, Article 112615. Published 2025-09-09. Publication type: Journal Article.
DOI: 10.1016/j.jss.2025.112615
URL: https://www.sciencedirect.com/science/article/pii/S0164121225002845
Citations: 0

Abstract

In the domain of security, vulnerabilities frequently remain undetected even after their exploitation. In this work, vulnerabilities refer to publicly disclosed flaws documented in Common Vulnerabilities and Exposures (CVE) reports. Establishing a connection between attacks and vulnerabilities is essential for enabling timely incident response, as it provides defenders with immediate, actionable insights. However, manually mapping attacks to CVEs is infeasible, thereby motivating the need for automation. This paper evaluates 14 state-of-the-art (SOTA) sentence transformers for automatically identifying vulnerabilities from textual descriptions of attacks. Our results demonstrate that the multi-qa-mpnet-base-dot-v1 (MMPNet) model achieves superior classification performance when using attack Technique descriptions, with an F1-score of 89.0, precision of 84.0, and recall of 94.7. Furthermore, it was observed that, on average, 56% of the vulnerabilities identified by the MMPNet model are also represented within the CVE repository in conjunction with an attack, while 61% of the vulnerabilities detected by the model correspond to those cataloged in the CVE repository. A manual inspection of the results revealed the existence of 275 predicted links that were not documented in the MITRE repositories. Consequently, the automation of linking attack techniques to vulnerabilities not only enhances the detection and response capabilities related to software security incidents but also diminishes the duration during which vulnerabilities remain exploitable, thereby contributing to the development of more secure systems.
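The three scores reported in the abstract can be cross-checked against each other, since the F1-score is the harmonic mean of precision and recall. The sketch below is only a sanity check on the published numbers, assuming they are percentages rounded to one decimal; the function name `f1_score` is a hypothetical helper and not part of the paper's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract for MMPNet on Technique descriptions.
reported_precision = 84.0
reported_recall = 94.7

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 1))  # 89.0, matching the reported F1-score
```

Because recall (94.7) is well above precision (84.0), the model retrieves most true attack-to-CVE links at the cost of some false positives, which is consistent with the abstract's note that predicted links required manual inspection.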
Source journal
Journal of Systems and Software (Engineering and Technology; Computer Science: Theory and Methods)
CiteScore: 8.60
Self-citation rate: 5.70%
Articles published: 193
Review time: 16 weeks
Journal description: The Journal of Systems and Software publishes papers covering all aspects of software engineering and related hardware-software-systems issues. All articles should include a validation of the idea presented, e.g. through case studies, experiments, or systematic comparisons with other approaches already in practice. Topics of interest include, but are not limited to:
• Methods and tools for, and empirical studies on, software requirements, design, architecture, verification and validation, maintenance and evolution
• Agile, model-driven, service-oriented, open source and global software development
• Approaches for mobile, multiprocessing, real-time, distributed, cloud-based, dependable and virtualized systems
• Human factors and management concerns of software development
• Data management and big data issues of software systems
• Metrics and evaluation, data mining of software development resources
• Business and economic aspects of software development processes
The journal welcomes state-of-the-art surveys and reports of practical experience for all of these topics.