Automated detection of affected libraries from vulnerability reports

IF 3.1 Zone 2 (Computer Science) Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Automated Software Engineering Pub Date : 2025-08-11 DOI:10.1007/s10515-025-00540-6

Jinwei Xu, He Zhang, Xin Zhou, Yanjing Yang, Runfeng Mao, Xiaokang Li, Lanxin Yang, Haifeng Shen

Automated Software Engineering, volume 32, issue 2. Full text: https://link.springer.com/article/10.1007/s10515-025-00540-6

Citations: 0

Abstract

The growing reuse of third-party libraries in software supply chains increases the risk of being affected by the vulnerabilities in those libraries. To strengthen software security, security vendors such as Snyk maintain up-to-date vulnerability databases by associating reported vulnerabilities with their affected libraries, and contemporary digital organizations such as banking and software enterprises check whether the third-party libraries they use are affected by these reported vulnerabilities. Existing studies focus on automating this detection process but devote little effort to detecting newly affected libraries, even though previously healthy libraries are constantly disclosed as affected by new vulnerabilities. Moreover, existing studies do not adequately address the fact that digital organizations are concerned only with the libraries they use. In this paper, we propose an approach, LibAlarm, to address these challenges. We implement LibAlarm as a large language model-powered approach and compare it with baseline approaches from multiple perspectives. Our experimental evaluation using 16,238 NVD reports indicates that LibAlarm improves F1 by over 14% compared with the baselines and detects over 40% of newly affected libraries. For contemporary digital organizations, LibAlarm outperforms the baseline approaches, with an F1 above 70% and a false alarm ratio reduced to 20%. Our case analysis using 540 NVD reports and 20 projects from Microsoft and Google demonstrates the effectiveness of LibAlarm. These results indicate that LibAlarm can help security vendors and digital organizations detect affected libraries from vulnerability reports.


Source journal

Automated Software Engineering (Engineering and Technology: Computer Science, Software Engineering)

CiteScore

4.80

Self-citation rate

11.80%

Publication volume

Review time

>12 weeks

Journal description: This journal details research, tutorial papers, surveys, and accounts of significant industrial experience in the foundations, techniques, tools, and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes. Coverage in Automated Software Engineering examines both automatic systems and collaborative systems as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences, and workshops.