SVA-ICL: Improving LLM-based software vulnerability assessment via in-context learning and information fusion

IF 4.3 · Zone 2, Computer Science · JCR Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS

Information and Software Technology Pub Date : 2025-06-11 DOI:10.1016/j.infsof.2025.107803

Chaoyang Gao, Xiang Chen, Guangbei Zhang

Volume 186, Article 107803. Full text: https://www.sciencedirect.com/science/article/pii/S0950584925001429

Citations: 0


SVA-ICL: Improving LLM-based software vulnerability assessment via in-context learning and information fusion

Context:

Software vulnerability assessment (SVA) is critical for identifying, evaluating, and prioritizing security weaknesses in software applications.

Objective:

Despite the increasing application of large language models (LLMs) in various software engineering tasks, their effectiveness in SVA remains underexplored.

Method:

To address this gap, we introduce a novel approach, SVA-ICL, which leverages in-context learning (ICL) to enhance LLM performance. Our approach selects high-quality demonstrations for ICL through information fusion, incorporating both source code and vulnerability descriptions. For source code, we consider semantic, lexical, and syntactic similarities, while for vulnerability descriptions, we focus on textual similarity. Based on the selected demonstrations, we construct context prompts and consider DeepSeek-V2 as the LLM for SVA-ICL.
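The demonstration-selection idea described above can be illustrated with a minimal sketch. The abstract does not specify the similarity measures, the fusion formula, or the prompt template, so everything here is an assumption: token-level Jaccard similarity stands in for both the code similarities and the textual similarity, `code_weight` is a hypothetical fusion ratio, and `Sample`, `select_demonstrations`, and `build_prompt` are illustrative names rather than anything from the paper.

```python
import re
from dataclasses import dataclass


@dataclass
class Sample:
    code: str
    description: str
    severity: str = ""  # label of a demonstration; empty for the query


def tokenize(text):
    # Crude word-level tokenizer (assumption: stands in for real code/text analysis).
    return re.findall(r"\w+", text.lower())


def jaccard(a, b):
    # Set overlap in [0, 1]; used here as a stand-in similarity measure.
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def select_demonstrations(query, pool, k=2, code_weight=0.5):
    # Fuse code similarity and description similarity with a hypothetical
    # ratio, then keep the k highest-scoring candidates (most similar first).
    scored = []
    for sample in pool:
        code_sim = jaccard(tokenize(query.code), tokenize(sample.code))
        desc_sim = jaccard(tokenize(query.description), tokenize(sample.description))
        fused = code_weight * code_sim + (1.0 - code_weight) * desc_sim
        scored.append((fused, sample))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sample for _, sample in scored[:k]]


def build_prompt(query, demos):
    # Assumed ordering: least similar first, so the most similar
    # demonstration sits directly before the query.
    parts = ["Assess the severity of each vulnerability."]
    for demo in reversed(demos):
        parts.append(
            f"Code:\n{demo.code}\nDescription: {demo.description}\nSeverity: {demo.severity}"
        )
    parts.append(f"Code:\n{query.code}\nDescription: {query.description}\nSeverity:")
    return "\n\n".join(parts)
```

The resulting prompt string would then be sent to the LLM. Swapping `jaccard` for embedding-based or syntax-tree-based measures, and tuning `k`, `code_weight`, and the ordering, corresponds to the kinds of components the ablation studies examine.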

Results:

We evaluate the effectiveness of SVA-ICL using a large-scale dataset comprising 12,071 C/C++ vulnerabilities. Experimental results demonstrate that SVA-ICL outperforms state-of-the-art SVA baselines in terms of Accuracy, F1-score, and MCC measures. Furthermore, ablation studies highlight the significance of component customization in SVA-ICL, such as the number of demonstrations, the demonstration ordering strategy, and the optimal fusion ratio of different modalities.
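For readers unfamiliar with the three reported measures, the following sketch computes them from predicted and true labels. It is only an illustration: the abstract does not say how F1 is averaged across classes, so macro averaging is an assumption, and the multiclass form of the Matthews correlation coefficient (MCC) is used so that it also covers the binary case.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    # Fraction of samples whose predicted label matches the true label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    # Per-class F1 = 2TP / (2TP + FP + FN), averaged with equal class weight
    # (macro averaging is an assumption, not stated in the abstract).
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mcc(y_true, y_pred):
    # Multiclass MCC: (c*s - sum_k p_k*t_k) /
    #   sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2)),
    # where s = samples, c = correct predictions, t_k / p_k = how often
    # class k occurs in the true / predicted labels.
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)
    labels = set(t_counts) | set(p_counts)
    cov = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    norm = sqrt(
        (s * s - sum(v * v for v in p_counts.values()))
        * (s * s - sum(v * v for v in t_counts.values()))
    )
    return cov / norm if norm else 0.0
```

Unlike Accuracy, MCC stays near zero for a classifier that always predicts the majority class, which makes it a useful complement when severity classes are imbalanced.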

Conclusion:

Our findings suggest that leveraging ICL with information fusion can effectively improve LLM-based SVA, warranting further research in this direction.


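Source journal

Information and Software Technology (Engineering and Technology - Computer Science: Software Engineering)

CiteScore: 9.10

Self-citation rate: 7.70%

Articles published: 164

Review time: 9.6 weeks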

About the journal: Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development.

Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development

Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "Negative" results and much more. Read the Guide for authors for more information. The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.