MCL-VD: Multi-modal contrastive learning with LoRA-enhanced GraphCodeBERT for effective vulnerability detection

IF 3.1 | Zone 2 (Computer Science) | Q3 (COMPUTER SCIENCE, SOFTWARE ENGINEERING)

Automated Software Engineering | Pub Date: 2025-07-28 | DOI: 10.1007/s10515-025-00543-3

Yi Cao, Xiaolin Ju, Xiang Chen, Lina Gong


Citations: 0

Abstract



Vulnerability detection in software systems is a critical challenge due to the increasing complexity of code and the rising frequency of security vulnerabilities. Traditional approaches typically rely on single-modality inputs and struggle to distinguish between similar code snippets. However, multi-modal methods find it challenging to balance performance and efficiency. To address these challenges, we propose MCL-VD, a framework that leverages multi-modal inputs including source code, code comments, and AST to capture complementary structural and contextual information. We employ LoRA, which reduces the computational burden by optimizing the number of trainable parameters without sacrificing performance. Additionally, we apply multi-modal contrastive learning to align and differentiate the representations across the three modalities, thereby enhancing the model’s discriminative power and robustness. We designed and conducted experiments on three public benchmark datasets, i.e., Devign, Reveal, and Big-Vul. The experimental results show that MCL-VD significantly outperforms the best-performing baselines, achieving F1-score improvements ranging from 4.86% to 17.26%. These results highlight the effectiveness of combining multi-modal contrastive learning with LoRA optimization, providing a powerful and efficient solution for vulnerability detection.
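The two mechanisms named in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not state the LoRA rank, the adapted layers, or the exact contrastive objective, so the rank of 8, the 768 x 768 layer shape, the InfoNCE-style loss, the temperature, and the toy embeddings below are all assumptions chosen for illustration. The first function counts the trainable parameters of a rank-r LoRA update (two factors of shape d_out x r and r x d_in) against a full weight matrix; the second computes a generic cross-modal InfoNCE loss in which the i-th embedding of one modality should match the i-th embedding of another.

```python
import math


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i], not positives[j != i]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


# Assumed layer shape and rank (hypothetical, not taken from the paper).
full = 768 * 768
lora = lora_trainable_params(768, 768, rank=8)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4f}")

# Toy 2-D embeddings standing in for two modalities (e.g. source code and AST).
code_emb = [[1.0, 0.0], [0.0, 1.0]]
ast_emb_aligned = [[0.9, 0.1], [0.1, 0.9]]  # matching pairs point the same way
ast_emb_swapped = [[0.1, 0.9], [0.9, 0.1]]  # matching pairs point apart
print(info_nce(code_emb, ast_emb_aligned) < info_nce(code_emb, ast_emb_swapped))
```

Under these assumed numbers the LoRA factors hold 12,288 parameters against 589,824 for the full matrix, and the loss is lower when paired embeddings from the two modalities are aligned than when they are swapped. With three modalities, one plausible arrangement is to sum such a loss over each pair of modalities, though the abstract does not say how the paper combines them.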


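Source journal

Automated Software Engineering (Engineering & Technology - Computer Science: Software Engineering)

CiteScore: 4.80
Self-citation rate: 11.80%
Articles published: not listed
Review time: >12 weeks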

About the journal: This journal details research, tutorial papers, surveys, and accounts of significant industrial experience in the foundations, techniques, tools, and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes. Coverage in Automated Software Engineering examines both automatic systems and collaborative systems as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences, and workshops.