MCL-VD: Multi-modal contrastive learning with LoRA-enhanced GraphCodeBERT for effective vulnerability detection
Authors: Yi Cao, Xiaolin Ju, Xiang Chen, Lina Gong
Journal: Automated Software Engineering 32(2), published 2025-07-28
Journal metrics: impact factor 3.1; JCR Q3 (Computer Science, Software Engineering)
DOI: 10.1007/s10515-025-00543-3
URL: https://link.springer.com/article/10.1007/s10515-025-00543-3
Vulnerability detection in software systems is a critical challenge due to the increasing complexity of code and the rising frequency of security vulnerabilities. Traditional approaches typically rely on single-modality inputs and struggle to distinguish between similar code snippets, while multi-modal methods find it hard to balance performance and efficiency. To address these challenges, we propose MCL-VD, a framework that leverages multi-modal inputs, namely source code, code comments, and abstract syntax trees (ASTs), to capture complementary structural and contextual information. We employ LoRA, which reduces the computational burden by cutting the number of trainable parameters without sacrificing performance. Additionally, we apply multi-modal contrastive learning to align and differentiate the representations across the three modalities, thereby enhancing the model’s discriminative power and robustness. We designed and conducted experiments on three public benchmark datasets: Devign, Reveal, and Big-Vul. The experimental results show that MCL-VD significantly outperforms the best-performing baselines, achieving F1-score improvements ranging from 4.86% to 17.26%. These results highlight the effectiveness of combining multi-modal contrastive learning with LoRA optimization, providing a powerful and efficient solution for vulnerability detection.
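The two mechanisms named in the abstract can be illustrated with a small sketch. The abstract does not give the LoRA rank, the layer sizes, or the exact contrastive objective, so everything below is an assumption made for illustration: a 768 x 768 weight matrix with rank 8, an InfoNCE-style loss with cosine similarity, and the hypothetical helper names `lora_trainable_params`, `info_nce`, and `multimodal_contrastive_loss`. It is not the authors' implementation.

```python
import math
from itertools import combinations


def lora_trainable_params(d_in, d_out, rank):
    """Parameters in a rank-`rank` LoRA update B @ A for a d_out x d_in weight.

    A has shape (rank, d_in) and B has shape (d_out, rank); the original
    weight stays frozen, so only these two factors are trained.
    """
    return rank * (d_in + d_out)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i];
    every other row of `positives` acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


def multimodal_contrastive_loss(modalities, temperature=0.1):
    """Average symmetric InfoNCE over every pair of modalities (assumed pairing)."""
    pair_losses = [
        0.5 * (info_nce(x, y, temperature) + info_nce(y, x, temperature))
        for x, y in combinations(modalities, 2)
    ]
    return sum(pair_losses) / len(pair_losses)


if __name__ == "__main__":
    # Assumed sizes: one 768 x 768 projection, LoRA rank 8.
    full = 768 * 768
    lora = lora_trainable_params(768, 768, rank=8)
    print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4f}")

    # Toy 2-d embeddings for two samples in each of the three modalities.
    code = [[1.0, 0.0], [0.0, 1.0]]
    comment = [[0.9, 0.1], [0.1, 0.9]]
    ast = [[1.0, 0.2], [0.2, 1.0]]
    print("aligned loss:", multimodal_contrastive_loss([code, comment, ast]))
    print("misaligned loss:", multimodal_contrastive_loss([code, comment[::-1], ast]))
```

Under these assumptions the low-rank factors hold a small fraction of the parameters of the full matrix, and the loss is lower when the code, comment, and AST embeddings of the same sample point in similar directions than when one modality is paired with the wrong sample.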
About the journal:
This journal publishes research papers, tutorials, surveys, and accounts of significant industrial experience in the foundations, techniques, tools, and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes.
Coverage in Automated Software Engineering examines both automatic systems and collaborative systems as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences and workshops.