RFMC-CS: a representation fusion based multi-view momentum contrastive learning framework for code search

IF 3.1 | Zone 2, Computer Science | Q3, COMPUTER SCIENCE, SOFTWARE ENGINEERING

Automated Software Engineering | Pub Date: 2025-01-27 | DOI: 10.1007/s10515-025-00487-8

Gong Chen, Wenjie Liu, Xiaoyuan Xie



Abstract




Code search is a crucial task in software engineering: given a natural language query, it aims to retrieve relevant code from a codebase. Deep-learning-based code search methods have demonstrated impressive performance, and recent advances in contrastive learning have further enhanced the representation learning of these models. Despite these improvements, existing methods still have limitations in the representation learning of multi-modal data. Specifically, they suffer from semantic loss when learning code representations, and they fail to fully explore functionally relevant code pairs during representation learning. To address these limitations, we propose a Representation Fusion based Multi-View Momentum Contrastive Learning Framework for Code Search, named RFMC-CS. RFMC-CS effectively retains the semantic and structural information of code through multi-modal representation and fusion. Through an elaborately designed Multi-View Momentum Contrastive Learning scheme, RFMC-CS can further learn the correlations between different modalities of samples and between semantically relevant samples. Experimental results on the CodeSearchNet benchmark show that RFMC-CS outperforms seven advanced baselines on the MRR and Recall@k metrics. Ablation experiments illustrate the effectiveness of each component, and portability experiments show that RFMC-CS has good portability.
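For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how MRR and Recall@k are commonly computed for code search. It is not taken from the paper: the function names and the example ranks are hypothetical, and it assumes each query has exactly one correct code snippet whose 1-based position in the ranked results is already known.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank, where rank is the 1-based position of the
    correct code snippet in the results returned for each query."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def recall_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose correct snippet appears in the top k results."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks for four queries (illustrative only, not results from the paper).
example_ranks = [1, 2, 4, 10]
mrr = mean_reciprocal_rank(example_ranks)
r_at_1 = recall_at_k(example_ranks, 1)
r_at_5 = recall_at_k(example_ranks, 5)
print(f"MRR={mrr:.4f}  Recall@1={r_at_1:.2f}  Recall@5={r_at_5:.2f}")
```

Both metrics lie in [0, 1], and higher is better. MRR rewards placing the correct snippet near the top of the list, while Recall@k only checks whether it appears within the first k results.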


Source journal

Automated Software Engineering (Engineering and Technology; Computer Science: Software Engineering)

CiteScore: 4.80

Self-citation rate: 11.80%

Review time: >12 weeks

About the journal: This journal details research, tutorial papers, surveys, and accounts of significant industrial experience in the foundations, techniques, tools, and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes. Coverage in Automated Software Engineering examines both automatic systems and collaborative systems, as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences, and workshops.