PIONEER: improving the robustness of student models when compressing pre-trained models of code

IF 3.1 | Zone 2, Computer Science | JCR Q3, Computer Science, Software Engineering

Automated Software Engineering Pub Date : 2025-09-23 DOI:10.1007/s10515-025-00560-2

Xiangyue Liu, Xinwei Liu, Lili Bo, Xiaoxue Wu, Yun Yang, Xiaobing Sun, Feng Zhou



Abstract

Pre-trained models of code have proven highly effective across a variety of software engineering tasks, but their large size makes them difficult to deploy locally. Existing work mainly focuses on compressing these large models into small ones that achieve similar performance with efficient inference. However, it overlooks that the small models should also be robust to adversarial examples, which cause incorrect predictions to be delivered to users. Knowledge distillation techniques typically cast model compression as a combinatorial optimization problem over the student architecture space in order to find the best-performing student model, but they can improve the student's robustness only to a limited extent, through traditional adversarial training. This paper proposes PIONEER (ImProvIng the RObustness of StudeNt ModEls WhEn CompRessing Code Models), a novel knowledge distillation technique that enhances the robustness of the student model without requiring adversarial training. PIONEER incorporates robustness evaluation into distillation to guide the optimization of the student model architecture. By using the probability distributions of both original and adversarial examples as soft labels, the student model learns the features of both kinds of example during training. We evaluate PIONEER on two downstream tasks (vulnerability prediction and clone detection) with three models (CodeBERT, GraphCodeBERT, and CodeT5). We use PIONEER to compress the six downstream task models into small (3 MB) models that are 206\(\times\) smaller than the originals. The results show that the compressed models reduce inference latency (76\(\times\)) and improve robustness (87.54%) with a negligible loss of effectiveness (1.67%).
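The soft-label idea described in the abstract can be illustrated with a small sketch. This is not PIONEER's actual objective: the abstract does not give a formula, so the KL-divergence loss, the temperature, and the `adv_weight` mixing parameter below are assumptions chosen to show how a student could be trained against teacher distributions on both original and adversarial inputs.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def soft_label_loss(
    teacher_orig: Sequence[float],
    teacher_adv: Sequence[float],
    student_orig: Sequence[float],
    student_adv: Sequence[float],
    temperature: float = 2.0,   # assumed value, not from the paper
    adv_weight: float = 0.5,    # assumed mixing weight, not from the paper
) -> float:
    """Hypothetical distillation loss over one (original, adversarial) pair.

    All arguments are raw logits. The teacher's softened distributions act
    as soft labels for the student on both versions of the input.
    """
    loss_orig = kl_divergence(
        softmax(teacher_orig, temperature), softmax(student_orig, temperature)
    )
    loss_adv = kl_divergence(
        softmax(teacher_adv, temperature), softmax(student_adv, temperature)
    )
    return (1.0 - adv_weight) * loss_orig + adv_weight * loss_adv


# A student that matches the teacher on both inputs incurs no loss;
# one that disagrees on the adversarial input is penalized.
matched = soft_label_loss([2.0, 0.5], [1.5, 1.0], [2.0, 0.5], [1.5, 1.0])
fooled = soft_label_loss([2.0, 0.5], [1.5, 1.0], [2.0, 0.5], [-1.0, 3.0])
```

In this sketch a student that tracks the teacher only on original inputs still pays a penalty on the adversarial term, which is one plausible way to encourage robustness without a separate adversarial training stage.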




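Source journal

Automated Software Engineering (Engineering & Technology: Computer Science, Software Engineering)

CiteScore: 4.80

Self-citation rate: 11.80%

Publication volume:

Review time: >12 weeks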

About the journal: This journal publishes research papers, tutorial papers, surveys, and accounts of significant industrial experience in the foundations, techniques, tools, and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes. Coverage in Automated Software Engineering examines both automatic systems and collaborative systems, as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences, and workshops.