DPEfficR: a data and parameter efficient approach for training neural API recommendation model

IF 3.1, Zone 2 (Computer Science), JCR Q3 (Computer Science, Software Engineering)
Haibo Yu, Xiaohong Han, Simin Chen, Xiaoning Feng, Guangzhao Sun, Wei Yang
Automated Software Engineering 32 (2). Published 2025-07-16.
DOI: 10.1007/s10515-025-00530-8
https://link.springer.com/article/10.1007/s10515-025-00530-8

Abstract

Recommending application programming interfaces (APIs) is a practical and essential part of modern programming, and an accurate API recommendation system can significantly improve developers' coding efficiency. State-of-the-art (SOTA) API recommendation systems typically use deep learning models as their backend. Training such a backend model is challenging, however, because it requires significant data-labeling effort and extensive computation. These costs weigh especially heavily when an existing API recommendation system must be updated as the API evolves. To address these issues, this paper proposes DPEfficR, a data- and parameter-efficient method for building API recommendation systems. DPEfficR comprises three modules: (1) a data selection module, (2) a task-specific parameter tuning module, and (3) a runtime API selection module. The data selection module selects representative data, while the task-specific parameter tuning module tunes pre-trained large language models (LLMs) by updating only a small number of parameters. Once the LLM is tuned, the runtime API selection module searches for a more accurate API sequence through consistency checking. We compare our approach against seven baseline methods of three different types. Our evaluation shows that DPEfficR recommends more accurate API sequences, improving BLEU-4 by 40% and ROUGE-2 by 25% over the baseline methods with only \(3.61 \times 10^{4}\) tunable parameters, just 0.049% of the parameters used in the baseline methods. Our ablation study further demonstrates the effectiveness of the proposed modules.
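The abstract does not spell out how consistency checking works, so the following is only a minimal sketch of one plausible reading: draw several candidate API sequences from the tuned model and keep the one that agrees most with the others. The agreement score (a bigram-overlap F1, loosely in the spirit of ROUGE-2), the function names, and the example API sequences are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def bigrams(seq: Sequence[str]) -> Counter:
    """Multiset of adjacent API pairs in a sequence."""
    return Counter(zip(seq, seq[1:]))


def bigram_f1(a: Sequence[str], b: Sequence[str]) -> float:
    """Symmetric bigram-overlap F1 between two API sequences."""
    ba, bb = bigrams(a), bigrams(b)
    overlap = sum((ba & bb).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(ba.values())
    recall = overlap / sum(bb.values())
    return 2 * precision * recall / (precision + recall)


def select_consistent(candidates: List[List[str]]) -> List[str]:
    """Return the candidate with the highest mean agreement with the rest.

    Assumption: "consistency" is approximated by pairwise bigram overlap.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    if len(candidates) == 1:
        return candidates[0]

    def agreement(i: int) -> float:
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(bigram_f1(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=agreement)
    return candidates[best]


# Hypothetical candidates, standing in for sequences sampled from a model.
candidates = [
    ["File.new", "FileReader.new", "BufferedReader.new", "BufferedReader.readLine"],
    ["FileReader.new", "BufferedReader.new", "BufferedReader.readLine", "BufferedReader.close"],
    ["FileReader.new", "BufferedReader.new", "BufferedReader.readLine"],
    ["Scanner.new", "Scanner.nextLine"],
]
print(select_consistent(candidates))
```

In this toy input the third candidate is selected because its API pairs appear in two of the other three candidates, while the `Scanner` outlier shares nothing with the rest. A real system might weight candidates by model likelihood or use a different similarity measure.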

Source journal: Automated Software Engineering (Engineering & Technology, Computer Science: Software Engineering)
CiteScore: 4.80
Self-citation rate: 11.80%
Articles published: 51
Review time: >12 weeks
Journal description: This journal publishes research, tutorial papers, surveys, and accounts of significant industrial experience in the foundations, techniques, tools, and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes. Coverage in Automated Software Engineering examines both automatic systems and collaborative systems as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences, and workshops.