E-code: Mastering efficient code generation through pretrained models and expert encoder group

IF 3.8 2区 计算机科学 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS
Yue Pan , Chen Lyu , Zhenyu Yang , Lantian Li , Qi Liu , Xiuting Shao
{"title":"E-code: Mastering efficient code generation through pretrained models and expert encoder group","authors":"Yue Pan ,&nbsp;Chen Lyu ,&nbsp;Zhenyu Yang ,&nbsp;Lantian Li ,&nbsp;Qi Liu ,&nbsp;Xiuting Shao","doi":"10.1016/j.infsof.2024.107602","DOIUrl":null,"url":null,"abstract":"<div><h3>Context:</h3><div>With the waning of Moore’s Law, the software industry is placing increasing importance on finding alternative solutions for continuous performance enhancement. The significance and research results of software performance optimization have been on the rise in recent years, especially with the advancement propelled by <strong>L</strong>arge <strong>L</strong>anguage <strong>M</strong>odel<strong>s</strong> (LLMs). However, traditional strategies for rectifying performance flaws have shown significant limitations at the competitive code efficiency optimization level, and research on this topic is surprisingly scarce.</div></div><div><h3>Objective:</h3><div>This study aims to address the research gap in this domain, offering practical solutions to the various challenges encountered. Specifically, we have overcome the constraints of traditional performance error rectification strategies and developed a <strong>L</strong>anguage <strong>M</strong>odel (LM) tailored for the competitive code efficiency optimization realm.</div></div><div><h3>Methods:</h3><div>We introduced E-code, an advanced program synthesis LM. Inspired by the recent success of expert LMs, we designed an innovative structure called the Expert Encoder Group. This structure employs multiple expert encoders to extract features tailored for different input types. We assessed the performance of E-code against other leading models on a competitive dataset and conducted in-depth ablation experiments.</div></div><div><h3>Results:</h3><div>Upon systematic evaluation, E-code achieved a 54.98% improvement in code efficiency, significantly outperforming other advanced models. In the ablation experiments, we further validated the significance of the expert encoder group and other components within E-code.</div></div><div><h3>Conclusion:</h3><div>The research findings indicate that the expert encoder group can effectively handle various inputs in efficiency optimization tasks, significantly enhancing the model’s performance. In summary, this study paves new avenues for developing systems and methods to assist programmers in writing efficient code.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"178 ","pages":"Article 107602"},"PeriodicalIF":3.8000,"publicationDate":"2024-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information and Software Technology","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0950584924002076","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Context:

With the waning of Moore’s Law, the software industry is placing increasing importance on finding alternative solutions for continuous performance enhancement. The significance and research results of software performance optimization have been on the rise in recent years, especially with the advancement propelled by Large Language Models (LLMs). However, traditional strategies for rectifying performance flaws have shown significant limitations at the competitive code efficiency optimization level, and research on this topic is surprisingly scarce.

Objective:

This study aims to address the research gap in this domain, offering practical solutions to the various challenges encountered. Specifically, we have overcome the constraints of traditional performance error rectification strategies and developed a Language Model (LM) tailored for the competitive code efficiency optimization realm.

Methods:

We introduced E-code, an advanced program synthesis LM. Inspired by the recent success of expert LMs, we designed an innovative structure called the Expert Encoder Group. This structure employs multiple expert encoders to extract features tailored for different input types. We assessed the performance of E-code against other leading models on a competitive dataset and conducted in-depth ablation experiments.

Results:

Upon systematic evaluation, E-code achieved a 54.98% improvement in code efficiency, significantly outperforming other advanced models. In the ablation experiments, we further validated the significance of the expert encoder group and other components within E-code.

Conclusion:

The research findings indicate that the expert encoder group can effectively handle various inputs in efficiency optimization tasks, significantly enhancing the model’s performance. In summary, this study paves new avenues for developing systems and methods to assist programmers in writing efficient code.
电子编码:通过预训练模型和专家编码器小组掌握高效代码生成
背景:随着摩尔定律的消退,软件行业越来越重视寻找其他解决方案来不断提高性能。近年来,软件性能优化的重要性和研究成果不断增加,尤其是在大型语言模型(LLM)的推动下。然而,传统的性能缺陷修正策略在竞争性代码效率优化层面显示出明显的局限性,而这方面的研究却少得令人吃惊。具体来说,我们克服了传统性能纠错策略的限制,开发了一种专为竞争性代码效率优化领域量身定制的语言模型(LM)。受最近成功的专家 LM 的启发,我们设计了一种名为专家编码器组的创新结构。这种结构采用多个专家编码器,针对不同的输入类型提取特征。结果:经过系统评估,E-code 的代码效率提高了 54.98%,明显优于其他先进模型。在消融实验中,我们进一步验证了专家编码器组和 E-code 中其他组件的重要性。结论:研究结果表明,专家编码器组能有效处理效率优化任务中的各种输入,从而大大提高了模型的性能。总之,这项研究为开发帮助程序员编写高效代码的系统和方法铺平了新的道路。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Information and Software Technology
Information and Software Technology 工程技术-计算机:软件工程
CiteScore
9.10
自引率
7.70%
发文量
164
审稿时长
9.6 weeks
期刊介绍: Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal''s scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include: • Software management, quality and metrics, • Software processes, • Software architecture, modelling, specification, design and programming • Functional and non-functional software requirements • Software testing and verification & validation • Empirical studies of all aspects of engineering and managing software development Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "Negative" results and much more. Read the Guide for authors for more information. The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premiere outlet for systematic literature studies in software engineering.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信