Yue Pan, Chen Lyu, Zhenyu Yang, Lantian Li, Qi Liu, Xiuting Shao
Information and Software Technology, vol. 178, Article 107602. Published 2024-10-20. DOI: 10.1016/j.infsof.2024.107602. https://www.sciencedirect.com/science/article/pii/S0950584924002076
E-code: Mastering efficient code generation through pretrained models and expert encoder group
Context:
With the waning of Moore’s Law, the software industry is placing increasing importance on alternative ways to keep improving performance. Research on software performance optimization has grown in both prominence and volume in recent years, driven in particular by advances in Large Language Models (LLMs). However, traditional strategies for rectifying performance flaws have shown significant limitations at the level of competitive code efficiency optimization, and research on this topic is surprisingly scarce.
Objective:
This study aims to address the research gap in this domain, offering practical solutions to the various challenges encountered. Specifically, we have overcome the constraints of traditional performance error rectification strategies and developed a Language Model (LM) tailored for the competitive code efficiency optimization realm.
Methods:
We introduced E-code, an advanced program synthesis LM. Inspired by the recent success of expert LMs, we designed an innovative structure called the Expert Encoder Group. This structure employs multiple expert encoders to extract features tailored for different input types. We assessed the performance of E-code against other leading models on a competitive dataset and conducted in-depth ablation experiments.
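As an illustration of the idea of routing each input type to its own encoder, the following is a minimal sketch in plain Python. It is not the E-code architecture, which the abstract does not specify: the input type names (`problem_description`, `slow_code`), the toy character-hashing encoder, and the choice to concatenate expert outputs are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]


def make_toy_encoder(dim: int, salt: int) -> Callable[[str], Vector]:
    """Build a hypothetical stand-in for a learned encoder.

    It hashes characters into `dim` buckets and normalizes the counts.
    `salt` only makes different experts produce different features.
    """
    def encode(text: str) -> Vector:
        vec = [0.0] * dim
        for ch in text:
            vec[(ord(ch) + salt) % dim] += 1.0
        total = sum(vec) or 1.0  # avoid division by zero on empty input
        return [v / total for v in vec]

    return encode


@dataclass
class ExpertEncoderGroup:
    """Holds one encoder per input type (assumed design, for illustration)."""

    experts: Dict[str, Callable[[str], Vector]]

    def encode(self, inputs: Dict[str, str]) -> Vector:
        # Each input is encoded by the expert registered for its type;
        # the per-expert features are concatenated in registration order.
        features: Vector = []
        for input_type, expert in self.experts.items():
            if input_type not in inputs:
                raise KeyError(f"missing input of type {input_type!r}")
            features.extend(expert(inputs[input_type]))
        return features


# Hypothetical input types; the paper may use different ones.
group = ExpertEncoderGroup(
    experts={
        "problem_description": make_toy_encoder(dim=4, salt=0),
        "slow_code": make_toy_encoder(dim=4, salt=1),
    }
)
features = group.encode(
    {
        "problem_description": "Sum the integers from 1 to n.",
        "slow_code": "total = 0\nfor i in range(1, n + 1):\n    total += i",
    }
)
```

In a real model each expert would be a trained neural encoder and the combined features would feed a decoder; the sketch only shows the dispatch-by-input-type structure.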
Results:
Upon systematic evaluation, E-code achieved a 54.98% improvement in code efficiency, significantly outperforming other advanced models. In the ablation experiments, we further validated the significance of the expert encoder group and other components within E-code.
Conclusion:
The research findings indicate that the expert encoder group can effectively handle various inputs in efficiency optimization tasks, significantly enhancing the model’s performance. In summary, this study paves new avenues for developing systems and methods to assist programmers in writing efficient code.
About the journal:
Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development
Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "Negative" results and much more. Read the Guide for authors for more information.
The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.