Automatic Code Generation Techniques: A Systematic Literature Review

IF 3.1; Zone 2 (Computer Science); JCR Q3 (COMPUTER SCIENCE, SOFTWARE ENGINEERING)

Automated Software Engineering. Pub Date: 2025-09-12. DOI: 10.1007/s10515-025-00551-3

Maha Alharbi, Mohammad Alshayeb


Citations: 0

Abstract

As modern software systems become complex and the demand for rapid development cycles increases, automatic code generation techniques have attained a prominent focus in academic research and industrial practice. These techniques can significantly reduce human error, increase productivity, and ensure consistency across large codebases. However, the task of generating code automatically presents significant challenges. In this study, we investigate, identify, and analyze the existing automatic techniques for generating code from various input formats, highlighting their efficiencies and areas for potential improvement. A Systematic Literature Review (SLR) is conducted to systematically summarize and review 76 primary studies related to automatic code generation in the software engineering domain. The selected studies are investigated from several dimensions: paradigms, techniques, input types, intermediate representations, tool support, targeted programming languages, and validation methods, including performance metrics, datasets, and benchmarking status. Our investigation identified 12 main techniques, categorized into five paradigms, where the Model-to-Code paradigm and model-driven techniques are the most prevalent. Notably, 57% of the studies utilized Java, and a limited number of studies showed multilingual support. Furthermore, 72% of the selected studies did not compare their results with existing techniques, and 17% lacked validation of the proposed techniques. We also noticed a lack of detailed information about the datasets used in the validation process, where 52% of the studies omitted these details. This SLR provides several recommendations to enhance methodological rigor in future research, and it highlights opportunities for leveraging emerging technologies to improve the efficiency of the identified automatic code generation techniques.


Source journal

Automated Software Engineering (Engineering and Technology: Computer Science, Software Engineering)

CiteScore

4.80

Self-citation rate

11.80%

Articles published

Review time

>12 weeks

About the journal: This journal details research, tutorial papers, surveys, and accounts of significant industrial experience in the foundations, techniques, tools, and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes. Coverage in Automated Software Engineering examines both automatic systems and collaborative systems as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences, and workshops.