{"title":"A systematic exploration of C-to-rust code translation based on large language models: prompt strategies and automated repair","authors":"Ruxin Zhang, Shanxin Zhang, Linbo Xie","doi":"10.1007/s10515-025-00570-0","DOIUrl":null,"url":null,"abstract":"<div><p>C is widely used in system programming due to its low-level flexibility. However, as demands for memory safety and code reliability grow, Rust has become a more favorable alternative owing to its modern design principles. Migrating existing C code to Rust has therefore emerged as a key approach for enhancing the security and maintainability of software systems. Nevertheless, automating such migrations remains challenging due to fundamental differences between the two languages in terms of language design philosophy, type systems, and levels of abstraction. Most current code transformation tools focus on mappings of basic data types and syntactic replacements, such as handling pointers or conversion of lock mechanisms. These approaches often fail to deeply model the semantic features and programming paradigms of the target language. To address this limitation, this paper proposes RustFlow, a C-to-Rust code translation framework based on large language models (LLMs), designed to generate idiomatic and semantically accurate Rust code. This framework employs a multi-stage progressive architecture, which decomposes the overall translation task into several sequential stages, namely translation, validation, and repair. During the translation phase, a collaborative prompting strategy is employed to guide the LLM in achieving cross-language semantic alignment, thereby improving the accuracy of the generated code. Subsequently, a validation mechanism is introduced to perform syntactic and semantic checks on the generated output, and a conversational iterative repair strategy is employed to further enhance the quality of the final result. Experimental results show that RustFlow outperforms most of the latest baseline approaches, achieving an average improvement of 50.67% in translation performance compared to the base LLM. This work offers a novel technical approach and practical support for efficient and reliable cross-language code migration.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"33 1","pages":""},"PeriodicalIF":3.1000,"publicationDate":"2025-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Automated Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10515-025-00570-0","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0
Abstract
C is widely used in system programming due to its low-level flexibility. However, as demands for memory safety and code reliability grow, Rust has become a more favorable alternative owing to its modern design principles. Migrating existing C code to Rust has therefore emerged as a key approach for enhancing the security and maintainability of software systems. Nevertheless, automating such migrations remains challenging due to fundamental differences between the two languages in terms of language design philosophy, type systems, and levels of abstraction. Most current code transformation tools focus on mappings of basic data types and syntactic replacements, such as handling pointers or conversion of lock mechanisms. These approaches often fail to deeply model the semantic features and programming paradigms of the target language. To address this limitation, this paper proposes RustFlow, a C-to-Rust code translation framework based on large language models (LLMs), designed to generate idiomatic and semantically accurate Rust code. This framework employs a multi-stage progressive architecture, which decomposes the overall translation task into several sequential stages, namely translation, validation, and repair. During the translation phase, a collaborative prompting strategy is employed to guide the LLM in achieving cross-language semantic alignment, thereby improving the accuracy of the generated code. Subsequently, a validation mechanism is introduced to perform syntactic and semantic checks on the generated output, and a conversational iterative repair strategy is employed to further enhance the quality of the final result. Experimental results show that RustFlow outperforms most of the latest baseline approaches, achieving an average improvement of 50.67% in translation performance compared to the base LLM. This work offers a novel technical approach and practical support for efficient and reliable cross-language code migration.
期刊介绍:
This journal details research, tutorial papers, survey and accounts of significant industrial experience in the foundations, techniques, tools and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes.
Coverage in Automated Software Engineering examines both automatic systems and collaborative systems as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences and workshops.