{"title":"Enhancing Code Transformation in Large Language Models Through Retrieval-Augmented Fine-Tuning","authors":"Jing-Ming Guo;Po-Yang Liu;Yi-Chong Zeng;Ting-Ju Chen","doi":"10.1109/TCE.2025.3565294","DOIUrl":null,"url":null,"abstract":"Large language models (LLMs) have made substantial advancements in knowledge reasoning and are increasingly utilized in specialized domains such as code completion, legal analysis, and medical transcription, where accuracy is paramount. In such applications, document-specific precision is more critical than general reasoning capabilities. This paper proposes a novel approach based on Retrieval-Augmented Fine-Tuning (RAFT) to enhance model-generated outputs, particularly in code transformation tasks. RAFT integrates domain-specific knowledge, optimizing in-domain retrieval-augmented generation by training the model to discern the relationship between prompts, retrieved documents, and target outputs. This enables the model to extract relevant information while mitigating the impact of noise. Experimental results demonstrate that the proposed method improves accuracy of 2.4% and CodeBLEU of 1.3% for VB-to-C# code conversion, highlighting its effectiveness in domain-specific applications.","PeriodicalId":13208,"journal":{"name":"IEEE Transactions on Consumer Electronics","volume":"71 1","pages":"2342-2346"},"PeriodicalIF":4.3000,"publicationDate":"2025-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Consumer Electronics","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10979988/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Abstract
Large language models (LLMs) have made substantial advancements in knowledge reasoning and are increasingly utilized in specialized domains such as code completion, legal analysis, and medical transcription, where accuracy is paramount. In such applications, document-specific precision is more critical than general reasoning capability. This paper proposes a novel approach based on Retrieval-Augmented Fine-Tuning (RAFT) to improve model-generated outputs, particularly in code transformation tasks. RAFT integrates domain-specific knowledge, optimizing in-domain retrieval-augmented generation by training the model to discern the relationship between prompts, retrieved documents, and target outputs. This enables the model to extract relevant information while mitigating the impact of noise. Experimental results demonstrate that the proposed method improves accuracy by 2.4% and CodeBLEU by 1.3% on VB-to-C# code conversion, highlighting its effectiveness in domain-specific applications.
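The data recipe the abstract describes, training the model on prompts that contain both relevant and noisy retrieved documents, can be made concrete with a short sketch. The paper does not publish its implementation, so the Python below is a minimal, hypothetical illustration of the general RAFT data-construction idea applied to VB-to-C# conversion; all names (build_raft_example, golden_doc, distractor_pool, golden_prob) are assumptions, not the authors' code.

```python
import random

def build_raft_example(vb_snippet, target_csharp, golden_doc,
                       distractor_pool, num_distractors=3, golden_prob=0.8):
    """Assemble one RAFT-style fine-tuning example for VB-to-C# conversion.

    Hypothetical helper: it mixes one relevant reference document with
    distractor documents so the model learns to pick out useful context
    and ignore retrieval noise.
    """
    docs = random.sample(distractor_pool, num_distractors)
    # With probability golden_prob the relevant document appears in the
    # prompt; the remaining examples contain only distractors, which pushes
    # the model to fall back on knowledge absorbed during fine-tuning.
    if random.random() < golden_prob:
        docs.append(golden_doc)
    random.shuffle(docs)
    context = "\n\n".join(f"[Doc {i + 1}]\n{d}" for i, d in enumerate(docs))
    prompt = (f"{context}\n\n"
              f"Convert the following VB code to C#:\n{vb_snippet}")
    # The completion is the reference C# translation; (prompt, completion)
    # pairs like this feed a standard supervised fine-tuning loop.
    return {"prompt": prompt, "completion": target_csharp}
```

At inference time the same prompt layout would be filled with documents returned by the live retriever, so the training and deployment distributions match; this is what lets the fine-tuned model tolerate imperfect retrieval.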
Journal Introduction:
The main focus of the IEEE Transactions on Consumer Electronics is the engineering and research aspects of the theory, design, construction, manufacture, or end use of mass-market electronics, systems, software, and services for consumers.