Title: FEDS-ICL: Enhancing translation ability and efficiency of large language model by optimizing demonstration selection
Authors: Shaolin Zhu, Leiyu Pan, Deyi Xiong
DOI: 10.1016/j.ipm.2024.103825
Journal: Information Processing & Management (JCR Q1, Computer Science, Information Systems; Impact Factor 7.4)
Published: 2024-07-03 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0306457324001845
Citations: 0
Abstract
Large language models (LLMs) that exhibit a remarkable in-context learning (ICL) ability with bilingual demonstrations have been recognized as a potential solution for machine translation. However, selecting these demonstrations from vast datastores is notoriously time-consuming and inefficient. Moreover, strategies for designing effective in-context demonstrations are not well established. To address these critical gaps, we introduce a novel Fast and Effective approach for Demonstration Selection in-Context learning (FEDS-ICL) tailored to LLMs. Our method is designed mainly to enhance the efficiency and accuracy of LLM translation. Our approach revolutionizes demonstration selection by designing a new product quantization technique that rapidly extracts neighboring target tokens from a strategically curated subset of sentences. This method departs significantly from the conventional exhaustive search across entire datastores, leading to a remarkable increase in speed. Furthermore, FEDS-ICL pioneers an innovative template design for in-context demonstrations, specifically crafted to amplify the translation capabilities of multilingual LLMs. In experiments, we compare FEDS-ICL with various existing methods across diverse language pairs on ten different LLMs. The results reveal an up to 2.1-fold increase in selection speed and an impressive enhancement in translation accuracy, outperforming existing baselines by up to 2.0 BLEU points on ten different LLMs. The ablation study shows that the proposed product quantization and multi-view demonstrations can effectively enhance the efficiency and accuracy of LLMs in machine translation. The robustness analysis of FEDS-ICL shows a positive correlation between the number of contextually rich demonstrations and the translation quality of LLMs.
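The abstract does not give the details of the paper's product quantization technique, so the following is only a minimal, illustrative sketch of how product quantization speeds up nearest-neighbor demonstration retrieval in general: vectors are split into sub-vectors, each subspace is quantized against a small codebook, and queries are scored with precomputed lookup tables instead of full distance computations. All names, sizes, and the toy embedding data here are assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_pq(X, M=4, K=16, iters=10):
    """Train a product quantizer: split each D-dim vector into M sub-vectors
    and learn K centroids per subspace with a few k-means iterations."""
    N, D = X.shape
    d = D // M
    codebooks = []
    for m in range(M):
        sub = X[:, m * d:(m + 1) * d]
        cent = sub[rng.choice(N, K, replace=False)]  # random init
        for _ in range(iters):
            # assign each sub-vector to its nearest centroid, then update
            dist = ((sub[:, None, :] - cent[None, :, :]) ** 2).sum(-1)
            assign = dist.argmin(1)
            for k in range(K):
                pts = sub[assign == k]
                if len(pts):
                    cent[k] = pts.mean(0)
        codebooks.append(cent)
    return codebooks

def encode(X, codebooks):
    """Compress each vector to M one-byte codes (its nearest centroid ids)."""
    M = len(codebooks)
    d = X.shape[1] // M
    codes = np.empty((X.shape[0], M), dtype=np.uint8)
    for m, cent in enumerate(codebooks):
        sub = X[:, m * d:(m + 1) * d]
        codes[:, m] = ((sub[:, None, :] - cent[None, :, :]) ** 2).sum(-1).argmin(1)
    return codes

def search(q, codes, codebooks, topk=4):
    """Asymmetric distance computation: precompute query-to-centroid tables
    once, then score every stored code by table lookup instead of computing
    full vector distances -- this is where the speedup comes from."""
    M = len(codebooks)
    d = q.shape[0] // M
    tables = np.stack([((codebooks[m] - q[m * d:(m + 1) * d]) ** 2).sum(-1)
                       for m in range(M)])           # shape (M, K)
    dists = tables[np.arange(M), codes].sum(1)       # lookup + sum per item
    return np.argsort(dists)[:topk]

# toy datastore of 500 random 64-dim "sentence embeddings" (assumed data)
X = rng.standard_normal((500, 64)).astype(np.float32)
books = train_pq(X, M=4, K=16)
codes = encode(X, books)
nn = search(X[0], codes, books, topk=4)
# index 0 (the query itself) should appear among the returned neighbors
print(nn)
```

In practice, libraries such as Faiss provide optimized product quantization indexes; the sketch above only shows the principle that distances over compressed codes reduce to a handful of table lookups per candidate.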
These advancements position FEDS-ICL as a transformative methodology in the domain of machine translation and pattern analysis, marking a significant leap towards more efficient and precise machine translation.
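The abstract mentions a multi-view demonstration template but does not specify its layout. As a hedged illustration only, a generic few-shot translation prompt of the kind such templates build on can be assembled like this; the function name, language pair, and tag format are all assumptions, not the FEDS-ICL template itself.

```python
def build_prompt(demos, source, src_lang="German", tgt_lang="English"):
    """Assemble a few-shot translation prompt: each retrieved demonstration
    becomes a source/target pair, and the final source line is left open
    for the LLM to complete with its translation."""
    blocks = [f"{src_lang}: {s}\n{tgt_lang}: {t}" for s, t in demos]
    blocks.append(f"{src_lang}: {source}\n{tgt_lang}:")
    return "\n\n".join(blocks)

# hypothetical demonstrations, e.g. retrieved by nearest-neighbor selection
demos = [("Guten Morgen.", "Good morning."),
         ("Wie geht es dir?", "How are you?")]
prompt = build_prompt(demos, "Danke schön.")
print(prompt)
```

The prompt ends with an open `English:` slot, so the model's continuation is the translation; demonstration order and formatting are known to affect ICL quality, which is part of what template design work like this paper investigates.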
Journal description:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology, marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.