Large Language Models With Contrastive Decoding Algorithm for Hallucination Mitigation in Low-Resource Languages

IF 7.3 | CAS Region 2 (Computer Science) | JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
Zan Hongying, Arifa Javed, Muhammad Abdullah, Javed Rashid, Muhammad Faheem
DOI: 10.1049/cit2.70004
Journal: CAAI Transactions on Intelligence Technology, vol. 10, no. 4, pp. 1104–1117
Publication date: 2025-04-03 (Journal Article)
Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70004
Citations: 0

Abstract



Neural machine translation (NMT) has advanced with deep learning and large-scale multilingual models, yet translating low-resource languages often lacks sufficient training data and leads to hallucinations. This often results in translated content that diverges significantly from the source text. This research proposes a refined Contrastive Decoding (CD) algorithm that dynamically adjusts weights of log probabilities from strong expert and weak amateur models to mitigate hallucinations in low-resource NMT and improve translation quality. Advanced large language NMT models, including ChatGLM and LLaMA, are fine-tuned and implemented for their superior contextual understanding and cross-lingual capabilities. The refined CD algorithm evaluates multiple candidate translations using BLEU score, semantic similarity, and Named Entity Recognition accuracy. Extensive experimental results show substantial improvements in translation quality and a significant reduction in hallucination rates. Fine-tuned models achieve higher evaluation metrics compared to baseline models and state-of-the-art models. An ablation study confirms the contributions of each methodological component and highlights the effectiveness of the refined CD algorithm and advanced models in mitigating hallucinations. Notably, the refined methodology increased the BLEU score by approximately 30% compared to baseline models.
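The paper's refined CD algorithm dynamically reweights the log probabilities of a strong expert model against a weak amateur model; the exact reweighting scheme is not given in this abstract. A minimal sketch of a standard contrastive-decoding step is shown below, where `beta` (the amateur penalty weight) and `alpha` (the expert plausibility cutoff) are illustrative parameter names, not the authors':

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def contrastive_decode_step(expert_logits, amateur_logits, beta=0.5, alpha=0.1):
    """Pick the next token by rewarding the expert and penalising the amateur.

    Tokens whose expert probability falls below alpha * max expert
    probability are masked out as implausible before scoring.
    """
    exp_lp = log_softmax(expert_logits)
    ama_lp = log_softmax(amateur_logits)
    cutoff = math.log(alpha) + max(exp_lp)  # plausibility threshold
    best, best_score = None, -math.inf
    for i, (e, a) in enumerate(zip(exp_lp, ama_lp)):
        if e < cutoff:
            continue  # implausible under the expert: skip
        score = e - beta * a  # high expert confidence, low amateur confidence wins
        if score > best_score:
            best, best_score = i, score
    return best
```

When the amateur is confident in a token that the expert only marginally prefers (a typical hallucination pattern), the penalty term can flip the choice to a token the expert supports but the amateur does not.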

Source journal
CAAI Transactions on Intelligence Technology (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
CiteScore: 11.00
Self-citation rate: 3.90%
Articles per year: 134
Review time: 35 weeks
Journal description: CAAI Transactions on Intelligence Technology is a leading venue for original research on the theoretical and experimental aspects of artificial intelligence technology. It is a fully open access journal co-published by the Institution of Engineering and Technology (IET) and the Chinese Association for Artificial Intelligence (CAAI), providing research that is openly accessible to read and share worldwide.