MCAD-EUC: Multi-context adaptive decoding with entropy-based uncertainty calibration for knowledge conflict mitigation

IF 7.5 | Zone 1 (Computer Science) | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Expert Systems with Applications Pub Date : 2025-09-19 DOI:10.1016/j.eswa.2025.129659

Yimin Ou, Yifan Wang, Ping Jian, Tianhe Zhang, Xing Pei

Expert Systems with Applications, Volume 298, Article 129659. https://www.sciencedirect.com/science/article/pii/S0957417425032749

Citations: 0

Abstract

The knowledge sources of large language models (LLMs) encompass both parametric internal knowledge and external contextual information. However, conflicts between these two sources can significantly impair model performance. Existing methods typically assume a priori that either the context or the parametric knowledge is correct; they lack dynamic coordination mechanisms and are limited to single-context scenarios. To address this issue, this work proposes a lightweight, training-free decoding method, Multi-Context Adaptive Decoding (MCAD-EUC), which dynamically measures the effectiveness of both knowledge sources through Entropy-based Uncertainty Calibration. It is not concerned with whether a piece of knowledge is true or false, internal or external; instead, it balances the sources according to their contributions to answering the question correctly. In particular, MCAD-EUC is naturally multi-contextual: it dynamically amplifies the distribution of the golden context while mitigating the influence of noisy contexts, thereby optimizing the final logits for predicting the next token during decoding. To comprehensively evaluate model performance in multi-context scenarios, this work constructs MCQA, a multi-context question answering dataset that includes golden context, irrelevant context, and six categories of misleading context (crowd, logic, temporal, authority, emotional, numeric), simulating the diversity of noise in real-world settings. Extensive experiments on four LLMs and four MCQA datasets demonstrate that MCAD-EUC achieves an average accuracy improvement of 3.17% over the best-performing baseline methods. Further sensitivity analysis confirms that the entropy-based adaptive weighting mechanism consistently outperforms all fixed-weight settings. Our dataset and code will be publicly available.
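
The abstract does not state the weighting formula, so the following is only an illustrative sketch of what entropy-based weighting across several next-token distributions could look like, not the authors' implementation. It assumes that each knowledge source (the model without context, and the model conditioned on each context) yields one logit vector, that a source's weight is a softmax over its negative entropy (lower uncertainty, higher weight), and that sources are merged as a weighted sum of log-probabilities. The function names and the `temperature` parameter are hypothetical.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-probabilities for one logit vector."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def entropy(log_probs: List[float]) -> float:
    """Shannon entropy (in nats) of a distribution given as log-probabilities."""
    return -sum(math.exp(lp) * lp for lp in log_probs)


def entropy_weights(per_source_logits: List[List[float]],
                    temperature: float = 1.0) -> List[float]:
    """Assumed rule: weight = softmax(-entropy / temperature) across sources."""
    ents = [entropy(log_softmax(l)) for l in per_source_logits]
    scores = [-h / temperature for h in ents]
    return [math.exp(s) for s in log_softmax(scores)]


def combine(per_source_logits: List[List[float]],
            temperature: float = 1.0) -> List[float]:
    """Assumed merge: weighted sum of per-source log-probabilities."""
    weights = entropy_weights(per_source_logits, temperature)
    log_probs = [log_softmax(l) for l in per_source_logits]
    vocab_size = len(per_source_logits[0])
    return [
        sum(w * lp[v] for w, lp in zip(weights, log_probs))
        for v in range(vocab_size)
    ]


# Toy example: a confident source and a nearly uniform (noisy) source.
confident = [4.0, 0.0, 0.0]
noisy = [0.0, 0.1, 0.0]
weights = entropy_weights([confident, noisy])
scores = combine([confident, noisy])
next_token = max(range(len(scores)), key=scores.__getitem__)
```

In this toy case the confident source receives the larger weight, so the merged scores follow its preferred token. A fixed-weight baseline of the kind mentioned in the sensitivity analysis would correspond to replacing `entropy_weights` with constants.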


Source journal

Expert Systems with Applications | Engineering & Technology: Engineering, Electrical & Electronic

CiteScore

13.80

Self-citation rate

10.60%

Articles published

2045

Review time

8.7 months

About the journal: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.