A collaborative reasoning framework for large language models in long-context Q&A

IF 7.5 | Zone 1: Computer Science | JCR Q1: Computer Science, Artificial Intelligence

Expert Systems with Applications | Pub Date: 2025-10-09 | DOI: 10.1016/j.eswa.2025.129960

Jiacheng Yao, Guoxiu He, Xin Xu

Volume 299, Article 129960. Publication type: Journal Article. URL: https://www.sciencedirect.com/science/article/pii/S0957417425035754

Citations: 0

Abstract

Large Language Models (LLMs) often struggle with the Lost in the Middle phenomenon in long-context question answering (Q&A). Existing solutions, such as modifying attention mechanisms or positional encodings, typically require retraining, which demands substantial computational resources. Other strategies, including long-term memory mechanisms and context processing, rely heavily on auxiliary components and fail to fundamentally enhance the LLM’s reasoning capabilities. To bridge this gap, this paper proposes a novel collaborative reasoning framework. First, the framework uses a retrieval-augmented generation (RAG) approach to generate a candidate answer from sentences relevant to the input question. Next, a training-free Shadow-LLM supplements local sentence-level information from the long context during the reasoning process to produce a second candidate answer. Finally, a one-out-of-two selection strategy chooses the final answer from the two candidates. Experiments on three long-context Q&A datasets and three backbone LLMs show that our method raises the F1 score over the baselines by 2% to 18%. Notably, we find that activating only the 0th decoder layer of the LLM is sufficient for Shadow-LLM to operate at optimal performance, enabling efficient deployment without retraining. The code is available at link.
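The three stages described in the abstract (a RAG candidate, a second candidate, then a one-out-of-two choice) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the word-overlap retriever stands in for whatever retriever the paper uses, and `generate`, `shadow_generate`, and `choose` are hypothetical callables wrapping model calls. In particular, the Shadow-LLM mechanism itself (supplementing sentence-level information during reasoning) is not reproduced here; it appears only as a placeholder.

```python
import re
from typing import Callable, List, Set

# Hypothetical model interfaces (assumptions, not from the paper):
Generate = Callable[[str], str]        # prompt -> answer text
Shadow = Callable[[str, str], str]     # (question, context) -> answer text


def split_sentences(text: str) -> List[str]:
    """Naive splitter on ., ! and ?; a real system would use a proper tokenizer."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def word_set(text: str) -> Set[str]:
    """Lower-cased set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, sentences: List[str], k: int = 2) -> List[str]:
    """Stand-in retriever: rank sentences by word overlap with the question."""
    q = word_set(question)
    ranked = sorted(sentences, key=lambda s: len(q & word_set(s)), reverse=True)
    return ranked[:k]


def rag_candidate(question: str, context: str, generate: Generate, k: int = 2) -> str:
    """Stage 1: answer from the k sentences most relevant to the question."""
    evidence = retrieve(question, split_sentences(context), k)
    prompt = "Context: " + " ".join(evidence) + "\nQuestion: " + question
    return generate(prompt)


def select_one_of_two(question: str, a: str, b: str, choose: Generate) -> str:
    """Stage 3: ask a selector for 'A' or 'B'; fall back to A on any other reply."""
    prompt = f"Question: {question}\nA: {a}\nB: {b}\nReply with A or B."
    return b if choose(prompt).strip().upper().startswith("B") else a


def answer(question: str, context: str, generate: Generate,
           shadow_generate: Shadow, choose: Generate, k: int = 2) -> str:
    first = rag_candidate(question, context, generate, k)
    second = shadow_generate(question, context)  # Stage 2 placeholder
    return select_one_of_two(question, first, second, choose)
```

Passing the model calls in as plain callables keeps the sketch runnable with stubs and makes clear that the selection step only ever returns one of the two candidates.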


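Source journal

Expert Systems with Applications (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 13.80

Self-citation rate: 10.60%

Articles published: 2045

Review time: 8.7 months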

Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.