Authors' reply: Re: Koga et al. Retrieval-augmented generation versus document-grounded generation: a key distinction in large language models

Journal of Pathology Clinical Research 11(1), 2025. DOI: 10.1002/2056-4538.70013. Published online 21 January 2025.

Abstract

We thank Koga et al for their knowledgeable comments on our work. Their letter highlights a valid question that requires clarification [1].

Our study assessed the ability of three large language models (LLMs) to diagnose neuropathology cases from free-text descriptions of adult-type diffuse gliomas, for which we compared two methodologies. The first method provided each model with the free-text tumor descriptions alone, while the second additionally provided the models with a Word document of the WHO CNS5 classification. We termed these approaches zero-shot and retrieval-augmented generation (RAG), respectively [2]. Koga et al point out that the methodology we describe in our paper as RAG may be better described as document-grounded generation or in-context learning.

While we agree that the letter's definition of RAG reflects how the approach was initially defined [3], the field has evolved significantly since Lewis et al first proposed it in 2020. Three paradigms of RAG are now increasingly recognized: naive RAG, advanced RAG, and modular RAG [4]. In naive RAG, the data for indexing are generally obtained offline, converted into a format such as PDF or Word, and uploaded with the query via the context window. Advanced RAG and modular RAG offer specific improvements that address the limitations of naive RAG, but they require more technical approaches to achieve this.

Our intention was to use naive RAG. We chose this approach because it is the simplest way to improve an LLM response that remains reproducible by doctors, considering that most doctors would be unable to use an application programming interface to build a RAG pipeline programmatically. As discussed by Koga et al, the key difference between naive RAG and document-grounding lies in how the document is used when the model generates its response [5]. Document-grounding submits the document with the user query and is equivalent to inserting the entire document text into the context window [5]. With naive RAG, by contrast, relevant parts of the document are identified by the model and used with the query to dynamically search its database [4]. Both approaches are examples of in-context learning, as they acquire additional knowledge from the prompt without requiring parameter updates [6].
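This distinction can be made concrete at the prompt-construction step. The sketch below is purely illustrative (all function names and the sample WHO CNS5 excerpt are hypothetical, and simple word overlap stands in for the embedding-based retrieval a real naive RAG pipeline would use): document-grounding places the entire document in the context window, whereas naive RAG selects only the chunks most relevant to the query.

```python
import re


def document_grounded_prompt(query, document):
    # Document-grounding: the entire document text is inserted into the
    # context window alongside the user query.
    return f"Context:\n{document}\n\nQuestion: {query}"


def naive_rag_prompt(query, document, top_k=1):
    # Naive RAG: the document is split into chunks and only the chunks most
    # relevant to the query are inserted into the context window. Here,
    # relevance is crude word overlap; real pipelines use vector embeddings.
    def tokens(text):
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    chunks = [c.strip() for c in document.split("\n\n") if c.strip()]
    query_tokens = tokens(query)
    ranked = sorted(chunks,
                    key=lambda c: len(query_tokens & tokens(c)),
                    reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical three-entry excerpt standing in for the WHO CNS5 document.
who_cns5_excerpt = (
    "Astrocytoma, IDH-mutant: diffuse glioma with IDH1 or IDH2 mutation.\n\n"
    "Glioblastoma, IDH-wildtype: diffuse astrocytic glioma without IDH mutation.\n\n"
    "Oligodendroglioma: IDH-mutant and 1p/19q-codeleted diffuse glioma."
)

query = "Which diffuse glioma is IDH-wildtype?"
grounded = document_grounded_prompt(query, who_cns5_excerpt)
rag = naive_rag_prompt(query, who_cns5_excerpt)
```

In the grounded prompt the model sees all three tumor entries; in the naive RAG prompt it sees only the glioblastoma entry, which best matches the query. Both remain in-context learning, since the extra knowledge arrives through the prompt rather than through parameter updates.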

Without transparency from the LLM providers about how a document is processed once it has been submitted via the graphical user interface, it is difficult to know whether naive RAG or document-grounding was used to formulate a response. To our knowledge, details of how appended documents are used during a query are not freely available online for ChatGPT, Llama, or Claude. Furthermore, given the speed of development in the field, the technical handling of documents may have changed since our experiments were conducted earlier this year. Nonetheless, we contacted OpenAI (the provider of ChatGPT), Anthropic, and Poe for assistance in clarifying this point. All three providers confirmed that documents uploaded with a query are used for RAG. However, the responses from OpenAI and Anthropic were both generated by bots, underscoring the need for more reliable and greater transparency about the actual technical methods used.

We are grateful for the endorsement of our conclusions and appreciate the opportunity to address this important distinction. Further clarification and transparency are needed to definitively distinguish the mechanisms employed by specific LLM platforms, particularly regarding how appended documents are processed. We propose that our work uses in-context learning, as both RAG and document-grounding are methods within this broader paradigm. Nevertheless, we remain committed to clarifying this matter and thank Koga et al for their engagement and valuable input.

KJH: wrote the first draft, review and critique, and final approval. ICW: review and critique, and final approval. JNK: review and critique, and final approval.
