Retrieval-augmented generation versus document-grounded generation: a key distinction in large language models

Shunsuke Koga, Daisuke Ono, Amrom Obstfeld
{"title":"检索增强生成与基于文档的生成:大型语言模型中的关键区别。","authors":"Shunsuke Koga,&nbsp;Daisuke Ono,&nbsp;Amrom Obstfeld","doi":"10.1002/2056-4538.70014","DOIUrl":null,"url":null,"abstract":"<p>We read with great interest the article by Hewitt <i>et al</i>., ‘Large language models as a diagnostic support tool in neuropathology’ [<span>1</span>]. The authors effectively applied large language models (LLMs) to interpreting the WHO classification of central nervous system tumors; however, we wish to address a technical aspect of their study that warrants clarification.</p><p>The authors described their approach as retrieval-augmented generation (RAG). Based on the methods described, the study involved attaching a Word document containing the WHO diagnostic criteria to the prompt to guide its responses. We believe that this approach is more accurately described as document-grounded generation rather than true RAG. Document-grounded generation refers to methods where the model generates outputs explicitly based on a preprovided document, which serves as a static reference [<span>2</span>]. Unlike RAG, which retrieves information dynamically from external sources [<span>3</span>], document-grounded generation relies entirely on data embedded in the input prompt at the time of execution. In this study, the WHO criteria were provided with the prompt, which allowed the model to use this information without real-time retrieval. This method is a type of in-context learning, relying on curated contextual data embedded in the input [<span>4</span>].</p><p>Our own work provides an example of in-context learning in a different domain, namely image classification. We evaluated GPT-4 Vision (GPT-4V) for classifying histopathological images stained with tau immunohistochemistry, including neuritic plaques, astrocytic plaques, and tufted astrocytes [<span>5</span>]. Although GPT-4V initially struggled, few-shot learning with annotated examples, which is a specific application of in-context learning, significantly improved its accuracy, matching that of a convolutional neural network model trained on a larger dataset. These findings demonstrate the utility of in-context learning for both text-based and image-based tasks, with the latter presenting unique challenges for LLMs [<span>6</span>].</p><p>Although in-context learning is an effective approach, it has limitations worth considering. Since this method uses static data that are preloaded data into the prompt, errors can occur if the information is outdated or inaccurate. In-context learning may also lead to overfitting to the given context, limiting the model's ability to generalize to other scenarios. If the contextual data are overly complex, the model might misinterpret the information or fail to generate accurate outputs [<span>7</span>]. To ensure reliability, it is important to carefully select the input data, update it regularly, and consider these limitations when designing tasks.</p><p>In summary, clarifying the differences between RAG, document-grounded generation, and in-context learning is essential, especially for readers less familiar with these concepts. Nonetheless, we support the authors' conclusion that incorporating external data improves diagnostic performance. Their study, interpreted as an example of document-grounded generation, demonstrates how LLMs can effectively assist in medical tasks when supported by well-curated contextual data.</p><p>SK: conceptualization, drafting the manuscript. DO: reviewing and editing the manuscript. 
AO: reviewing and editing the manuscript.</p><p>No conflicts of interest were declared.</p>","PeriodicalId":48612,"journal":{"name":"Journal of Pathology Clinical Research","volume":"11 1","pages":""},"PeriodicalIF":3.4000,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11736412/pdf/","citationCount":"0","resultStr":"{\"title\":\"Retrieval-augmented generation versus document-grounded generation: a key distinction in large language models\",\"authors\":\"Shunsuke Koga,&nbsp;Daisuke Ono,&nbsp;Amrom Obstfeld\",\"doi\":\"10.1002/2056-4538.70014\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>We read with great interest the article by Hewitt <i>et al</i>., ‘Large language models as a diagnostic support tool in neuropathology’ [<span>1</span>]. The authors effectively applied large language models (LLMs) to interpreting the WHO classification of central nervous system tumors; however, we wish to address a technical aspect of their study that warrants clarification.</p><p>The authors described their approach as retrieval-augmented generation (RAG). Based on the methods described, the study involved attaching a Word document containing the WHO diagnostic criteria to the prompt to guide its responses. We believe that this approach is more accurately described as document-grounded generation rather than true RAG. Document-grounded generation refers to methods where the model generates outputs explicitly based on a preprovided document, which serves as a static reference [<span>2</span>]. Unlike RAG, which retrieves information dynamically from external sources [<span>3</span>], document-grounded generation relies entirely on data embedded in the input prompt at the time of execution. In this study, the WHO criteria were provided with the prompt, which allowed the model to use this information without real-time retrieval. This method is a type of in-context learning, relying on curated contextual data embedded in the input [<span>4</span>].</p><p>Our own work provides an example of in-context learning in a different domain, namely image classification. We evaluated GPT-4 Vision (GPT-4V) for classifying histopathological images stained with tau immunohistochemistry, including neuritic plaques, astrocytic plaques, and tufted astrocytes [<span>5</span>]. Although GPT-4V initially struggled, few-shot learning with annotated examples, which is a specific application of in-context learning, significantly improved its accuracy, matching that of a convolutional neural network model trained on a larger dataset. These findings demonstrate the utility of in-context learning for both text-based and image-based tasks, with the latter presenting unique challenges for LLMs [<span>6</span>].</p><p>Although in-context learning is an effective approach, it has limitations worth considering. Since this method uses static data that are preloaded data into the prompt, errors can occur if the information is outdated or inaccurate. In-context learning may also lead to overfitting to the given context, limiting the model's ability to generalize to other scenarios. If the contextual data are overly complex, the model might misinterpret the information or fail to generate accurate outputs [<span>7</span>]. 
To ensure reliability, it is important to carefully select the input data, update it regularly, and consider these limitations when designing tasks.</p><p>In summary, clarifying the differences between RAG, document-grounded generation, and in-context learning is essential, especially for readers less familiar with these concepts. Nonetheless, we support the authors' conclusion that incorporating external data improves diagnostic performance. Their study, interpreted as an example of document-grounded generation, demonstrates how LLMs can effectively assist in medical tasks when supported by well-curated contextual data.</p><p>SK: conceptualization, drafting the manuscript. DO: reviewing and editing the manuscript. AO: reviewing and editing the manuscript.</p><p>No conflicts of interest were declared.</p>\",\"PeriodicalId\":48612,\"journal\":{\"name\":\"Journal of Pathology Clinical Research\",\"volume\":\"11 1\",\"pages\":\"\"},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2025-01-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11736412/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Pathology Clinical Research\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/2056-4538.70014\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"PATHOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Pathology Clinical Research","FirstCategoryId":"3","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/2056-4538.70014","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PATHOLOGY","Score":null,"Total":0}
引用次数: 0

摘要

我们非常感兴趣地阅读了Hewitt等人的文章,“大型语言模型作为神经病理学的诊断支持工具”。作者有效地应用大语言模型(LLMs)来解释WHO对中枢神经系统肿瘤的分类;然而,我们希望处理他们的研究中值得澄清的一个技术方面。作者将他们的方法描述为检索增强生成(RAG)。根据所描述的方法,该研究涉及将包含世卫组织诊断标准的Word文档附加到提示框中,以指导其响应。我们认为这种方法更准确地描述为基于文档的生成,而不是真正的RAG。基于文档的生成是指模型根据预先提供的文档显式生成输出的方法,该文档用作静态引用[2]。与从外部源[3]动态检索信息的RAG不同,基于文档的生成完全依赖于执行时嵌入在输入提示符中的数据。在本研究中,为WHO标准提供了提示,允许模型使用这些信息而无需实时检索。这种方法是一种上下文学习,依赖于嵌入在输入[4]中的精心策划的上下文数据。我们自己的工作提供了一个不同领域的上下文学习的例子,即图像分类。我们评估了GPT-4 Vision (GPT-4V)对tau免疫组化染色的组织病理学图像的分类,包括神经斑块、星形细胞斑块和簇状星形细胞[5]。尽管GPT-4V最初遇到了困难,但带有注释示例的少数镜头学习(这是上下文学习的特定应用)显着提高了其准确性,与在更大数据集上训练的卷积神经网络模型相匹配。这些发现证明了上下文学习在基于文本和基于图像的任务中的效用,后者对法学硕士b[6]提出了独特的挑战。虽然情境学习是一种有效的学习方法,但它也有值得考虑的局限性。由于此方法使用预加载到提示符中的静态数据,因此如果信息过时或不准确,可能会发生错误。情境学习也可能导致对给定情境的过度拟合,限制了模型推广到其他场景的能力。如果上下文数据过于复杂,则模型可能会误解信息或无法生成准确的输出[7]。为了确保可靠性,必须仔细选择输入数据,定期更新数据,并在设计任务时考虑这些限制。总之,澄清RAG、基于文档的生成和上下文学习之间的区别是必要的,特别是对于不太熟悉这些概念的读者。尽管如此,我们支持作者的结论,即纳入外部数据可以提高诊断性能。他们的研究被解释为基于文档生成的一个例子,展示了法学硕士如何在精心策划的上下文数据的支持下有效地协助医疗任务。概念化,起草手稿。应该:审阅和编辑稿件。AO:审阅和编辑稿件。没有宣布利益冲突。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Retrieval-augmented generation versus document-grounded generation: a key distinction in large language models

We read with great interest the article by Hewitt et al., ‘Large language models as a diagnostic support tool in neuropathology’ [1]. The authors effectively applied large language models (LLMs) to interpreting the WHO classification of central nervous system tumors; however, we wish to address a technical aspect of their study that warrants clarification.

The authors described their approach as retrieval-augmented generation (RAG). Based on the methods described, the study involved attaching a Word document containing the WHO diagnostic criteria to the prompt to guide its responses. We believe that this approach is more accurately described as document-grounded generation rather than true RAG. Document-grounded generation refers to methods where the model generates outputs explicitly based on a preprovided document, which serves as a static reference [2]. Unlike RAG, which retrieves information dynamically from external sources [3], document-grounded generation relies entirely on data embedded in the input prompt at the time of execution. In this study, the WHO criteria were provided with the prompt, which allowed the model to use this information without real-time retrieval. This method is a type of in-context learning, relying on curated contextual data embedded in the input [4].
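For readers who would like a concrete picture of this distinction, the sketch below contrasts the two patterns in Python. It is a minimal illustration only, not a description of the authors' pipeline; call_llm and search_index are hypothetical placeholders for an LLM API call and an external search index, and the file name is invented.

```python
# Minimal sketch contrasting document-grounded generation with RAG.
# call_llm() and search_index() are hypothetical placeholders for an
# LLM API call and an external (e.g. vector-store) search, respectively.

def call_llm(prompt: str) -> str:
    """Placeholder for a call to a large language model API."""
    raise NotImplementedError

# --- Document-grounded generation: a static reference embedded in every prompt ---
WHO_CRITERIA = open("who_cns_tumor_criteria.txt").read()  # hypothetical pre-provided document

def document_grounded_answer(question: str) -> str:
    prompt = (
        "Use only the reference below to answer.\n\n"
        f"Reference:\n{WHO_CRITERIA}\n\n"
        f"Question: {question}"
    )
    return call_llm(prompt)

# --- Retrieval-augmented generation: passages retrieved dynamically at query time ---
def search_index(query: str, k: int = 3) -> list[str]:
    """Placeholder for retrieval from an external index."""
    raise NotImplementedError

def rag_answer(question: str) -> str:
    passages = search_index(question)        # retrieval step runs for each query
    context = "\n\n".join(passages)
    prompt = (
        "Use only the retrieved passages below to answer.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}"
    )
    return call_llm(prompt)
```

The only structural difference is where the context comes from: a fixed document pasted into every prompt, versus passages selected per query from an external source.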

Our own work provides an example of in-context learning in a different domain, namely image classification. We evaluated GPT-4 Vision (GPT-4V) for classifying histopathological images stained with tau immunohistochemistry, including neuritic plaques, astrocytic plaques, and tufted astrocytes [5]. Although GPT-4V initially struggled, few-shot learning with annotated examples, which is a specific application of in-context learning, significantly improved its accuracy, matching that of a convolutional neural network model trained on a larger dataset. These findings demonstrate the utility of in-context learning for both text-based and image-based tasks, with the latter presenting unique challenges for LLMs [6].
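As an illustration of how such few-shot prompting can be assembled, the sketch below outlines one way to embed annotated example images in a single multimodal prompt. It is a hypothetical outline rather than our published protocol; call_vision_model is a placeholder for a multimodal LLM API, and the file paths, message format, and labels are invented for illustration.

```python
import base64

def encode_image(path: str) -> str:
    """Base64-encode an image so it can be embedded in a multimodal prompt."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

def call_vision_model(messages: list[dict]) -> str:
    """Placeholder for a call to a multimodal LLM (e.g. GPT-4V)."""
    raise NotImplementedError

# Hypothetical annotated examples: (image path, label) pairs used as in-context shots.
FEW_SHOT_EXAMPLES = [
    ("neuritic_plaque_example.png", "neuritic plaque"),
    ("astrocytic_plaque_example.png", "astrocytic plaque"),
    ("tufted_astrocyte_example.png", "tufted astrocyte"),
]

def classify_with_few_shot(query_image_path: str) -> str:
    # The labeled examples travel inside the prompt itself; nothing is retrieved or fine-tuned.
    content = [{"type": "text",
                "text": "Classify tau-stained histopathology images as neuritic plaque, "
                        "astrocytic plaque, or tufted astrocyte. Labeled examples follow."}]
    for path, label in FEW_SHOT_EXAMPLES:
        content.append({"type": "image", "data": encode_image(path)})
        content.append({"type": "text", "text": f"Label: {label}"})
    content.append({"type": "image", "data": encode_image(query_image_path)})
    content.append({"type": "text", "text": "Label:"})
    return call_vision_model([{"role": "user", "content": content}])
```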

Although in-context learning is an effective approach, it has limitations worth considering. Since this method relies on static data preloaded into the prompt, errors can occur if the information is outdated or inaccurate. In-context learning may also lead to overfitting to the given context, limiting the model's ability to generalize to other scenarios. If the contextual data are overly complex, the model might misinterpret the information or fail to generate accurate outputs [7]. To ensure reliability, it is important to select the input data carefully, update them regularly, and consider these limitations when designing tasks.

In summary, clarifying the differences between RAG, document-grounded generation, and in-context learning is essential, especially for readers less familiar with these concepts. Nonetheless, we support the authors' conclusion that incorporating external data improves diagnostic performance. Their study, interpreted as an example of document-grounded generation, demonstrates how LLMs can effectively assist in medical tasks when supported by well-curated contextual data.

SK: conceptualization, drafting the manuscript. DO: reviewing and editing the manuscript. AO: reviewing and editing the manuscript.

No conflicts of interest were declared.
