{"title":"Retrieval-augmented generation versus document-grounded generation: a key distinction in large language models","authors":"Shunsuke Koga, Daisuke Ono, Amrom Obstfeld","doi":"10.1002/2056-4538.70014","DOIUrl":null,"url":null,"abstract":"<p>We read with great interest the article by Hewitt <i>et al</i>., ‘Large language models as a diagnostic support tool in neuropathology’ [<span>1</span>]. The authors effectively applied large language models (LLMs) to interpreting the WHO classification of central nervous system tumors; however, we wish to address a technical aspect of their study that warrants clarification.</p><p>The authors described their approach as retrieval-augmented generation (RAG). Based on the methods described, the study involved attaching a Word document containing the WHO diagnostic criteria to the prompt to guide its responses. We believe that this approach is more accurately described as document-grounded generation rather than true RAG. Document-grounded generation refers to methods where the model generates outputs explicitly based on a preprovided document, which serves as a static reference [<span>2</span>]. Unlike RAG, which retrieves information dynamically from external sources [<span>3</span>], document-grounded generation relies entirely on data embedded in the input prompt at the time of execution. In this study, the WHO criteria were provided with the prompt, which allowed the model to use this information without real-time retrieval. This method is a type of in-context learning, relying on curated contextual data embedded in the input [<span>4</span>].</p><p>Our own work provides an example of in-context learning in a different domain, namely image classification. We evaluated GPT-4 Vision (GPT-4V) for classifying histopathological images stained with tau immunohistochemistry, including neuritic plaques, astrocytic plaques, and tufted astrocytes [<span>5</span>]. Although GPT-4V initially struggled, few-shot learning with annotated examples, which is a specific application of in-context learning, significantly improved its accuracy, matching that of a convolutional neural network model trained on a larger dataset. These findings demonstrate the utility of in-context learning for both text-based and image-based tasks, with the latter presenting unique challenges for LLMs [<span>6</span>].</p><p>Although in-context learning is an effective approach, it has limitations worth considering. Since this method uses static data that are preloaded data into the prompt, errors can occur if the information is outdated or inaccurate. In-context learning may also lead to overfitting to the given context, limiting the model's ability to generalize to other scenarios. If the contextual data are overly complex, the model might misinterpret the information or fail to generate accurate outputs [<span>7</span>]. To ensure reliability, it is important to carefully select the input data, update it regularly, and consider these limitations when designing tasks.</p><p>In summary, clarifying the differences between RAG, document-grounded generation, and in-context learning is essential, especially for readers less familiar with these concepts. Nonetheless, we support the authors' conclusion that incorporating external data improves diagnostic performance. 
Their study, interpreted as an example of document-grounded generation, demonstrates how LLMs can effectively assist in medical tasks when supported by well-curated contextual data.</p><p>SK: conceptualization, drafting the manuscript. DO: reviewing and editing the manuscript. AO: reviewing and editing the manuscript.</p><p>No conflicts of interest were declared.</p>","PeriodicalId":48612,"journal":{"name":"Journal of Pathology Clinical Research","volume":"11 1","pages":""},"PeriodicalIF":3.4000,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11736412/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Pathology Clinical Research","FirstCategoryId":"3","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/2056-4538.70014","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PATHOLOGY","Score":null,"Total":0}
Abstract
We read with great interest the article by Hewitt et al., ‘Large language models as a diagnostic support tool in neuropathology’ [1]. The authors effectively applied large language models (LLMs) to interpreting the WHO classification of central nervous system tumors; however, we wish to address a technical aspect of their study that warrants clarification.
The authors described their approach as retrieval-augmented generation (RAG). Based on the methods described, the study involved attaching a Word document containing the WHO diagnostic criteria to the prompt to guide the model's responses. We believe that this approach is more accurately described as document-grounded generation rather than true RAG. Document-grounded generation refers to methods where the model generates outputs explicitly based on a pre-provided document, which serves as a static reference [2]. Unlike RAG, which retrieves information dynamically from external sources [3], document-grounded generation relies entirely on data embedded in the input prompt at the time of execution. In this study, the WHO criteria were provided with the prompt, which allowed the model to use this information without real-time retrieval. This method is a type of in-context learning, relying on curated contextual data embedded in the input [4].
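To make the distinction concrete, the following is a minimal sketch of the two approaches using the OpenAI Python client. It is illustrative only and not the authors' actual pipeline; the model name, the file `who_cns5_criteria.txt`, the case summary, and the toy keyword-overlap retriever are assumptions standing in for a real document and vector database.

```python
# Illustrative sketch: document-grounded generation vs. RAG.
# Assumes an OPENAI_API_KEY in the environment and a local text file
# "who_cns5_criteria.txt" holding the reference criteria (hypothetical names).
from openai import OpenAI

client = OpenAI()
case_summary = "Adult diffuse glioma, IDH-wildtype, with TERT promoter mutation."

# Document-grounded generation: the entire reference document is embedded
# in the prompt as a static context; nothing is retrieved at run time.
who_criteria = open("who_cns5_criteria.txt").read()
grounded = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": "Answer using only the provided WHO criteria.\n\n" + who_criteria},
        {"role": "user", "content": case_summary},
    ],
)

# Retrieval-augmented generation: only the passages most relevant to the
# query are fetched from an external index and injected into the prompt.
def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Toy keyword-overlap retriever standing in for a vector database."""
    query_terms = set(query.lower().split())
    scored = sorted(corpus, key=lambda p: -len(query_terms & set(p.lower().split())))
    return scored[:k]

passages = retrieve(case_summary, who_criteria.split("\n\n"))
rag = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": "Answer using only these retrieved excerpts:\n\n" + "\n\n".join(passages)},
        {"role": "user", "content": case_summary},
    ],
)

print(grounded.choices[0].message.content)
print(rag.choices[0].message.content)
```

In both cases the model's output is conditioned on external text, but only the second variant performs retrieval at query time; the first simply grounds generation in a preloaded document.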
Our own work provides an example of in-context learning in a different domain, namely image classification. We evaluated GPT-4 Vision (GPT-4V) for classifying histopathological images stained with tau immunohistochemistry, including neuritic plaques, astrocytic plaques, and tufted astrocytes [5]. Although GPT-4V initially struggled, few-shot learning with annotated examples, which is a specific application of in-context learning, significantly improved its accuracy, matching that of a convolutional neural network model trained on a larger dataset. These findings demonstrate the utility of in-context learning for both text-based and image-based tasks, with the latter presenting unique challenges for LLMs [6].
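A minimal sketch of few-shot in-context learning for image classification, loosely modeled on the tau immunohistochemistry task above, is shown below. The file names, labels, and model name are illustrative assumptions; the key point is that annotated examples are supplied in the prompt itself, with no model fine-tuning.

```python
# Illustrative few-shot (in-context learning) sketch for image classification.
# File names and the model name are hypothetical placeholders.
import base64
from openai import OpenAI

client = OpenAI()

def image_part(path: str) -> dict:
    """Encode a local image as a data-URL message part for the chat API."""
    b64 = base64.b64encode(open(path, "rb").read()).decode()
    return {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}}

# Annotated examples embedded in the prompt (no weight updates).
examples = [
    ("neuritic_plaque_example.png", "neuritic plaque"),
    ("astrocytic_plaque_example.png", "astrocytic plaque"),
    ("tufted_astrocyte_example.png", "tufted astrocyte"),
]

content = [{"type": "text",
            "text": "Classify tau-stained lesions as neuritic plaque, "
                    "astrocytic plaque, or tufted astrocyte. Labeled examples follow."}]
for path, label in examples:
    content.append(image_part(path))
    content.append({"type": "text", "text": f"Label: {label}"})
content.append({"type": "text", "text": "Now classify this image:"})
content.append(image_part("query_image.png"))

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": content}],
)
print(response.choices[0].message.content)
```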
Although in-context learning is an effective approach, it has limitations worth considering. Since this method relies on static data preloaded into the prompt, errors can occur if the information is outdated or inaccurate. In-context learning may also lead to overfitting to the given context, limiting the model's ability to generalize to other scenarios. If the contextual data are overly complex, the model might misinterpret the information or fail to generate accurate outputs [7]. To ensure reliability, it is important to carefully select the input data, update them regularly, and consider these limitations when designing tasks.
In summary, clarifying the differences between RAG, document-grounded generation, and in-context learning is essential, especially for readers less familiar with these concepts. Nonetheless, we support the authors' conclusion that incorporating external data improves diagnostic performance. Their study, interpreted as an example of document-grounded generation, demonstrates how LLMs can effectively assist in medical tasks when supported by well-curated contextual data.
SK: conceptualization, drafting the manuscript. DO: reviewing and editing the manuscript. AO: reviewing and editing the manuscript.
No conflicts of interest were declared.