Retrieval-augmented generation improves precision and trust of a GPT-4 model for emergency radiology diagnosis and classification: a proof-of-concept study.
IF 4.7 | Zone 2, Medicine | Q1, RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
Anna Fink, Johanna Nattenmüller, Stephan Rau, Alexander Rau, Hien Tran, Fabian Bamberg, Marco Reisert, Elmar Kotter, Thierno Diallo, Maximilian F Russe
European Radiology. Journal Article. Published 2025-02-14. DOI: https://doi.org/10.1007/s00330-025-11445-z
Citations: 0
Abstract
Objectives: This study evaluated the effect of enhancing a GPT-4 model with retrieval-augmented generation on its ability to diagnose and classify traumatic injuries based on radiology reports.
Materials and methods: In this prospective proof-of-concept study, we used retrieval-augmented generation as a zero-shot learning approach to provide expert knowledge from the RadioGraphics top ten reading list for trauma radiology to the GPT-4 model, creating the context-aware TraumaCB. Radiological report findings of 50 traumatic injuries were independently generated by two radiologists. The performance of the TraumaCB compared to the generic GPT-4 was evaluated by three board-certified radiologists, assessing the accuracy and trustworthiness of the chatbot responses in the 100 reports created.
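The retrieval step described above can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not say how the reading-list articles were indexed or queried, so this example swaps in a simple bag-of-words cosine similarity, and every name in it (KNOWLEDGE_BASE, retrieve, build_prompt, the placeholder passages) is a hypothetical chosen for illustration. The model call itself is left out; the sketch only shows how retrieved passages and their sources could be placed in the prompt without any task-specific training examples.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for passages indexed from the reference articles.
# The texts are placeholders, not clinical content from the study.
KNOWLEDGE_BASE = [
    {"source": "placeholder-spleen",
     "text": "Placeholder passage on grading splenic laceration and "
             "subcapsular hematoma of the spleen."},
    {"source": "placeholder-pelvis",
     "text": "Placeholder passage on classifying pelvic ring fracture patterns."},
    {"source": "placeholder-spine",
     "text": "Placeholder passage on classifying thoracolumbar spine "
             "fracture morphology."},
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, k=2):
    """Return up to k passages with non-zero similarity to the query, best first."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(p["text"]))), p)
              for p in KNOWLEDGE_BASE]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_prompt(report, passages):
    """Assemble a prompt that carries the retrieved context and its sources."""
    context = "\n".join(f"[{p['source']}] {p['text']}" for p in passages)
    return (
        "Use only the context below to diagnose, classify, and grade the "
        "injury, and cite the sources you used.\n\n"
        f"Context:\n{context}\n\nReport findings:\n{report}"
    )


if __name__ == "__main__":
    report = "Laceration of the spleen with subcapsular hematoma."
    print(build_prompt(report, retrieve(report)))
```

Keeping the source label next to each retrieved passage is what would let a response point back to the material it relied on, which is the property the trust ratings in this study assess.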
Results: The TraumaCB achieved 100% correct diagnoses, 96% correct classification, and 87% correct grading, outperforming the generic GPT-4 with 93% correct diagnoses, 70% correct classification, and 48% correct grading. TraumaCB sources consistently achieved a median rating of 5.0 for explanation and trust. Challenges encountered mainly involved traumatic injuries lacking widely accepted classification systems.
Conclusion: Augmenting a commercial GPT-4 model with retrieval-augmented generation improves its diagnostic and classification capabilities, positioning it as a valuable tool for efficiently assessing traumatic injuries across various anatomical regions in trauma radiology.
Key points: Question: Retrieval-augmented generation has the potential to enhance generic chatbots with task-specific knowledge of emergency radiology. Findings: The TraumaCB excelled in accuracy, particularly in injury classification and grading, and provided explanations along with the sources used, increasing transparency and facilitating verification. Clinical relevance: The TraumaCB provides accurate, fast, and transparent access to trauma radiology classifications, potentially increasing the efficiency of image interpretation in emergency departments and enabling customized reports based on local or individual preferences.
About the journal:
European Radiology (ER) continuously updates scientific knowledge in radiology by publishing strong original articles and state-of-the-art reviews written by leading radiologists. A well-balanced combination of review articles, original papers, short communications from European radiological congresses, and information on society matters makes ER an indispensable source of current information in this field.
This is the Journal of the European Society of Radiology, and the official journal of a number of societies.
From 2004 to 2008, supplements to European Radiology were published under its companion title, European Radiology Supplements, ISSN 1613-3749.