Evaluating acute image ordering for real-world patient cases via language model alignment with radiological guidelines

Michael S Yao, Allison Chae, Piya Saraiya, Charles E Kahn, Walter R Witschey, James C Gee, Hersh Sagreiya, Osbert Bastani

Communications Medicine 5, 332 (2025). DOI: 10.1038/s43856-025-01061-9. Open access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12322208/pdf/
Background: Diagnostic imaging studies are increasingly important in the management of acutely presenting patients. However, ordering appropriate imaging studies in the emergency department is a challenging task with a high degree of variability among healthcare providers. To address this issue, recent work has investigated whether generative AI and large language models can be leveraged to recommend diagnostic imaging studies in accordance with evidence-based medical guidelines. However, it remains challenging to ensure that these tools can provide recommendations that correctly align with medical guidelines, especially given the limited diagnostic information available in acute care settings.
Methods: In this study, we introduce a framework that intelligently leverages language models to recommend imaging studies for patient cases in alignment with the American College of Radiology's Appropriateness Criteria, a set of evidence-based guidelines. To power our experiments, we introduce RadCases, a dataset of over 1,500 annotated case summaries reflecting common patient presentations, and apply our framework to enable state-of-the-art language models to reason about appropriate imaging choices.
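The abstract does not describe the framework's implementation, but the core evaluation step (prompting a language model to map a free-text case summary to an ACR Appropriateness Criteria topic and scoring the answer against an annotated label) can be illustrated with a minimal sketch. Everything below is hypothetical: the topic list is a small illustrative subset, `query_llm` is a stand-in for whatever chat/completions API is used, and the example case is not drawn from the RadCases dataset.

```python
"""Hypothetical sketch (not the authors' code): ask a language model to choose the
ACR Appropriateness Criteria topic matching a patient case, then compare the
prediction against an annotated gold label, as in a RadCases-style evaluation."""

# Small, illustrative subset of ACR Appropriateness Criteria topics (assumption).
ACR_TOPICS = [
    "Head Trauma",
    "Suspected Pulmonary Embolism",
    "Right Upper Quadrant Pain",
    "Acute Nonlocalized Abdominal Pain",
]


def build_prompt(case_summary: str) -> str:
    """Construct a guideline-grounded prompt asking the model to pick exactly one topic."""
    options = "\n".join(f"- {topic}" for topic in ACR_TOPICS)
    return (
        "You are assisting with acute diagnostic imaging ordering.\n"
        "Select the single ACR Appropriateness Criteria topic that best matches "
        "the patient case below. Answer with the topic name only.\n\n"
        f"Candidate topics:\n{options}\n\n"
        f"Patient case: {case_summary}\n"
    )


def query_llm(prompt: str) -> str:
    """Placeholder for a real language-model call; returns a canned answer here."""
    return "Right Upper Quadrant Pain"


def evaluate_case(case_summary: str, labeled_topic: str) -> bool:
    """Return True if the model's chosen topic matches the annotated label."""
    prediction = query_llm(build_prompt(case_summary)).strip()
    return prediction.lower() == labeled_topic.lower()


if __name__ == "__main__":
    case = ("54-year-old woman with 6 hours of right upper quadrant pain, "
            "fever, and a positive Murphy's sign.")
    print(evaluate_case(case, "Right Upper Quadrant Pain"))  # True with the canned answer above
```

Aggregating this per-case check over an annotated dataset would yield the kind of guideline-alignment accuracy the paper reports, although the authors' actual prompting and alignment strategy may differ from this sketch.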
Results: Using our framework, state-of-the-art language models achieve accuracy comparable to clinicians in ordering imaging studies. Furthermore, we demonstrate that our language model-based pipeline can be used as an intelligent assistant by clinicians to support image ordering workflows and improve the accuracy of acute image ordering according to the American College of Radiology's Appropriateness Criteria.
Conclusions: Our work demonstrates and validates a strategy to leverage AI-based software to improve trustworthy clinical decision-making in alignment with expert evidence-based guidelines.