Structured clinical reasoning prompt enhances LLM's diagnostic capabilities in Diagnosis Please quiz cases.

IF 2.1, Zone 4 (Medicine)

Japanese Journal of Radiology. Pub Date: 2025-04-01. Epub Date: 2024-12-03. DOI: 10.1007/s11604-024-01712-2

Yuki Sonoda, Ryo Kurokawa, Akifumi Hagiwara, Yusuke Asari, Takahiro Fukushima, Jun Kanzawa, Wataru Gonoi, Osamu Abe

Pages: 586-592. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11953165/pdf/

Citations: 0

Abstract




Purpose: Large Language Models (LLMs) show promise in medical diagnosis, but their performance varies with prompting. Recent studies suggest that modifying prompts may enhance diagnostic capabilities. This study aimed to test whether a prompting approach that aligns with general clinical reasoning methodology can enhance the LLM's medical diagnostic capabilities. Specifically, the approach uses a standardized template to first organize clinical information into predefined categories (patient information, history, symptoms, examinations, etc.) before making diagnoses, instead of one-step processing.

Materials and methods: Three hundred twenty-two quiz questions from Radiology's Diagnosis Please cases (1998-2023) were used. We employed Claude 3.5 Sonnet, a state-of-the-art LLM, to compare three approaches: (1) Baseline: a conventional zero-shot chain-of-thought prompt; (2) Two-step approach: the LLM first systematically organizes clinical information into two distinct categories (patient history and imaging findings), then separately analyzes this organized information to provide diagnoses; and (3) Summary-only approach: using only the LLM-generated summary for diagnoses.
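The three approaches can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the function names, and the `llm` callable (any function mapping a prompt string to a response string) are all assumptions, since the abstract does not publish the actual prompts. In particular, passing the original case text together with the summary in the second step of the two-step approach is an assumed reading of how it differs from the summary-only approach.

```python
from typing import Callable

# Hypothetical interface: any function that maps a prompt to a model response.
LLM = Callable[[str], str]

# Hypothetical prompt wording; the abstract does not give the real prompts.
ORGANIZE = (
    "Organize the case below into two sections, 'Patient history' and "
    "'Imaging findings'. Do not give a diagnosis yet.\n\nCase:\n{case}"
)
DIAGNOSE = (
    "Think step by step, then give the most likely diagnosis and two "
    "differential diagnoses.\n\n{context}"
)


def baseline(llm: LLM, case: str) -> str:
    """One-step zero-shot chain-of-thought prompt on the raw case text."""
    return llm(DIAGNOSE.format(context=f"Case:\n{case}"))


def summarize(llm: LLM, case: str) -> str:
    """Step 1: organize the case into patient history and imaging findings."""
    return llm(ORGANIZE.format(case=case))


def two_step(llm: LLM, case: str) -> str:
    """Step 2 sees the original case plus the organized summary (assumption)."""
    summary = summarize(llm, case)
    context = f"Case:\n{case}\n\nOrganized summary:\n{summary}"
    return llm(DIAGNOSE.format(context=context))


def summary_only(llm: LLM, case: str) -> str:
    """Diagnose from the LLM-generated summary alone, without the raw case."""
    summary = summarize(llm, case)
    return llm(DIAGNOSE.format(context=f"Organized summary:\n{summary}"))
```

Keeping the model behind a plain callable means the same three functions can be run against a real API client or against a stub when checking the control flow.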

Results: The two-step approach significantly outperformed both the baseline and summary-only approaches in diagnostic accuracy, as determined by McNemar's test. Primary diagnostic accuracy was 60.6% for the two-step approach, compared to 56.5% for baseline (p = 0.042) and 56.3% for summary-only (p = 0.035). For the top three diagnoses, accuracy was 70.5%, 66.5%, and 65.5%, respectively (p = 0.005 versus baseline, p = 0.008 versus summary-only). No significant differences were observed between the baseline and summary-only approaches.
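McNemar's test compares two methods on the same cases and depends only on the discordant pairs: cases one method got right and the other got wrong. The sketch below shows the exact (binomial) form of the test using only the standard library. The abstract does not say whether the exact or the chi-square variant was used, and it does not report discordant counts, so the function name and the counts in the usage line are purely illustrative.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases where method A was correct and method B was wrong.
    c: cases where method A was wrong and method B was correct.
    Under the null hypothesis, each discordant case is equally likely
    to fall on either side, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)  # double the smaller tail, capped at 1


# Illustrative counts only (not taken from the study).
print(mcnemar_exact(0, 5))  # → 0.0625
```

Cases that both methods answer correctly, or both answer incorrectly, do not enter the statistic, which is why overall accuracies alone are not enough to reproduce the reported p-values.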

Conclusion: Our results indicate that a structured clinical reasoning approach enhances the LLM's diagnostic accuracy. This method shows potential as a valuable tool for deriving diagnoses from free-text clinical information. The approach aligns well with established clinical reasoning processes, suggesting its potential applicability in real-world clinical settings.


Source journal

Japanese Journal of Radiology (Medicine: Radiology, Nuclear Medicine and Imaging)

Self-citation rate

4.80%

Articles published

133

About the journal: Japanese Journal of Radiology is a peer-reviewed journal, officially published by the Japan Radiological Society. The main purpose of the journal is to provide a forum for the publication of papers documenting recent advances and new developments in the field of radiology in medicine and biology. The scope of Japanese Journal of Radiology encompasses but is not restricted to diagnostic radiology, interventional radiology, radiation oncology, nuclear medicine, radiation physics, and radiation biology. Additionally, the journal covers technical and industrial innovations. The journal welcomes original articles, technical notes, review articles, pictorial essays and letters to the editor. The journal also provides announcements from the boards and the committees of the society. Membership in the Japan Radiological Society is not a prerequisite for submission. Contributions are welcomed from all parts of the world.