Explainable differential diagnosis with dual-inference large language models

Shuang Zhou, Mingquan Lin, Sirui Ding, Jiashuo Wang, Canyu Chen, Genevieve B Melton, James Zou, Rui Zhang

npj Health Systems 2(1): 12 (2025). DOI: 10.1038/s44401-025-00015-6. Epub 2025-04-24. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12021655/pdf/
Citations: 0
Abstract
Automatic differential diagnosis (DDx) involves identifying the potential conditions that could explain a patient's symptoms, and accurate interpretation of a DDx is of substantial clinical significance. While large language models (LLMs) have demonstrated remarkable diagnostic accuracy, their ability to generate high-quality DDx explanations remains underexplored, largely due to the absence of specialized evaluation datasets and the inherent difficulty of complex reasoning for LLMs. Building a tailored dataset and developing novel methods that elicit precise DDx explanations from LLMs are therefore worth exploring. We developed the first publicly available DDx explanation dataset, comprising expert-derived explanations for 570 clinical notes, to evaluate DDx explanations. We also proposed a novel framework, Dual-Inf, that effectively harnesses LLMs to generate high-quality DDx explanations. To the best of our knowledge, this is the first study to tailor LLMs for DDx explanation and to comprehensively evaluate their explainability. Overall, our study bridges a critical gap in DDx explanation, enhancing clinical decision-making.
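The abstract does not spell out the mechanics of Dual-Inf, but its name suggests pairing two inference passes. Below is a minimal, hypothetical sketch of what such a dual-inference loop could look like, assuming a forward pass that proposes diagnoses with symptom-grounded rationales and a backward pass that verifies each candidate against the note. The `llm` callable and both prompts are illustrative placeholders, not the published method.

```python
# Hypothetical sketch of a dual-inference prompting loop. The abstract does
# not describe Dual-Inf's internals; this only illustrates one plausible
# reading of "dual inference": a forward diagnostic pass followed by a
# backward verification pass over the same clinical note.

from typing import Callable


def dual_inference_ddx(note: str, llm: Callable[[str], str]) -> str:
    """Return a differential diagnosis with explanations for a clinical note.

    `llm` is any text-in/text-out completion function (e.g., a wrapper
    around a chat API); it is a placeholder, not part of the framework.
    """
    # Forward pass: propose candidate diagnoses, each with a rationale tied
    # to specific findings in the note.
    forward_prompt = (
        "You are a clinician. Given the clinical note below, list the most "
        "likely differential diagnoses. For each, explain which symptoms or "
        "findings support it.\n\nNote:\n" + note
    )
    candidates = llm(forward_prompt)

    # Backward pass: reason from each candidate diagnosis back to its
    # expected presentation, revising or removing candidates whose
    # explanations the note does not support.
    backward_prompt = (
        "For each diagnosis below, reason backward: what presentation would "
        "you expect, and does the note support it? Revise or remove any "
        "diagnosis whose explanation does not hold.\n\nNote:\n" + note +
        "\n\nCandidate diagnoses:\n" + candidates
    )
    return llm(backward_prompt)
```

In use, `llm` could wrap any chat-completion client; the backward verification step is one common self-checking pattern for improving the faithfulness of model-generated explanations.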