双推理大语言模型的可解释鉴别诊断。

Npj health systems Pub Date : 2025-01-01 Epub Date: 2025-04-24 DOI:10.1038/s44401-025-00015-6

Shuang Zhou, Mingquan Lin, Sirui Ding, Jiashuo Wang, Canyu Chen, Genevieve B Melton, James Zou, Rui Zhang

{"title":"双推理大语言模型的可解释鉴别诊断。","authors":"Shuang Zhou, Mingquan Lin, Sirui Ding, Jiashuo Wang, Canyu Chen, Genevieve B Melton, James Zou, Rui Zhang","doi":"10.1038/s44401-025-00015-6","DOIUrl":null,"url":null,"abstract":"Automatic differential diagnosis (DDx) involves identifying potential conditions that could explain a patient's symptoms and its accurate interpretation is of substantial significance. While large language models (LLMs) have demonstrated remarkable diagnostic accuracy, their capability to generate high-quality DDx explanations remains underexplored, largely due to the absence of specialized evaluation datasets and the inherent challenges of complex reasoning in LLMs. Therefore, building a tailored dataset and developing novel methods to elicit LLMs for generating precise DDx explanations are worth exploring. We developed the first publicly available DDx dataset, comprising expert-derived explanations for 570 clinical notes, to evaluate DDx explanations. Meanwhile, we proposed a novel framework, Dual-Inf, that could effectively harness LLMs to generate high-quality DDx explanations. To the best of our knowledge, it is the first study to tailor LLMs for DDx explanation and comprehensively evaluate their explainability. Overall, our study bridges a critical gap in DDx explanation, enhancing clinical decision-making.","PeriodicalId":520349,"journal":{"name":"Npj health systems","volume":"2 1","pages":"12"},"PeriodicalIF":0.0000,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12021655/pdf/","citationCount":"0","resultStr":"{\"title\":\"Explainable differential diagnosis with dual-inference large language models.\",\"authors\":\"Shuang Zhou, Mingquan Lin, Sirui Ding, Jiashuo Wang, Canyu Chen, Genevieve B Melton, James Zou, Rui Zhang\",\"doi\":\"10.1038/s44401-025-00015-6\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Automatic differential diagnosis (DDx) involves identifying potential conditions that could explain a patient's symptoms and its accurate interpretation is of substantial significance. While large language models (LLMs) have demonstrated remarkable diagnostic accuracy, their capability to generate high-quality DDx explanations remains underexplored, largely due to the absence of specialized evaluation datasets and the inherent challenges of complex reasoning in LLMs. Therefore, building a tailored dataset and developing novel methods to elicit LLMs for generating precise DDx explanations are worth exploring. We developed the first publicly available DDx dataset, comprising expert-derived explanations for 570 clinical notes, to evaluate DDx explanations. Meanwhile, we proposed a novel framework, Dual-Inf, that could effectively harness LLMs to generate high-quality DDx explanations. To the best of our knowledge, it is the first study to tailor LLMs for DDx explanation and comprehensively evaluate their explainability. Overall, our study bridges a critical gap in DDx explanation, enhancing clinical decision-making.\",\"PeriodicalId\":520349,\"journal\":{\"name\":\"Npj health systems\",\"volume\":\"2 1\",\"pages\":\"12\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12021655/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Npj health systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1038/s44401-025-00015-6\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/4/24 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Npj health systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1038/s44401-025-00015-6","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/4/24 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

自动鉴别诊断（DDx）涉及识别可能解释患者症状的潜在条件，其准确解释具有重要意义。虽然大型语言模型（llm）已经证明了卓越的诊断准确性，但它们生成高质量DDx解释的能力仍未得到充分探索，这主要是由于缺乏专门的评估数据集以及llm复杂推理的固有挑战。因此，构建量身定制的数据集和开发新的方法来引出llm以生成精确的DDx解释是值得探索的。我们开发了第一个公开可用的DDx数据集，包括570个临床记录的专家衍生解释，以评估DDx解释。同时，我们提出了一个新的框架，Dual-Inf，它可以有效地利用llm来生成高质量的DDx解释。据我们所知，这是第一个针对DDx解释量身定制llm并全面评估其可解释性的研究。总的来说，我们的研究弥补了DDx解释的关键空白，增强了临床决策。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Explainable differential diagnosis with dual-inference large language models.

Automatic differential diagnosis (DDx) involves identifying potential conditions that could explain a patient's symptoms and its accurate interpretation is of substantial significance. While large language models (LLMs) have demonstrated remarkable diagnostic accuracy, their capability to generate high-quality DDx explanations remains underexplored, largely due to the absence of specialized evaluation datasets and the inherent challenges of complex reasoning in LLMs. Therefore, building a tailored dataset and developing novel methods to elicit LLMs for generating precise DDx explanations are worth exploring. We developed the first publicly available DDx dataset, comprising expert-derived explanations for 570 clinical notes, to evaluate DDx explanations. Meanwhile, we proposed a novel framework, Dual-Inf, that could effectively harness LLMs to generate high-quality DDx explanations. To the best of our knowledge, it is the first study to tailor LLMs for DDx explanation and comprehensively evaluate their explainability. Overall, our study bridges a critical gap in DDx explanation, enhancing clinical decision-making.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Npj health systems

自引率

0.00%

发文量