Comparing Text-Based Clinical Risk Prediction in Critical Care: A Note-Specific Hierarchical Network and Large Language Models.

IF 6.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Journal of Biomedical and Health Informatics Pub Date : 2025-05-27 DOI:10.1109/JBHI.2025.3574254

Jinghui Liu, Anthony Nguyen, Daniel Capurro, Karin Verspoor

Citations: 0

Abstract

Clinical predictive analysis is a crucial task with numerous applications and has been extensively studied using machine learning approaches. Clinical notes, a vital data source, have been employed to develop natural language processing (NLP) models for risk prediction in healthcare with robust performance. However, clinical notes vary considerably in text composition, as they are written by diverse healthcare providers for different purposes, and the impact of these variations on NLP modeling remains underexplored. It also remains uncertain whether recent Large Language Models (LLMs) with instruction-following capabilities can effectively handle the risk prediction task out of the box, especially when using routinely collected clinical notes instead of polished text. We address these two research questions in the context of in-hospital mortality prediction within the critical care setting. Specifically, we propose a supervised hierarchical network with note-specific modules to account for variations across different note categories, and provide a detailed comparison with strong supervised baselines and LLMs. We benchmark 34 instruction-following LLMs using zero-shot, few-shot, and chain-of-thought prompting with diverse prompt templates. Our results demonstrate that the note-specific network delivers improved risk prediction performance compared to established supervised baselines from both measurement-based and text-based modeling. In contrast, LLMs consistently underperform on this critical task, despite their remarkable performance in other domains. This highlights important limitations and calls for caution regarding the use of LLMs for risk assessment in the critical care setting. Additionally, we show that the proposed model can be leveraged to select informative clinical notes to enhance the training of other models.
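
To make the three prompting strategies named in the abstract concrete, the following is a minimal sketch of how zero-shot, few-shot, and chain-of-thought prompts for mortality prediction might be assembled. The abstract does not give the actual templates, so the instruction wording, the `build_prompt` function, and the example notes are all hypothetical illustrations rather than the authors' method.

```python
from typing import Sequence, Tuple

# Hypothetical instruction text; the paper's real templates are not
# reproduced in the abstract, so this wording is an assumption.
INSTRUCTION = (
    "You are given clinical notes from an ICU stay. "
    "Will the patient die during this hospital admission? Answer Yes or No."
)


def build_prompt(
    notes: str,
    style: str = "zero-shot",
    examples: Sequence[Tuple[str, str]] = (),
) -> str:
    """Assemble a prompt in one of three styles: zero-shot, few-shot, cot."""
    if style not in {"zero-shot", "few-shot", "cot"}:
        raise ValueError(f"unknown style: {style}")

    parts = [INSTRUCTION]

    if style == "few-shot":
        # Few-shot: prepend labeled demonstrations before the query notes.
        for ex_notes, ex_label in examples:
            parts.append(f"Notes: {ex_notes}\nAnswer: {ex_label}")

    parts.append(f"Notes: {notes}")

    if style == "cot":
        # Chain-of-thought: request reasoning before the final answer.
        parts.append("Let's think step by step, then give the final answer.")
    else:
        parts.append("Answer:")

    return "\n\n".join(parts)


if __name__ == "__main__":
    demo = [("Stable overnight, extubated this morning.", "No")]
    print(build_prompt("Pt intubated, on two vasopressors.", "few-shot", demo))
```

Varying `INSTRUCTION` and the closing cue would be one way to produce the "diverse prompt templates" the abstract mentions; the model's free-text reply would still need to be parsed into a binary label before scoring.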

Source journal

IEEE Journal of Biomedical and Health Informatics (COMPUTER SCIENCE, INFORMATION SYSTEMS; COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS)

CiteScore

13.60

Self-citation rate

6.50%

Articles published

1151

Journal description: IEEE Journal of Biomedical and Health Informatics publishes original papers presenting recent advances where information and communication technologies intersect with health, healthcare, life sciences, and biomedicine. Topics include acquisition, transmission, storage, retrieval, management, and analysis of biomedical and health information. The journal covers applications of information technologies in healthcare, patient monitoring, preventive care, early disease diagnosis, therapy discovery, and personalized treatment protocols. It explores electronic medical and health records, clinical information systems, decision support systems, medical and biological imaging informatics, wearable systems, body area/sensor networks, and more. Integration-related topics like interoperability, evidence-based medicine, and secure patient data are also addressed.