Filling the gaps: leveraging large language models for temporal harmonization of clinical text across multiple medical visits for clinical prediction

medRxiv - Intensive Care and Critical Care Medicine | Pub Date: 2024-05-07 | DOI: 10.1101/2024.05.06.24306959

Inyoung Choi, Qi Long, Emily Getzen


Abstract

Electronic health records offer great promise for early disease detection, treatment evaluation, information discovery, and other important facets of precision health. Clinical notes, in particular, may contain nuanced information about a patient’s condition, treatment plans, and history that structured data may not capture. As a result, and with advancements in natural language processing, clinical notes have been increasingly used in supervised prediction models. To predict long-term outcomes such as chronic disease and mortality, it is often advantageous to leverage data occurring at multiple time points in a patient’s history. However, these data are often collected at irregular time intervals and varying frequencies, thus posing an analytical challenge. Here, we propose the use of large language models (LLMs) for robust temporal harmonization of clinical notes across multiple visits. We compare multiple state-of-the-art LLMs in their ability to generate useful information during time gaps, and evaluate performance in supervised deep learning models for clinical prediction.
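The abstract does not describe an implementation, so the following is only a minimal sketch of the problem it states: notes recorded at irregular intervals are placed onto a regular time grid, and the empty windows are the "time gaps" that a generated note would fill. The function names (`harmonize`, `gap_indices`), the 90-day window, and the sample notes are all assumptions made for illustration, not the authors' method, and no LLM call is shown.

```python
from datetime import date


def harmonize(visits, start, end, window_days=90):
    """Place dated notes onto fixed-width windows; None marks a gap.

    visits: list of (date, note_text) pairs at irregular intervals.
    The 90-day default is an illustrative assumption.
    """
    n_windows = (end - start).days // window_days + 1
    grid = [None] * n_windows
    for visit_date, note in sorted(visits):
        if not (start <= visit_date <= end):
            continue  # ignore notes outside the observation period
        idx = (visit_date - start).days // window_days
        # Several notes in one window are concatenated in date order.
        grid[idx] = note if grid[idx] is None else grid[idx] + "\n" + note
    return grid


def gap_indices(grid):
    """Return the windows with no note, i.e. candidates for generated text."""
    return [i for i, cell in enumerate(grid) if cell is None]


# Hypothetical patient with three irregularly spaced notes in one year.
visits = [
    (date(2020, 1, 10), "Note A"),
    (date(2020, 2, 20), "Note B"),
    (date(2020, 9, 5), "Note C"),
]
grid = harmonize(visits, date(2020, 1, 1), date(2020, 12, 31))
gaps = gap_indices(grid)
```

In this sketch, each index in `gaps` is a window where a model could be prompted with the neighboring notes to produce a placeholder note, so that every patient ends up with the same number of time steps for a downstream sequence model.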

