LLM-DG: Leveraging large language model for enhanced disease prediction via inter-patient and intra-patient modeling
Yan Kang, Mingjian Yang, Yue Peng, Jingwen Cai, Lei Zhao, Zhan Gao, Ningshu Li, Bin Pu
Information Fusion, volume 121, Article 103145. Published 2025-03-30. DOI: 10.1016/j.inffus.2025.103145
Impact factor: 14.7 (JCR Q1, Computer Science, Artificial Intelligence)
https://www.sciencedirect.com/science/article/pii/S1566253525002180
Citations: 0
Abstract
Existing methods play a crucial role in clinical decision support by enabling disease prediction and personalized healthcare based on rapidly accumulating electronic health records (EHRs). However, these methods often overlook multi-source data integration by relying solely on specific domain knowledge, and they fail to model intricate relationships among patients because they focus on either inter-patient or intra-patient relationships in isolation. To address these limitations, we propose LLM-DG, a multi-level health event prediction framework enhanced by large language models (LLMs). Specifically, the LLM performs semantic enhancement of patient and discharge summary representations and injects domain knowledge into disease modeling, improving prediction accuracy and robustness. Moreover, LLM-DG simultaneously models inter-patient and intra-patient relationships by capturing high-order patient correlations and fusing dynamic and static patient features. At the inter-patient level, LLM-DG clusters patients based on LLM-enhanced features, identifying similar health trajectories. At the intra-patient level, it models disease evolution characteristics through a dynamic graph and extracts textual information from LLM-enhanced discharge summaries using a text encoder. Experiments on the MIMIC-III and MIMIC-IV datasets demonstrate that LLM-DG significantly outperforms state-of-the-art models, achieving a 12.39% improvement in w-F1 on the diagnosis prediction task of the MIMIC-IV dataset. Overall, LLM-DG demonstrates strong potential in complex healthcare environments by integrating patient histories and cross-patient health patterns, highlighting its applicability in clinical decision support and personalized treatment planning.
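The abstract reports its headline result in w-F1 without defining it. As a minimal sketch, assuming w-F1 denotes the support-weighted F1 score commonly used for multi-label diagnosis prediction, the metric could be computed as follows. The function name `weighted_f1` and the set-of-codes input format are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def weighted_f1(y_true, y_pred):
    """Support-weighted F1 over labels for multi-label predictions.

    Assumption: y_true and y_pred are lists of sets of diagnosis codes,
    one set per patient visit (a hypothetical input format).
    """
    tp = defaultdict(int)  # true positives per label
    fp = defaultdict(int)  # false positives per label
    fn = defaultdict(int)  # false negatives per label
    for true_labels, pred_labels in zip(y_true, y_pred):
        for label in true_labels & pred_labels:
            tp[label] += 1
        for label in pred_labels - true_labels:
            fp[label] += 1
        for label in true_labels - pred_labels:
            fn[label] += 1

    # Support of a label = number of times it truly occurs.
    support = {label: tp[label] + fn[label] for label in set(tp) | set(fn)}
    total = sum(support.values())
    if total == 0:
        return 0.0

    score = 0.0
    for label, n in support.items():
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1 = 2 * tp[label] / denom if denom else 0.0
        score += (n / total) * f1  # weight each label's F1 by its support
    return score
```

Weighting by support means frequent diagnoses dominate the score, which is one reason this metric is often reported alongside top-k recall on imbalanced code distributions.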
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.