Multi-modal fusion model for Time-Varying medical Data: Addressing Long-Term dependencies and memory challenges in sequence fusion

Impact factor: 4.0; Zone 2 (Medicine); JCR Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Moxuan Ma, Muyu Wang, Lan Wei, Xiaolu Fei, Hui Chen
Journal of Biomedical Informatics, vol. 165, Article 104823. Published 2025-04-04. DOI: 10.1016/j.jbi.2025.104823
https://www.sciencedirect.com/science/article/pii/S1532046425000528
Citations: 0

Abstract

Background

Multi-modal time-varying data continuously generated during a patient’s hospitalization reflects the patient’s disease progression. Certain patient conditions may be associated with long-term states, and capturing such long-term dependencies is a weakness of current medical multi-modal time-varying data fusion models. Daily ward round notes, which form a time series of long texts, are often neglected by models.

Objective

This study aims to develop an effective medical multi-modal time-varying data fusion model capable of extracting features from long sequences and long texts while capturing long-term dependencies.

Methods

We proposed a model called medical multi-modal fusion for long-term dependencies (MMF-LD) that fuses time-varying and time-invariant data, both tabular and textual. A progressive multi-modal fusion (PMF) strategy was introduced to address information loss in multi-modal time series fusion, particularly for long time-varying texts. With the integration of the attention mechanism, the long short-term storage memory (LSTsM) gained enhanced capacity to extract long-term dependencies. In conjunction with the temporal convolutional network (TCN), it extracted long-term features from time-varying sequences without neglecting the local contextual information of the time series. Model performance was evaluated on acute myocardial infarction (AMI) and stroke datasets for in-hospital mortality risk prediction and long length-of-stay prediction. Area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), and F1 score were used as evaluation metrics for model performance.
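To illustrate how a TCN can reach far back in a sequence while each layer still looks only at a small local window, the following is a minimal sketch of stacked causal dilated convolutions in plain Python. It is a generic illustration, not the MMF-LD implementation: the function names, the kernel `[1.0, 1.0]`, and the dilations 1, 2, 4 are assumptions chosen for readability.

```python
def causal_dilated_conv(series, kernel, dilation):
    """1-D causal convolution: out[t] uses series[t], series[t - d], series[t - 2d], ..."""
    out = []
    for t in range(len(series)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding keeps the layer causal
                acc += weight * series[idx]
        out.append(acc)
    return out


def tcn_stack(series, kernel, dilations):
    """Apply one causal convolution per dilation, feeding each output to the next layer."""
    for dilation in dilations:
        series = causal_dilated_conv(series, kernel, dilation)
    return series


def receptive_field(kernel_size, dilations):
    """Number of time steps (including the current one) visible to the top layer."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical toy input: a single impulse at t = 0 in an 8-step series.
impulse = [1.0] + [0.0] * 7
print(tcn_stack(impulse, [1.0, 1.0], [1, 2, 4]))  # the impulse reaches all 8 steps
print(receptive_field(2, [1, 2, 4]))              # 8
```

With a kernel of size 2, three layers with doubling dilations already cover 8 time steps, which is why such stacks can capture long-range structure at modest depth while the first layer still models adjacent steps.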

Results

The MMF-LD model demonstrated superior performance compared to other multi-modal time-varying data fusion models in model comparison experiments (AUROC: 0.947 and 0.918 in the AMI dataset, and 0.965 and 0.868 in the stroke dataset; AUPRC: 0.410 and 0.675, and 0.467 and 0.533; F1 score: 0.658 and 0.513, and 0.684 and 0.401). Ablation experiments confirmed that the proposed PMF strategy, LSTsM, and TCN modules all contributed to performance improvements as intended.
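For readers less familiar with the three reported metrics, here is a minimal sketch of how AUROC, AUPRC, and F1 can be computed from binary labels and predicted risk scores. The function names, the toy data, and the 0.5 decision threshold are assumptions for illustration; the AUPRC is approximated by average precision without special handling of tied scores, and the paper's exact evaluation code is not given in the abstract.

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    hits, ap = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank  # precision at each rank where a positive is recovered
    return ap / total_pos


def f1_score(labels, scores, threshold=0.5):
    """Harmonic mean of precision and recall at a fixed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)


# Hypothetical toy data: 1 = event (e.g. in-hospital death), 0 = no event.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))              # 0.75
print(average_precision(labels, scores))  # about 0.833
print(f1_score(labels, scores))           # about 0.667
```

A random classifier scores about 0.5 on AUROC regardless of class balance, whereas its expected AUPRC equals the proportion of positive cases, so the two metrics are not directly comparable in magnitude.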

Conclusions

The proposed medical multi-modal time-varying data fusion architecture addresses the challenge of forgetting time-varying long textual information in time series fusion. It exhibits strength in capturing long-term dependencies and shows stable performance across multiple datasets and tasks.


Source journal
Journal of Biomedical Informatics (Medicine; Computer Science, Interdisciplinary Applications)
CiteScore: 8.90
Self-citation rate: 6.70%
Articles published: 243
Review time: 32 days
About the journal: The Journal of Biomedical Informatics reflects a commitment to high-quality original research papers, reviews, and commentaries in the area of biomedical informatics methodology. Although we publish articles motivated by applications in the biomedical sciences (for example, clinical medicine, health care, population health, and translational bioinformatics), the journal emphasizes reports of new methodologies and techniques that have general applicability and that form the basis for the evolving science of biomedical informatics. Articles on medical devices; evaluations of implemented systems (including clinical trials of information technologies); or papers that provide insight into a biological process, a specific disease, or treatment options would generally be more suitable for publication in other venues. Papers on applications of signal processing and image analysis are often more suitable for biomedical engineering journals or other informatics journals, although we do publish papers that emphasize the information management and knowledge representation/modeling issues that arise in the storage and use of biological signals and images. System descriptions are welcome if they illustrate and substantiate the underlying methodology that is the principal focus of the report and an effort is made to address the generalizability and/or range of application of that methodology. Note also that, given the international nature of JBI, papers that deal with specific languages other than English, or with country-specific health systems or approaches, are acceptable for JBI only if they offer generalizable lessons that are relevant to the broad JBI readership, regardless of their country, language, culture, or health system.