Graph-Text Multi-Modal Pre-training for Medical Representation Learning

Sungjin Park, Seongsu Bae, Jiho Kim, Tackeun Kim, E. Choi
Proceedings of the ACM Conference on Health, Inference, and Learning, 2022-03-18
DOI: 10.48550/arXiv.2203.09994
Citations: 7

Abstract

As the volume of Electronic Health Records (EHR) grows sharply, there has been emerging interest in learning representations of EHR for healthcare applications. Representation learning of EHR requires appropriate modeling of its two dominant modalities: structured data and unstructured text. In this paper, we present MedGTX, a pre-trained model for multi-modal representation learning of structured and textual EHR data. MedGTX uses a novel graph encoder to exploit the graphical nature of structured EHR data, a text encoder to handle unstructured text, and a cross-modal encoder to learn a joint representation space. We pre-train our model through four proxy tasks on MIMIC-III, an open-source EHR dataset, and evaluate it on two clinical benchmarks and three novel downstream tasks that tackle real-world problems in EHR data. The results consistently show the effectiveness of pre-training the model for joint representation of both structured and unstructured information from EHR. Given the promising performance of MedGTX, we believe this work opens a new door to jointly understanding the two fundamental modalities of EHR data.
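The abstract describes a two-stream architecture: a graph encoder for structured EHR, a text encoder for clinical notes, and a cross-modal encoder that fuses the two into a joint space. The paper's actual encoders and proxy tasks are not reproduced here; the following is a minimal NumPy sketch of the general idea behind a cross-modal fusion layer, where each modality attends over the other. All names, dimensions, and the single-head attention form are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d):
    # Single-head cross-attention: tokens of one modality (queries)
    # attend over embeddings of the other modality (keys_values).
    scores = queries @ keys_values.T / np.sqrt(d)   # (n_q, n_kv)
    weights = softmax(scores, axis=-1)              # rows sum to 1
    return weights @ keys_values                    # (n_q, d)

rng = np.random.default_rng(0)
d = 8
graph_nodes = rng.normal(size=(5, d))   # toy structured-EHR node embeddings
text_tokens = rng.normal(size=(7, d))   # toy clinical-note token embeddings

# Cross-modal fusion: graph attends to text, and text attends to graph,
# producing representations informed by both modalities.
graph_fused = cross_attention(graph_nodes, text_tokens, d)
text_fused = cross_attention(text_tokens, graph_nodes, d)

print(graph_fused.shape, text_fused.shape)  # (5, 8) (7, 8)
```

In a full model these layers would be stacked with learned projections and trained end-to-end via the pre-training proxy tasks; this sketch only illustrates how a joint representation space can arise from bidirectional attention between modalities.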