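Predictive Modeling of Clinical Events with Mutual Enhancement Between Longitudinal Patient Records and Medical Knowledge Graph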

Xiao Xu, Xian Xu, Yuyao Sun, Xiaoshuang Liu, Xiang Li, G. Xie, Fei Wang
Published in: 2021 IEEE International Conference on Data Mining (ICDM), 2021-12-01
DOI: 10.1109/ICDM51629.2021.00089 (https://doi.org/10.1109/ICDM51629.2021.00089)
Citations: 6

Abstract

In recent years, as medical data such as Electronic Health Records (EHR) have become more widely available, a growing number of data mining models have been developed to extract data-driven insights for better human health. However, analyzing EHR poses many challenges, such as high dimensionality, temporality, and sparsity, which make data-driven models less reliable. A medical knowledge graph (MKG), which encodes comprehensive knowledge about medical concepts and relationships extracted from the medical literature, holds great promise as prior knowledge for regularizing data-driven models. Nonetheless, MKGs are typically incomplete, which limits their utility in the data mining process. In this paper, we propose MendMKG, a mutual enhancement framework for predictive modeling of clinical events with both EHR and MKG. MendMKG first uses a self-supervised learning strategy to simultaneously pre-train a graph attention network for embedding nodes and complete the MKG. It iterates between (1) an embedding-based knowledge graph completion module that derives missing edges and (2) a reconstruction module over unlabeled EHR data that selects the high-quality edges among them; the selected edges are appended to the MKG, which is then used to update the embedding model. Through these iterations, the two modules benefit each other. MendMKG then uses the pre-trained graph attention network and the updated MKG to generate visit embeddings that represent a patient's historical visits, and predicts the diagnosis in a future visit through fine-tuning. Experimental results on a real-world EHR corpus demonstrate the superiority of the proposed framework over a series of state-of-the-art baselines. The source code and knowledge graph data have been uploaded anonymously to https://github.com/1317375434/MendMKG.
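The iterative completion loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`propose_edges`, `select_edges`, `update_embeddings`, `mend_graph`), the thresholds, and the data are all hypothetical. Cosine similarity stands in for the embedding-based completion module, visit co-occurrence counts stand in for the EHR reconstruction module, and neighbour averaging stands in for updating the graph attention network.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def propose_edges(emb, edges, threshold):
    """Stand-in for graph completion: propose absent edges between similar nodes."""
    nodes = sorted(emb)
    return {
        (a, b)
        for i, a in enumerate(nodes)
        for b in nodes[i + 1:]
        if (a, b) not in edges and cosine(emb[a], emb[b]) >= threshold
    }


def select_edges(candidates, visits, min_support):
    """Stand-in for EHR reconstruction: keep edges whose endpoints co-occur in visits."""
    return {
        (a, b)
        for a, b in candidates
        if sum(1 for v in visits if a in v and b in v) >= min_support
    }


def update_embeddings(emb, edges):
    """Stand-in for retraining the embedding model: average each node with its neighbours."""
    neighbours = {n: [n] for n in emb}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    dim = len(next(iter(emb.values())))
    return {
        n: [sum(emb[m][k] for m in ns) / len(ns) for k in range(dim)]
        for n, ns in neighbours.items()
    }


def mend_graph(emb, edges, visits, threshold=0.9, min_support=2, max_iters=5):
    """Alternate proposal and selection until no new edge is accepted."""
    edges = set(edges)
    for _ in range(max_iters):
        candidates = propose_edges(emb, edges, threshold)
        accepted = select_edges(candidates, visits, min_support)
        if not accepted:
            break
        edges |= accepted
        emb = update_embeddings(emb, edges)
    return edges, emb


# Hypothetical toy data; edges are stored as alphabetically ordered pairs.
toy_emb = {
    "diabetes": [1.0, 0.1],
    "hyperglycemia": [0.9, 0.2],
    "neuropathy": [0.8, 0.3],
    "fracture": [0.0, 1.0],
}
toy_edges = {("diabetes", "hyperglycemia")}
toy_visits = [
    {"diabetes", "neuropathy"},
    {"diabetes", "neuropathy", "hyperglycemia"},
    {"fracture"},
    {"fracture", "diabetes"},
]
final_edges, final_emb = mend_graph(toy_emb, toy_edges, toy_visits)
print(sorted(final_edges))
```

In this toy setting the similarity step alone proposes two new edges, but only the one supported by at least two visits is appended. That mirrors, in a very simplified way, the idea that the EHR data filters the candidate edges before the graph and the embeddings are updated.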