Embedding-based link predictions to explore latent comorbidity of chronic diseases.

IF 4.7; Zone 3 (Medicine); Q1, MEDICAL INFORMATICS
Health Information Science and Systems Pub Date : 2022-12-30 eCollection Date: 2023-12-01 DOI:10.1007/s13755-022-00206-7
Haohui Lu, Shahadat Uddin
Health Information Science and Systems 11(1):2. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9803807/pdf/

Abstract

Purpose: Comorbidity describes a patient having more than one chronic disease at the same time. It is a significant health issue that affects people worldwide. This study aims to use machine learning and graph theory to predict the comorbidity of chronic diseases.

Methods: A patient-disease bipartite graph was constructed from administrative claim data, and the bipartite graph projection approach was used to create the comorbidity network. For the link prediction task, three embedding-based graph machine learning models (node2vec, graph neural networks and a hand-crafted approach), each with different variants, were applied to the comorbidity network and their performance compared. Three commonly used similarity-based link prediction approaches (Jaccard coefficient, Adamic-Adar index and resource allocation index) were also included in the comparison.
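The projection step and the three similarity indices named above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy `claims` data, the disease names and the function names are hypothetical, and the abstract does not say how the authors weighted or filtered edges, so the sketch uses an unweighted projection.

```python
from collections import defaultdict
from itertools import combinations
from math import log

# Hypothetical toy claims data: patient -> set of chronic diseases.
claims = {
    "p1": {"diabetes", "hypertension"},
    "p2": {"diabetes", "hypertension", "ckd"},
    "p3": {"hypertension", "ckd"},
    "p4": {"diabetes", "copd"},
}

def project_to_diseases(patient_diseases):
    """One-mode projection: link two diseases if at least one patient has both."""
    adj = defaultdict(set)
    for diseases in patient_diseases.values():
        for a, b in combinations(sorted(diseases), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def jaccard(adj, u, v):
    """Shared neighbours divided by all neighbours of u or v."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(adj, u, v):
    """Sum of 1 / log(degree) over common neighbours.

    A common neighbour of two distinct nodes has degree >= 2, so log(degree) > 0.
    """
    return sum(1.0 / log(len(adj[w])) for w in adj[u] & adj[v])

def resource_allocation(adj, u, v):
    """Sum of 1 / degree over common neighbours."""
    return sum(1.0 / len(adj[w]) for w in adj[u] & adj[v])

network = project_to_diseases(claims)

# Score a disease pair that is not yet linked in the toy network.
pair = ("copd", "hypertension")
scores = {
    "jaccard": jaccard(network, *pair),
    "adamic_adar": adamic_adar(network, *pair),
    "resource_allocation": resource_allocation(network, *pair),
}
```

In this toy network the unlinked pair shares a single neighbour, so all three scores are driven by that neighbour's degree. A higher score marks the pair as a more likely latent comorbidity.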

Results: The embedding-based hand-crafted features technique outperformed the remaining similarity-based and embedding-based models. In particular, the hand-crafted technique with the extreme gradient boosting classifier achieved the highest accuracy (91.67%), followed by the same technique with the logistic regression classifier (90.26%). For this shallow embedding method, the Jaccard coefficient and the degree centrality of the original chronic disease were the most important features for comorbidity prediction.
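As a rough illustration of what a hand-crafted feature vector for a candidate disease pair might look like, the sketch below combines the two features the results single out: the Jaccard coefficient and degree centrality. The full feature set is not given in the abstract, so the choice of three features, the toy `network` and the function names are all assumptions. In the study, vectors of this kind would be passed to a classifier such as extreme gradient boosting or logistic regression, which is omitted here.

```python
# Hypothetical comorbidity network as an adjacency map (disease -> neighbours).
network = {
    "diabetes": {"hypertension", "ckd", "copd"},
    "hypertension": {"diabetes", "ckd"},
    "ckd": {"diabetes", "hypertension"},
    "copd": {"diabetes"},
}

def degree_centrality(adj, node):
    """Degree divided by the maximum possible degree (n - 1)."""
    return len(adj[node]) / (len(adj) - 1)

def jaccard(adj, u, v):
    """Shared neighbours divided by all neighbours of u or v."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def pair_features(adj, u, v):
    """Feature vector for a candidate link (u, v); the feature choice is assumed."""
    return [
        jaccard(adj, u, v),
        degree_centrality(adj, u),
        degree_centrality(adj, v),
    ]

features = pair_features(network, "copd", "hypertension")
```

Each candidate pair yields one such row. Labelled rows (linked versus unlinked pairs) would then form the training table for the classifier.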

Conclusion: The proposed framework can be used to predict the comorbidity of chronic disease at an early stage of hospital admission. Thus, the prediction outcome could be valuable for medical practice, giving healthcare providers more control over their services and lowering expenses.

Source journal: Health Information Science and Systems
CiteScore: 11.30
Self-citation rate: 5.00%
Articles published: 30
Journal description: Health Information Science and Systems is a multidisciplinary journal that integrates artificial intelligence, computer science and information technology with health science and services. It embraces information science research coupled with topics related to the modeling, design, development, integration and management of health information systems, smart health, artificial intelligence in medicine, computer-aided diagnosis and medical expert systems. The scope includes: (i) smart health, artificial intelligence in medicine, computer-aided diagnosis, medical image processing and medical expert systems; (ii) medical big data and medical, health and biomedical information resources such as patient medical records, devices and equipment, and software and tools to capture, store, retrieve, process, analyze and optimize the use of information in the health domain; (iii) data management, data mining and knowledge discovery, all of which play a key role in decision making, management of public health, examination of standards, and privacy and security issues; (iv) development of new architectures and applications for health information systems.