Multiple disease diagnoses using heterogeneous EHR curated knowledge graph and machine learning models
Authors: Shivani Dhiman, Anjali Thukral, Punam Bedi
Journal: Applied Intelligence, vol. 55, no. 7 (JCR Q2, Computer Science, Artificial Intelligence; impact factor 3.4)
Published: 2025-04-03
DOI: 10.1007/s10489-024-05952-7
URL: https://link.springer.com/article/10.1007/s10489-024-05952-7
Citation count: 0
Abstract
Artificial Intelligence (AI) can play a significant role in assisting healthcare professionals with disease diagnosis, a critical step towards a patient’s treatment. Most research on disease diagnosis systems predicts the presence or absence of a single given disease in a patient. Only a few studies address multiple disease diagnosis, i.e., detecting the presence of more than one disease at the same time. In this paper, we propose a framework for diagnosing multiple diseases using a Knowledge Graph (KG), knowledge embeddings and Machine Learning (ML). The KG is created to semantically organize heterogeneous clinical details extracted from Electronic Health Records (EHRs). Additionally, we present a detailed comparison and analysis of three disease diagnosis systems, Single Disease Single Diagnosis (SDSD), Multiple Disease Single Diagnosis (MDSD), and Multiple Disease Multiple Diagnosis (MDMD), using the MIMIC-III dataset on Chronic Heart Failure (CHF), Acute Respiratory Failure (ARF) and Acute Kidney Failure (AKF). These systems have been implemented and analysed with different ML algorithms, such as Logistic Regression (LR), Random Forest (RF), Naïve Bayes (NB), and Support Vector Machine (SVM). Besides estimating the probability of a patient having multiple diseases at a time, the MDMD system achieves results comparable to SDSD and MDSD, as evaluated using the Area Under the Receiver Operating Characteristic curve (AUROC) and the Area Under the Precision-Recall Curve (AUPRC). The MDMD system based on the proposed framework for multiple disease diagnosis predicts CHF, ARF and AKF in 91%, 74% and 79% of positive cases, respectively.
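To make the multi-disease setting concrete, the following is a minimal sketch of how a multi-label diagnosis model could be trained and scored per disease with AUROC and AUPRC. It is not the authors' implementation. The data are synthetic (not MIMIC-III), the random feature vectors only stand in for patient embeddings derived from a knowledge graph, and the choice of scikit-learn's `MultiOutputClassifier` around Logistic Regression is an assumption made for illustration. The function name `evaluate_mdmd` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier

# Label names taken from the abstract; the data below are synthetic.
DISEASES = ["CHF", "ARF", "AKF"]


def evaluate_mdmd(seed=0):
    """Fit one multi-label model and score each disease separately."""
    # ASSUMPTION: synthetic features stand in for KG-derived patient embeddings.
    X, Y = make_multilabel_classification(
        n_samples=600,
        n_features=32,
        n_classes=len(DISEASES),
        n_labels=2,
        random_state=seed,
    )
    X_train, X_test, Y_train, Y_test = train_test_split(
        X, Y, test_size=0.3, random_state=seed
    )

    # One binary Logistic Regression per disease, trained on shared features.
    model = MultiOutputClassifier(LogisticRegression(max_iter=1000))
    model.fit(X_train, Y_train)

    # predict_proba returns one (n_samples, 2) array per label;
    # column 1 holds the probability of the positive class.
    probs = np.column_stack([p[:, 1] for p in model.predict_proba(X_test)])

    scores = {}
    for j, name in enumerate(DISEASES):
        scores[name] = {
            "auroc": roc_auc_score(Y_test[:, j], probs[:, j]),
            # Average precision is a common summary of the precision-recall curve.
            "auprc": average_precision_score(Y_test[:, j], probs[:, j]),
        }
    return probs, scores


if __name__ == "__main__":
    _, results = evaluate_mdmd()
    for disease, metrics in results.items():
        print(disease, {k: round(v, 3) for k, v in metrics.items()})
```

Each row of `probs` gives one patient's predicted probability for all three diseases at once, which is the property that distinguishes a multiple-diagnosis output from running a separate single-disease predictor per condition. The scores printed here describe the synthetic data only and say nothing about the results reported in the paper.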
About the journal:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.