An interpretable machine learning model to predict hospitalizations
Authors: Hagar Elbatanouny, Hissam Tawfik, Tarek Khater, Anatoliy Gorbenko
Journal: Clinical eHealth, volume 8, pages 53-65
Published: 2025-04-04
DOI: 10.1016/j.ceh.2025.03.004
URL: https://www.sciencedirect.com/science/article/pii/S2588914125000140
Citations: 0
Abstract
Hospital management plays a pivotal role in ensuring the efficient delivery of medical services, especially in the face of challenges posed by pandemics such as COVID-19. This paper explores the application of machine learning techniques to the problem of predicting hospitalization during pandemics. Using a comprehensive dataset sourced from the Mexican government, several supervised learning algorithms, including Random Forest, Gradient Boosting, Support Vector Machine, K-Nearest Neighbors, and Multilayer Perceptron, are trained and evaluated to identify the factors that contribute to hospitalization. Feature importance analysis and dimensionality reduction techniques are employed to improve the models' predictive performance. The best-performing model was the Gradient Boosting algorithm, with an accuracy of 85.63% and an AUC of 0.8696. The interpretability plots showed that pneumonia contributed positively to the model's hospitalization prediction. Our analysis indicates that women aged over 45 with pneumonia and concurrent COVID-19 exhibit the highest likelihood of hospitalization. This study underscores the potential of interpretable machine learning to help hospital managers optimize resource allocation, manage hospitalization cases, and make data-driven decisions during pandemics.
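
The abstract reports two evaluation metrics, accuracy and AUC. As a minimal sketch of how such figures are typically computed for a binary hospitalization classifier, the example below implements both in plain Python. The labels, scores, 0.5 decision threshold, and function names are hypothetical illustrations, not the authors' data or code, and the paper may well have used a library implementation instead.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases whose thresholded score matches the true label."""
    predictions = [1 if s >= threshold else 0 for s in y_score]
    correct = sum(p == t for p, t in zip(predictions, y_true))
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a random
    negative case, counting ties as half (the Mann-Whitney formulation)."""
    positives = [s for s, t in zip(y_score, y_true) if t == 1]
    negatives = [s for s, t in zip(y_score, y_true) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (assumption): 1 = hospitalized, 0 = not hospitalized.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical predicted probabilities of hospitalization from some classifier.
y_score = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

toy_accuracy = accuracy(y_true, y_score)
toy_auc = roc_auc(y_true, y_score)
print(f"accuracy={toy_accuracy:.4f}  auc={toy_auc:.4f}")
```

Accuracy depends on the chosen threshold, whereas AUC measures how well the scores rank hospitalized cases above non-hospitalized ones across all thresholds, which is one reason the two are often reported together.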