Explainable machine learning models for mortality prediction in patients with sepsis in tertiary care hospital ICU in low- to middle-income countries
Authors: Saumya Diwan, Vinay Gandhi, Esha Baidya Kayal, Puneet Khanna, Amit Mehndiratta
Journal: Intensive Care Medicine Experimental, 13(1):56 (JCR Q2, Critical Care Medicine; impact factor 2.8)
Published: 2025-06-03
DOI: https://doi.org/10.1186/s40635-025-00765-5
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12133658/pdf/
Citations: 0
Abstract
Introduction: Predicting mortality in patients with sepsis remains challenging because of the condition's complex nature. Sepsis is an even more prevalent health problem in low- and middle-income countries, where it demands costly treatment and management. This study proposes an explainable artificial intelligence-based approach to mortality prediction for patients with sepsis admitted to the intensive care unit (ICU).
Methods: A total of 500 patients with sepsis (male:female = 262:238, age = 45.96 ± 20.92 years) were analyzed retrospectively. We used the SHapley Additive exPlanations (SHAP) method to gain insight into what a preliminary model learned from a wide array of demographic, clinical, radiological, and laboratory features. These insights were used for feature selection, retaining the top t = 80% feature spread, and to derive empirical findings from feature dependence plots that could find application in peripheral hospital settings. Four machine learning classifiers (Random Forest, XGBoost, Extra Trees, and Gradient Boosting) were trained on the selected influential feature set for the binary classification task (discharge from ICU versus death in ICU).
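The abstract does not define "top t = 80% feature spread" precisely. One plausible reading, assumed here, is that features are ranked by mean absolute SHAP value and the smallest ranked prefix whose cumulative share of total importance reaches 80% is kept. The sketch below illustrates that reading in plain Python; the function name `select_top_features`, the feature names, and the importance values are all hypothetical and are not taken from the study.

```python
def select_top_features(mean_abs_shap, threshold=0.80):
    """Keep the smallest ranked prefix of features whose cumulative
    share of total importance reaches `threshold` (assumed reading of
    the paper's "top t = 80% feature spread")."""
    total = sum(mean_abs_shap.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")

    # Rank features from most to least important.
    ranked = sorted(mean_abs_shap.items(), key=lambda kv: kv[1], reverse=True)

    selected, cumulative = [], 0.0
    for name, value in ranked:
        selected.append(name)
        cumulative += value / total
        # Small tolerance guards against floating-point round-off.
        if cumulative >= threshold - 1e-12:
            break
    return selected


# Illustrative mean |SHAP| values only; hypothetical, not study data.
example_importances = {
    "lactate": 0.30,
    "age": 0.25,
    "creatinine": 0.20,
    "platelets": 0.15,
    "heart_rate": 0.10,
}

selected = select_top_features(example_importances, threshold=0.80)
print(selected)
```

With these toy values the first three features cover 75% of the total importance, so a fourth is needed to cross the 80% threshold. In practice the importances would come from a SHAP explainer applied to the preliminary model rather than from a hand-written dictionary.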
Results: The Extra Trees classifier showed the best overall performance, with AUROC 0.87 (95% CI 0.80-0.93), accuracy 0.79 (95% CI 0.71-0.86), F1 score 0.78 (95% CI 0.69-0.86), precision 0.88 (95% CI 0.78-0.98), and recall 0.70 (95% CI 0.57-0.82). All four models performed well on the hold-out test set, with AUROC scores ranging from 0.81 (CI 0.73-0.89) to 0.87 (CI 0.80-0.93) and F1 scores ranging from 0.74 (CI 0.64-0.83) to 0.78 (CI 0.69-0.86), and were stable over fivefold cross-validation prior to testing.
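The reported metrics can be related to one another with a short calculation. F1 is the harmonic mean of precision and recall, so the reported precision of 0.88 and recall of 0.70 imply an F1 of about 0.78, which matches the reported value. The abstract does not say how the confidence intervals were obtained; the sketch below uses a percentile bootstrap over test-set predictions as one common choice, which is an assumption. The labels and predictions are toy data, and all function names are hypothetical.

```python
import random


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


def bootstrap_ci(y_true, y_pred, metric_name, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed method, not from the paper)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        # Resample test cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        sample = classification_metrics(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
        stats.append(sample[metric_name])
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Consistency check on the reported precision and recall.
reported_f1 = f1_from(0.88, 0.70)
print(round(reported_f1, 2))

# Toy labels and predictions; hypothetical, not study data.
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 7 + [0] * 3 + [0] * 9 + [1] * 1
metrics = classification_metrics(y_true, y_pred)
f1_low, f1_high = bootstrap_ci(y_true, y_pred, "f1")
print(metrics, (f1_low, f1_high))
```

A percentile bootstrap makes no distributional assumption about the metric, which suits small test sets like the one implied here, though the interval width depends heavily on the number of test cases.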
Conclusions: The proposed approach could provide early estimates to support prognostication and outcome prediction for patients with sepsis in low-resource settings. This could aid clinical decision-making, resource allocation, and research into new treatment modalities.