Development and validation of inpatient mortality prediction models for patients with hyperglycemic crisis using machine learning approaches.

IF 2.8 Zone 3 (Medicine) Q3 ENDOCRINOLOGY & METABOLISM

BMC Endocrine Disorders Pub Date : 2025-03-27 DOI:10.1186/s12902-025-01873-9

Rui He, Kebiao Zhang, Hong Li, Manping Gu

BMC Endocrine Disorders 25(1): 86. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11948940/pdf/

Citations: 0

Abstract

Background: Hyperglycemic crisis is one of the most common and severe complications of diabetes mellitus and is associated with a high mortality rate. Emergency admissions due to hyperglycemic crisis remain prevalent and challenging. This study aimed to develop and validate predictive models for in-hospital mortality risk among patients with hyperglycemic crisis admitted to the emergency department using various machine learning (ML) methods.

Methods: A multi-center retrospective study was conducted across six large general adult hospitals in Chongqing, western China. Patients diagnosed with hyperglycemic crisis were identified using an electronic medical record (EMR) database. Demographics, comorbidities, clinical characteristics, laboratory results, complications, and therapeutic interventions were extracted from the medical records to construct the prognostic prediction model. Seven machine learning algorithms were compared with logistic regression (LR) for predicting the risk of in-hospital mortality in patients with hyperglycemic crisis: support vector machines (SVM), random forest (RF), recursive partitioning and regression trees (RPART), extreme gradient boosting with the DART booster (XGBoost), multivariate adaptive regression splines (MARS), neural network (NNET), and adaptive boosting (AdaBoost). Stratified random sampling was used to split the data into training (80%) and validation (20%) sets. Ten-fold cross-validation was performed on the training set to optimize model hyperparameters. The sensitivity, specificity, positive and negative predictive values, area under the curve (AUC), and accuracy of all models were computed for comparative analysis.
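The stratified 80/20 split and ten-fold cross-validation described in the Methods can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the function names `stratified_split` and `stratified_folds`, the fixed seed, the rounding rule, and the stratification of the folds themselves are assumptions. The labels are synthetic and reuse only the reported class counts (121 deaths among 1668 patients).

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.8, seed=0):
    """Return (train_idx, valid_idx), sampling each class separately
    so both sets keep roughly the same outcome proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, valid_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        cut = round(len(idxs) * train_frac)  # assumed rounding rule
        train_idx.extend(idxs[:cut])
        valid_idx.extend(idxs[cut:])
    return sorted(train_idx), sorted(valid_idx)


def stratified_folds(labels, indices, k=10, seed=0):
    """Assign each index to one of k folds, dealing each class
    round-robin so every fold contains some events."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)
    return folds


# Synthetic labels: 1 = died in hospital, 0 = survived (counts from the abstract).
labels = [1] * 121 + [0] * (1668 - 121)

train_idx, valid_idx = stratified_split(labels)
folds = stratified_folds(labels, train_idx)

train_deaths = sum(labels[i] for i in train_idx)
valid_deaths = sum(labels[i] for i in valid_idx)
fold_deaths = [sum(labels[i] for i in fold) for fold in folds]
print(len(train_idx), len(valid_idx), train_deaths, valid_deaths, fold_deaths)
```

With an outcome this rare, an unstratified split could leave the validation set or individual folds with very few deaths, which is why sampling within each class is a common choice here.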

Results: A total of 1668 patients were eligible for the present study. The in-hospital mortality rate was 7.3% (121/1668). In the training set, feature importance scores were calculated for each of the eight models, and the 10 most important features were identified. In the validation set, all models except MARS demonstrated good predictive capability, with AUC values exceeding 0.9 and F1 scores between 0.632 and 0.81. Six of the seven machine learning models, all except MARS, outperformed the reference logistic regression model. RPART, RF, and SVM achieved the best performance (AUC 0.970, 0.968, and 0.968; F1 score 0.652, 0.762, and 0.762, respectively). Feature importance analysis identified novel predictors including mechanical ventilation, age, Charlson Comorbidity Index, blood gas index, first 24-hour insulin dosage, and first 24-hour fluid intake.
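The threshold-based metrics named in the abstract (sensitivity, specificity, predictive values, accuracy, F1) all derive from a 2x2 confusion matrix. The sketch below shows the standard definitions. The helper name `binary_metrics` and the first set of counts are hypothetical and are not results from the study. The second call uses the reported cohort counts to show why accuracy alone is uninformative at a 7.3% event rate: a rule that predicts survival for everyone is right about 92.7% of the time yet has an F1 of zero.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics; returns 0.0 when a ratio is undefined."""
    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall for the death class
    specificity = ratio(tn, tn + fp)
    ppv = ratio(tp, tp + fp)           # precision
    npv = ratio(tn, tn + fn)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    # F1 is the harmonic mean of precision (PPV) and recall (sensitivity).
    f1 = ratio(2 * tp, 2 * tp + fp + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical validation-set counts, chosen only for illustration.
example = binary_metrics(tp=18, fp=9, tn=300, fn=6)

# "Always predict survival" on the reported cohort (121 deaths, 1547 survivors).
baseline = binary_metrics(tp=0, fp=0, tn=1547, fn=121)

for name, m in (("example", example), ("baseline", baseline)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

AUC is not computed here because it needs the models' predicted probabilities rather than a single confusion matrix.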

Conclusion: With the exception of MARS, the machine learning algorithms exhibited excellent performance in predicting in-hospital mortality among patients with hyperglycemic crisis, and the RPART model performed best. The algorithms identified overlapping but non-identical sets of up to 10 predictors. Early identification of high-risk patients using these models could support clinical decision-making and potentially improve the prognosis of patients with hyperglycemic crisis.

Clinical trial number: Not applicable.


Source journal: BMC Endocrine Disorders (ENDOCRINOLOGY & METABOLISM)

CiteScore: 4.40

Self-citation rate: 0.00%

Articles published: 280

Review time: >12 weeks

About the journal: BMC Endocrine Disorders is an open access, peer-reviewed journal that considers articles on all aspects of the prevention, diagnosis and management of endocrine disorders, as well as related molecular genetics, pathophysiology, and epidemiology.