Development and external validation of machine learning models for the early prediction of malnutrition in critically ill patients: a prospective observational study.

Impact factor 3.3 · CAS Zone 3 (Medicine) · JCR Q2 (Medical Informatics)

BMC Medical Informatics and Decision Making · Pub Date: 2025-07-03 · DOI: 10.1186/s12911-025-03082-9

Yi Liu, Yehua Xu, Lixia Guo, Zhongbin Chen, Xueqin Xia, Feng Chen, Li Tang, Hua Jiang, Caixia Xie

Volume 25, issue 1, article 248. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12225150/pdf/

Citations: 0

Abstract

Background: Early detection of malnutrition in critically ill patients is crucial for timely intervention and improved clinical outcomes. However, identifying individuals at risk remains challenging due to the complexity and variability of patient conditions. This study aimed to develop and externally validate machine learning models for predicting malnutrition within 24 h of intensive care unit (ICU) admission, culminating in a web-based malnutrition prediction tool for clinical decision support.

Methods: A total of 1006 critically ill adult patients (aged ≥ 18 years) were included in the model development group, and 300 adult patients comprised the external validation group. The development data were partitioned into training (80%) and testing (20%) sets. Hyperparameters were optimized via 5-fold cross-validation on the training set, which provided internal validation without a separate validation set. External validation was performed on an independent group to assess generalizability. Predictors were selected using random forest recursive feature elimination. Seven machine learning models were trained: Extreme Gradient Boosting (XGBoost), random forest, decision tree, support vector machine (SVM), Gaussian naive Bayes, k-nearest neighbor (k-NN), and logistic regression. Each model was evaluated on accuracy, precision, recall, F1 score, area under the receiver operating characteristic curve (AUC-ROC), and area under the precision-recall curve (AUC-PR). Model interpretability was analyzed using SHapley Additive exPlanations (SHAP) to quantify feature contributions.
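The workflow described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the data are synthetic, and the number of retained features, the elimination step, the forest size, and the hyperparameter grid are all assumed values. Logistic regression (one of the seven model families named above) stands in as the final classifier so that the sketch depends only on scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the development group (NOT the study data):
# 1006 rows, 20 candidate predictors, 3 outcome classes.
X, y = make_classification(
    n_samples=1006, n_features=20, n_informative=6,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# 80/20 stratified split into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0
)

# Random forest recursive feature elimination, then a classifier.
# Keeping the selector inside the pipeline refits it within every CV fold,
# so the held-out fold never influences which predictors are kept.
pipeline = Pipeline([
    ("select", RFE(RandomForestClassifier(n_estimators=30, random_state=0),
                   n_features_to_select=8, step=4)),  # 8 and 4 are assumed
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hyperparameters tuned by 5-fold cross-validation on the training set only.
search = GridSearchCV(
    pipeline, {"clf__C": [0.1, 1.0, 10.0]}, cv=5, scoring="f1_weighted"
)
search.fit(X_train, y_train)

# The testing set is touched once, after tuning is finished.
y_pred = search.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
test_f1 = f1_score(y_test, y_pred, average="weighted")
selected_mask = search.best_estimator_.named_steps["select"].support_

print("best C:", search.best_params_["clf__C"])
print("selected predictors:", int(selected_mask.sum()))
print("test accuracy:", round(test_accuracy, 3), "weighted F1:", round(test_f1, 3))
```

An external validation set would be scored the same way as `X_test`, by calling `search.predict` on the independent data without any refitting.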

Results: Among the 1006 patients in the development group, 34.0% had moderate malnutrition and 17.9% had severe malnutrition. XGBoost performed best on the testing set, with an accuracy of 0.90 (95% CI: 0.86-0.94), precision of 0.92 (95% CI: 0.88-0.95), recall of 0.92 (95% CI: 0.89-0.95), F1 score of 0.92 (95% CI: 0.89-0.95), AUC-ROC of 0.98 (95% CI: 0.96-0.99), and AUC-PR of 0.97 (95% CI: 0.95-0.99). External validation gave lower but still robust performance, with an accuracy of 0.75 (95% CI: 0.70-0.79), precision of 0.79 (95% CI: 0.75-0.83), recall of 0.75 (95% CI: 0.70-0.79), F1 score of 0.74 (95% CI: 0.69-0.78), AUC-ROC of 0.88 (95% CI: 0.86-0.91), and AUC-PR of 0.77 (95% CI: 0.73-0.80).
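The abstract does not state how the 95% confidence intervals were obtained. One common approach is a percentile bootstrap over patients, sketched below under that assumption. The labels are toy values constructed so that accuracy is exactly 0.75 on 300 cases; they are not the study data, and the helper names `accuracy` and `bootstrap_ci` are hypothetical.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample cases with replacement and
    recompute the metric on each resample."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx],
                            [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[round((alpha / 2) * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy labels (assumption): 3 classes, 300 cases, the first 225 predicted
# correctly and the remaining 75 shifted to a wrong class.
y_true = [i % 3 for i in range(300)]
y_pred = [t if i < 225 else (t + 1) % 3 for i, t in enumerate(y_true)]

point = accuracy(y_true, y_pred)
lower, upper = bootstrap_ci(y_true, y_pred, accuracy)
print(f"accuracy = {point:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

Any other metric with the same `(y_true, y_pred)` signature, such as precision or F1, can be passed in place of `accuracy`.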

Conclusions: Machine learning models, particularly XGBoost, demonstrated promising performance for early malnutrition prediction in ICU settings. The resulting web-based tool offers a valuable resource for clinical decision support.

Trial registration: Chinese Clinical Trial Registry ChiCTR2200058286 (https://www.chictr.org.cn/bin/project/edit?pid=248690). Registered 4 April 2022. Prospectively registered.


Source journal

BMC Medical Informatics and Decision Making (category: Medicine, Medical Informatics)

CiteScore: 7.20

Self-citation rate: 5.70%

Articles published: 297

Review time: 1 month

Journal description: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles on the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.