Development and external validation of machine learning models for the early prediction of malnutrition in critically ill patients: a prospective observational study.

IF 3.3, Zone 3 (Medicine), JCR Q2 (Medical Informatics)
Yi Liu, Yehua Xu, Lixia Guo, Zhongbin Chen, Xueqin Xia, Feng Chen, Li Tang, Hua Jiang, Caixia Xie
BMC Medical Informatics and Decision Making, 25(1), 248. Published 2025-07-03. DOI: 10.1186/s12911-025-03082-9 (https://doi.org/10.1186/s12911-025-03082-9). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12225150/pdf/

Abstract

Background: Early detection of malnutrition in critically ill patients is crucial for timely intervention and improved clinical outcomes. However, identifying individuals at risk remains challenging due to the complexity and variability of patient conditions. This study aimed to develop and externally validate machine learning models for predicting malnutrition within 24 h of intensive care unit (ICU) admission, culminating in a web-based malnutrition prediction tool for clinical decision support.

Methods: A total of 1006 critically ill adult patients (aged ≥ 18 years) were included in the model development group, and 300 adult patients comprised the external validation group. The development data were partitioned into training (80%) and testing (20%) sets. Hyperparameters were optimized via 5-fold cross-validation on the training set, which provided internal validation without a separate validation set. External validation was performed on an independent group to assess generalizability. Predictors were selected using random forest recursive feature elimination. Seven machine learning models were trained: Extreme Gradient Boosting (XGBoost), random forest, decision tree, support vector machine (SVM), Gaussian naive Bayes, k-nearest neighbor (k-NN), and logistic regression. The models were evaluated on accuracy, precision, recall, F1 score, area under the receiver operating characteristic curve (AUC-ROC), and area under the precision-recall curve (AUC-PR). Model interpretability was analyzed using SHapley Additive exPlanations (SHAP) to quantify feature contributions.
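The partitioning described in the Methods can be illustrated with a minimal sketch. It assumes a simple shuffled, unstratified split and a fixed random seed; the abstract does not state whether the split was stratified or how folds were formed, and the function names `train_test_split_indices` and `k_fold_indices` are hypothetical helpers written for this illustration, not the authors' code.

```python
import random

def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and split them into (train, test) lists.

    Assumption: plain random shuffling; the study may have stratified by outcome.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]

def k_fold_indices(train_idx, k=5):
    """Yield (fit, validate) index lists so each training index is validated once."""
    folds = [train_idx[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield fit, folds[i]

# 1006 is the size of the development group reported in the abstract.
train_idx, test_idx = train_test_split_indices(1006)
cv_folds = list(k_fold_indices(train_idx, k=5))

print(len(train_idx), len(test_idx))
print([len(validate) for _, validate in cv_folds])
```

In this arrangement the testing indices are never seen during hyperparameter search: candidate settings are scored only on the held-out fold of each training split, and the testing set is used once at the end.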

Results: Among the 1006 patients in the development group, 34.0% had moderate malnutrition and 17.9% had severe malnutrition. On the testing set, the XGBoost model performed best, with an accuracy of 0.90 (95% CI: 0.86-0.94), precision of 0.92 (95% CI: 0.88-0.95), recall of 0.92 (95% CI: 0.89-0.95), F1 score of 0.92 (95% CI: 0.89-0.95), AUC-ROC of 0.98 (95% CI: 0.96-0.99), and AUC-PR of 0.97 (95% CI: 0.95-0.99). On external validation, the model achieved an accuracy of 0.75 (95% CI: 0.70-0.79), precision of 0.79 (95% CI: 0.75-0.83), recall of 0.75 (95% CI: 0.70-0.79), F1 score of 0.74 (95% CI: 0.69-0.78), AUC-ROC of 0.88 (95% CI: 0.86-0.91), and AUC-PR of 0.77 (95% CI: 0.73-0.80).
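The abstract reports 95% confidence intervals but does not say how they were computed. One common approach is the percentile bootstrap, sketched below as an assumption rather than the authors' method. The labels are synthetic toy data, not study data, and `bootstrap_ci` is a hypothetical helper name.

```python
import random

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def bootstrap_ci(y_true, y_pred, metric=accuracy, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample cases with replacement,
    recompute the metric each time, and take the empirical quantiles."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in sample],
                            [y_pred[i] for i in sample]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic example: 200 cases, the first 40 predicted incorrectly.
y_true = [i % 2 for i in range(200)]
y_pred = [1 - t if i < 40 else t for i, t in enumerate(y_true)]

point = accuracy(y_true, y_pred)
lo, hi = bootstrap_ci(y_true, y_pred)
print(f"accuracy = {point:.2f}, 95% CI: {lo:.2f}-{hi:.2f}")
```

The same function accepts any metric with the signature `metric(y_true, y_pred)`, so precision, recall, or F1 could be passed in place of `accuracy`.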

Conclusions: Machine learning models, particularly XGBoost, demonstrated promising performance in early malnutrition prediction in ICU settings. The resulting web-based tool offers a valuable resource for clinical decision support.

Trial registration: Chinese Clinical Trial Registry ChiCTR2200058286 (https://www.chictr.org.cn/bin/project/edit?pid=248690). Registered 4 April 2022. Prospectively registered.

Source journal: BMC Medical Informatics and Decision Making
CiteScore: 7.20; self-citation rate: 5.70%; articles published: 297; review time: 1 month
About the journal: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.