Prediction of mortality in sepsis patients using stacked ensemble machine learning algorithm.

Journal of Postgraduate Medicine | Pub Date: 2024-10-01 | Epub Date: 2024-12-06 | DOI: 10.4103/jpgm.jpgm_357_24
M Babu, M Sappani, M Joy, V K Chandiraseharan, L Jeyaseelan, T D Sudarsanam
Pages: 209-216 | Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11722707/pdf/
Prediction of mortality in sepsis patients using stacked ensemble machine learning algorithm.

Introduction: Machine learning (ML) has been applied to predicting outcomes following sepsis. This study aims to assess the utility of a stacked ensemble algorithm in predicting mortality.

Methods: The study cohort comprised adults admitted with sepsis to a medical unit of a tertiary care hospital. The data were divided into a training data set (70%) and a test data set (30%). The Boruta algorithm was used to identify important features. In the first phase of the stacked ensemble model, weak learners such as random forest (RF), support vector machine (SVM), elastic net (GLM net), and gradient boosting machine (GBM) were trained. In the second phase, an SVM was used as the meta-learner to combine the results of all weak learners. All models were validated using the test data.
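The two-phase design described in the Methods can be sketched with scikit-learn. This is a minimal illustration and not the authors' code: the data are synthetic (generated to mimic only the cohort size and the roughly 27% event rate), every hyperparameter is an assumption, elastic net is stood in for by an elastic-net-penalized logistic regression, and the Boruta feature-selection step is omitted because it is not part of scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the cohort (assumption): 1,453 rows, ~27% positives.
X, y = make_classification(
    n_samples=1453, n_features=20, n_informative=5,
    weights=[0.73, 0.27], random_state=0,
)

# 70% training / 30% test split, stratified on the outcome.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Phase 1: the four base learners named in the abstract.
# Hyperparameters are illustrative assumptions, not the study's settings.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ("enet", make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=0.5, max_iter=5000),
    )),
    ("gbm", GradientBoostingClassifier(random_state=0)),
]

# Phase 2: an SVM meta-learner fitted on cross-validated
# predictions of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=make_pipeline(StandardScaler(), SVC()),
    cv=5,
)
stack.fit(X_train, y_train)

# Evaluate on the held-out 30%.
test_auc = roc_auc_score(y_test, stack.decision_function(X_test))
print(f"Stacked ensemble test AUC: {test_auc:.3f}")
```

Because the meta-learner is trained on out-of-fold predictions (`cv=5`), it does not see base-learner outputs that were fitted on the same rows. The AUC printed here reflects the synthetic data only and says nothing about the study's results.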

Results: In our cohort of 1,453 patients, the mortality rate was 27% (95% confidence interval [CI]: 25, 29). The Boruta algorithm identified inotrope use and assisted ventilation as the most important predictors of mortality. The random forest outperformed the other algorithms (area under the curve [AUC]: 97.91%). The AUCs for the other models were 95.21% (SVM), 93.67% (GBM), and 91.42% (GLM net). However, the stacked ensemble of all the above models had an AUC of 92.14%. In the test data set, the accuracy of all methods, including the RF, decreased (92.6% to 85.5%).
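The reported confidence interval for the mortality rate can be checked with a short calculation. The sketch below assumes a normal-approximation (Wald) interval and uses the rounded rate of 27%, since the abstract gives neither the exact number of deaths nor the interval method the authors used.

```python
import math
from statistics import NormalDist

n = 1453   # cohort size reported in the abstract
p = 0.27   # mortality rate as reported (rounded; exact count not given)

# Two-sided 95% critical value of the standard normal distribution.
z = NormalDist().inv_cdf(0.975)

# Wald interval: p +/- z * sqrt(p * (1 - p) / n)  (assumed method).
se = math.sqrt(p * (1 - p) / n)
lower, upper = p - z * se, p + z * se

print(f"95% CI: {lower * 100:.1f}% to {upper * 100:.1f}%")
```

Under these assumptions the interval rounds to 25% and 29%, which agrees with the figures reported in the Results.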

Conclusions: The random forest showed high accuracy in the training data and moderate accuracy in the test data. We suggest developing more regional open-access intensive care databases, which could help make machine learning a greater support for healthcare personnel.
