Comparing ensemble learning algorithms and severity of illness scoring systems in cardiac intensive care units: a retrospective study.

IF 1.1 Q2 MEDICINE, GENERAL & INTERNAL

Einstein-Sao Paulo, volume 22, eAO0467. Pub Date: 2024-10-14. eCollection Date: 2024-01-01. DOI: 10.31744/einstein_journal/2024AO0467

Beatriz Nistal-Nuño

Abstract

Background: Beatriz Nistal-Nuño designed an ensemble machine learning system for cardiac surgery patients and cardiology patients in the intensive care unit. The system combines sequences of cardiovascular and other intensive care unit physiological measurements with static features to generate a score that predicts mortality in cardiac intensive care unit patients.

■ Gradient Boosting Machine and Random Forest models were built to predict mortality in cardiac intensive care units.
■ A total of 9,761 intensive care unit stays of patients admitted under the Cardiac Surgery and Cardiac Medical services were studied.
■ The models' AUROC and AUPRC values were significantly superior to those of the seven conventional systems compared.
■ The machine learning models' calibration curves were substantially closer to the ideal line.

Objective: Logistic regression has traditionally been used to develop most tools for predicting intensive care unit mortality. This study aimed to combine risk factors shared by cardiac surgery patients and cardiology patients in the intensive care unit to develop, using machine learning, a risk score for predicting mortality in cardiac intensive care unit patients.

Methods: Gradient Boosting Machine and Distributed Random Forest models were developed from 9,761 intensive care unit stays in the MIMIC-III database. Sequential and static features were collected. The primary endpoint was prediction of intensive care unit mortality. Discrimination, calibration, and accuracy statistics were evaluated, and the predictive performance of traditional scoring systems was assessed for comparison.
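As a rough illustration of what "discrimination" and "calibration" measure, the following minimal sketch computes an AUROC by pairwise comparison and a binned calibration curve. It is not the study's pipeline: the function names `auroc` and `calibration_bins`, the equal-width binning, and the toy labels and scores are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_bins(
    y_true: Sequence[int], y_score: Sequence[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Return (mean predicted risk, observed event rate, count) per non-empty bin.

    Bins are equal-width over [0, 1]; a perfectly calibrated model has
    mean predicted risk equal to the observed rate in every bin.
    """
    acc = [[0.0, 0, 0] for _ in range(n_bins)]  # [sum of scores, events, count]
    for y, s in zip(y_true, y_score):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        acc[idx][0] += s
        acc[idx][1] += y
        acc[idx][2] += 1
    return [(total / n, events / n, n) for total, events, n in acc if n > 0]


# Toy data invented for illustration; these are not values from the study.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.30, 0.35, 0.60, 0.80, 0.90]

print(f"AUROC = {auroc(labels, scores):.4f}")
for mean_pred, obs_rate, n in calibration_bins(labels, scores, n_bins=2):
    print(f"predicted={mean_pred:.3f} observed={obs_rate:.3f} n={n}")
```

The pairwise AUROC is quadratic in the number of stays, which is fine for a small example; a rank-based formulation would be the usual choice for a cohort of several thousand stays.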

Results: The machine learning models' AUROC and AUPRC were significantly superior to those of all conventional systems for the primary endpoint (p<0.05), with an AUROC of 0.9413 for Gradient Boosting Machine and 0.9311 for Distributed Random Forest. Sensitivity was 0.6421 for Gradient Boosting Machine, 0.6 for Distributed Random Forest, and <0.3 for all conventional systems except serial SOFA (0.6316). Precision was 0.574 for Gradient Boosting Machine, 0.566 for Distributed Random Forest, and <0.5 for all conventional systems. The diagnostic odds ratio was 58.8144 for Gradient Boosting Machine, 51.2926 for Distributed Random Forest, and <34 for all conventional systems. The Brier score was 0.025 for Gradient Boosting Machine and 0.028 for Distributed Random Forest, and was worse (higher) for the traditional systems. The calibration curves of Gradient Boosting Machine and Distributed Random Forest were substantially closer to the ideal line.
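For readers less familiar with the accuracy statistics reported above, the sketch below shows their standard definitions: sensitivity, precision, and the diagnostic odds ratio come from a confusion matrix at a chosen threshold, and the Brier score is the mean squared error of the predicted probabilities. The helper names, the 0.5 threshold, and the toy data are assumptions for illustration; the abstract does not state the threshold the study used.

```python
from typing import Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), predicting death when score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of actual deaths that were flagged: TP / (TP + FN)."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Fraction of flagged stays that ended in death: TP / (TP + FP)."""
    return tp / (tp + fp)


def diagnostic_odds_ratio(tp: int, fp: int, tn: int, fn: int) -> float:
    """(TP * TN) / (FP * FN); infinite when there are no errors of one kind."""
    if fp * fn == 0:
        return float("inf")
    return (tp * tn) / (fp * fn)


def brier_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and outcome (lower is better)."""
    return sum((s - y) ** 2 for y, s in zip(y_true, y_score)) / len(y_true)


# Toy data invented for illustration; these are not values from the study.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.30, 0.35, 0.60, 0.80, 0.90]

tp, fp, tn, fn = confusion_counts(labels, scores, threshold=0.5)
print("sensitivity:", sensitivity(tp, fn))
print("precision:", precision(tp, fp))
print("diagnostic odds ratio:", diagnostic_odds_ratio(tp, fp, tn, fn))
print("Brier score:", brier_score(labels, scores))
```

Sensitivity, precision, and the diagnostic odds ratio all depend on the threshold, whereas AUROC and the Brier score do not, which is one reason the two kinds of statistics are reported side by side.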

Conclusion: The machine learning models outperformed the traditional scoring systems compared, with Gradient Boosting Machine performing best. Discrimination and calibration were excellent for Gradient Boosting Machine, followed by Distributed Random Forest. The machine learning methods also performed better on most accuracy statistics.
