Developing and validating a machine learning model to predict successful next-day extubation in the ICU

medRxiv - Intensive Care and Critical Care Medicine. Pub Date: 2024-06-30. DOI: 10.1101/2024.06.28.24309547

Samuel W Fenske, Alec Peltekian, Mengija Kang, Nikolay S Markov, Mengou Zhu, Kevin Grudzinski, Melissa J Bak, Anna Pawlowski, Vishu Gupta, Yuwei Mao, Stanislav Bratchikov, Thomas Stoeger, Luke V Rasmussen, Alok N Choudhary, Alexander V Misharin, Benjamin D Singer, GR Scott Budinger, Richard D Wunderink, Ankit Agrawal, Catherine A Gao, NU Script Study Investigators

Abstract

Background: Criteria to identify patients who are ready to be liberated from mechanical ventilation are imprecise, often resulting in prolonged mechanical ventilation or reintubation, both of which are associated with adverse outcomes. Daily protocol-driven assessment of the need for mechanical ventilation leads to earlier extubation but requires dedicated personnel. We sought to determine whether machine learning applied to the electronic health record could predict successful extubation.

Methods: We examined 37 clinical features from patients enrolled in a single-center prospective cohort study in our quaternary care medical ICU who required mechanical ventilation and underwent a bronchoalveolar lavage for known or suspected pneumonia. We also tested our models on an external test set from a community hospital ICU in our health care system. We curated electronic health record data aggregated from midnight to 8 AM and labeled extubation status. We deployed three data encoding/imputation strategies and built XGBoost, LightGBM, logistic regression, LSTM, and RNN models to predict successful next-day extubation. We evaluated each model's performance using Area Under the Receiver Operating Characteristic curve (AUROC), Area Under the Precision-Recall Curve (AUPRC), Sensitivity (Recall), Specificity, PPV (Precision), Accuracy, and F1-Score.

Results: Our internal cohort included 696 patients and 9,828 ICU days, and our external cohort had 333 patients and 2,835 ICU days. The best model (LSTM) predicted successful extubation on a given ICU day with an AUROC of 0.87 (95% CI 0.834-0.902) on the internal test set and 0.87 (95% CI 0.848-0.885) on the external test set. A logistic regression model performed similarly (AUROC 0.86 on the internal test set, 0.83 on the external test set). Across multiple model types, measures previously demonstrated to be important in determining readiness for extubation were found to be most informative, including plateau pressure and Richmond Agitation Sedation Scale (RASS) score. Our model often predicted patients to be stable for extubation in the days preceding their actual extubation, with 63.8% of predicted extubations occurring within three days of true extubation. We also tested the best model on cases of failed extubation (requiring reintubation within two days) not seen by the model during training. Our best model would have identified 35.4% (17/48) of these cases in the internal test set and 48.1% (13/27) of these cases in the external test set as unlikely to be successfully extubated.

Conclusions: Machine learning models can accurately predict the likelihood of extubation on a given ICU day from data available in the electronic health record. Predictions from these models are driven by clinical features that have been associated with successful extubation in clinical trials.
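
For readers unfamiliar with the evaluation metrics named in the Methods, the following Python sketch shows how they are conventionally defined. It is not the authors' code: the names `Confusion`, `threshold_metrics`, and `auroc` are hypothetical, the confusion-matrix counts and scores are toy values invented for illustration, and the AUROC is computed with the standard pairwise-ranking definition rather than with whatever library the study used. Only the final two lines use numbers from the abstract, to recompute the reported failed-extubation percentages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts for a binary classifier where 'positive' = successful extubation."""
    tp: int  # predicted success, extubation succeeded
    fp: int  # predicted success, extubation did not succeed
    tn: int  # predicted no success, extubation did not succeed
    fn: int  # predicted no success, extubation succeeded


def threshold_metrics(c: Confusion) -> dict:
    """Threshold-dependent metrics; zero denominators are not handled here."""
    sensitivity = c.tp / (c.tp + c.fn)  # recall
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)  # precision
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "accuracy": accuracy,
        "f1": f1,
    }


def auroc(labels: list, scores: list) -> float:
    """Probability that a random positive scores above a random negative.

    Ties count as half a win. This O(n^2) form is for clarity, not speed.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values (assumptions for illustration, not study data).
toy_metrics = threshold_metrics(Confusion(tp=40, fp=10, tn=45, fn=5))
toy_auroc = auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])

# Recomputing the percentages reported in the abstract from their counts.
internal_flagged_pct = round(100 * 17 / 48, 1)
external_flagged_pct = round(100 * 13 / 27, 1)
```

AUROC and AUPRC summarize ranking quality across all thresholds, whereas sensitivity, specificity, PPV, accuracy, and F1 depend on the probability cutoff chosen to call a day "ready for extubation". The abstract does not state that cutoff.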
