Racial Differences in Accuracy of Predictive Models for High-Flow Nasal Cannula Failure in COVID-19

Critical Care Explorations. Pub Date: 2024-03-01. DOI: 10.1097/cce.0000000000001059

Philip Yang, I.A. Gregory, Chad Robichaux, Andre L. Holder, Greg S. Martin, Annette M. Esper, R. Kamaleswaran, Judy W. Gichoya, S. Bhavani


Abstract


Racial Differences in Accuracy of Predictive Models for High-Flow Nasal Cannula Failure in COVID-19

Objectives: To develop and validate machine learning (ML) models to predict high-flow nasal cannula (HFNC) failure in COVID-19, compare their performance to the respiratory rate-oxygenation (ROX) index, and evaluate model accuracy by self-reported race.

Design: Retrospective cohort study.

Setting: Four Emory University Hospitals in Atlanta, GA.

Patients: Adult patients hospitalized with COVID-19 between March 2020 and April 2022 who received HFNC therapy within 24 hours of ICU admission.

Interventions: None.

Measurements and Main Results: Four types of supervised ML models were developed for predicting HFNC failure (defined as intubation or death within 7 days of HFNC initiation), using routine clinical variables from the first 24 hours of ICU admission. Models were trained on the first 60% (n = 594) of admissions and validated on the latter 40% (n = 390) of admissions to simulate prospective implementation. Among the 984 patients included, 317 (32.2%) developed HFNC failure. The eXtreme Gradient Boosting (XGB) model had the highest area under the receiver operating characteristic curve (AUROC) for predicting HFNC failure (0.707) and was the only model that performed significantly better than the ROX index (AUROC 0.616). The XGB model performed significantly worse in Black patients than in White patients (AUROC 0.663 vs. 0.808, p = 0.02). Racial differences in the XGB model were reduced and no longer statistically significant when the analysis was restricted to patients with nonmissing arterial blood gas data, and when the XGB model was developed to predict mortality (rather than the composite outcome of failure, which could be influenced by biased clinical decisions for intubation).

Conclusions: Our XGB model had better discrimination for predicting HFNC failure in COVID-19 than the ROX index, but its predictive accuracy differed by race. Further studies are needed to understand and mitigate potential sources of bias in clinical ML models and to improve their equitability.
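The evaluation steps named in the abstract (a chronological train/validation split, the ROX index as a comparator, and AUROC computed separately by self-reported race) can be illustrated with a minimal Python sketch. This is not the authors' code. The record fields (admit_date, failed, risk, race) and all function names are hypothetical. The ROX formula used here is the commonly cited definition, (SpO2 / FiO2) / respiratory rate, which the abstract itself does not spell out. The abstract also does not state the exact split rule (60% of 984 is 590, not the reported 594), so the row-count cut below is an assumption.

```python
def rox_index(spo2_pct, fio2_frac, resp_rate):
    """ROX index as commonly defined: (SpO2 / FiO2) / respiratory rate.

    spo2_pct is in percent (e.g. 96), fio2_frac is a fraction (e.g. 0.6),
    resp_rate is breaths per minute. Lower values suggest higher risk of
    HFNC failure, so negate it before using it as a failure score.
    """
    if fio2_frac <= 0 or resp_rate <= 0:
        raise ValueError("FiO2 and respiratory rate must be positive")
    return (spo2_pct / fio2_frac) / resp_rate


def temporal_split(records, train_frac=0.6, date_key="admit_date"):
    """Sort by admission date; earliest rows train, latest rows validate.

    Assumption: the cut is made by row count. The study's exact rule
    is not given in the abstract.
    """
    ordered = sorted(records, key=lambda r: r[date_key])
    cut = int(len(ordered) * train_frac)
    return ordered[:cut], ordered[cut:]


def auroc(labels, scores):
    """AUROC as the probability that a random positive case outscores
    a random negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auroc_by_group(records, group_key="race", label_key="failed", score_key="risk"):
    """Compute AUROC separately within each subgroup of the validation set."""
    groups = {}
    for r in records:
        groups.setdefault(r[group_key], []).append(r)
    return {
        g: auroc([r[label_key] for r in rows], [r[score_key] for r in rows])
        for g, rows in groups.items()
    }
```

In use, a model would be fit on the first list returned by temporal_split, its predicted risks stored under the hypothetical risk field of the validation records, and auroc_by_group called on those records. Testing whether two subgroup AUROCs differ significantly (the abstract reports p = 0.02) needs an additional method, such as a bootstrap, which this sketch does not include.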
