Prediction of high-risk emergency department revisits from a machine-learning algorithm: a proof-of-concept study

Impact factor (IF): 4.1; JCR quartile: Q1; JCR category: Health Care Sciences & Services

BMJ Health & Care Informatics; Pub Date: 2024-04-01; DOI: 10.1136/bmjhci-2023-100859

Chih-Wei Sung, Joshua Ho, Cheng-Yi Fan, Ching-Yu Chen, Chi-Hsin Chen, Shao-Yung Lin, Jia-How Chang, Jiun-Wei Chen, Edward Pei-Chuan Huang

Citations: 0

Abstract

Prediction of high-risk emergency department revisits from a machine-learning algorithm: a proof-of-concept study

Background: High-risk emergency department (ED) revisit is considered an important quality indicator that may reflect an increase in complications and medical burden. However, because of its multidimensional and highly complex nature, this factor has not been comprehensively investigated. This study aimed to predict high-risk ED revisit with a machine-learning (ML) approach.

Methods: This 3-year retrospective cohort study assessed adult patients between January 2019 and December 2021 from National Taiwan University Hospital Hsin-Chu Branch with high-risk ED revisit, defined as hospital or intensive care unit admission after ED return within 72 hours. A total of 150 features were preliminarily screened, and 79 were used in the prediction model. Deep learning, random forest, extreme gradient boosting (XGBoost) and a stacked ensemble algorithm were used. The stacked ensemble model combined multiple ML models and performed model stacking as a meta-level algorithm. Confusion matrix, accuracy, sensitivity, specificity and area under the receiver operating characteristic curve (AUROC) were used to evaluate performance.

Results: Analysis was performed for 6282 eligible adult patients: 5025 (80.0%) in the training set and 1257 (20.0%) in the testing set. High-risk ED revisit occurred for 971 (19.3%) of training set patients vs 252 (20.1%) in the testing set. Leading predictors of high-risk ED revisit were age, systolic blood pressure and heart rate. The stacked ensemble model showed more favourable prediction performance (AUROC 0.82) than the other models: deep learning (0.69), random forest (0.78) and XGBoost (0.79). The stacked ensemble model also achieved favourable accuracy and specificity.

Conclusion: The stacked ensemble algorithm, which combines predictions generated by different ML algorithms, exhibited better prediction performance than the individual models. Patients of older age and with abnormal systolic blood pressure and heart rate at the index ED visit were vulnerable to high-risk ED revisit. Further studies should be conducted to externally validate the model.

Data are available on reasonable request.
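The evaluation measures named in the Methods (confusion matrix, accuracy, sensitivity, specificity and AUROC) can be illustrated with a minimal Python sketch. This is not the authors' code: the abstract does not state which software was used, the function names are hypothetical, and the toy labels, scores and the 0.5 decision threshold are invented purely for illustration and are not study data. AUROC is computed here as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count confusion-matrix cells for binary labels (1 = high-risk revisit)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:  # truth == 1 and pred == 0
            counts["fn"] += 1
    return counts


def summary_metrics(c: Dict[str, int]) -> Dict[str, float]:
    """Derive accuracy, sensitivity and specificity from confusion counts."""
    total = c["tp"] + c["fp"] + c["tn"] + c["fn"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total,
        "sensitivity": c["tp"] / (c["tp"] + c["fn"]),  # true positive rate
        "specificity": c["tn"] / (c["tn"] + c["fp"]),  # true negative rate
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Invented toy data (NOT from the study): 3 revisits among 8 patients.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.3, 0.8, 0.4, 0.2, 0.1, 0.05]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

if __name__ == "__main__":
    cells = confusion_counts(toy_labels, toy_preds)
    print(cells)
    print(summary_metrics(cells))
    print(auroc(toy_labels, toy_scores))
```

Accuracy, sensitivity and specificity depend on the chosen threshold, whereas AUROC summarises ranking quality across all thresholds, which is one reason the abstract can compare models by a single AUROC value. The pairwise loop is quadratic in sample size, so a rank-based implementation would be preferable for a cohort of several thousand patients.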

Source journal

BMJ Health & Care Informatics Multiple-

CiteScore: 6.10

Self-citation rate: 4.90%

Publication volume:

Review time: 18 weeks