Machine learning-based predictive models for perioperative major adverse cardiovascular events in patients with stable coronary artery disease undergoing noncardiac surgery.
IF 4.9 · Zone 2 (Medicine) · JCR Q1 (Computer Science, Interdisciplinary Applications)
Liang Shen, YunPeng Jin, AXiang Pan, Kai Wang, RunZe Ye, YangKai Lin, Safraz Anwar, WeiCong Xia, Min Zhou, XiaoGang Guo
Computer Methods and Programs in Biomedicine, vol. 260, 108561. Journal article, published 2024-12-13. DOI: 10.1016/j.cmpb.2024.108561 (https://doi.org/10.1016/j.cmpb.2024.108561)
Citations: 0
Abstract
Background and objective: Accurate prediction of perioperative major adverse cardiovascular events (MACEs) is crucial: it helps clinicians comprehensively assess patients' surgical risks and tailor personalized surgical and perioperative management plans, and it supports informed shared decision-making with patients and efficient allocation of medical resources. This study developed and validated a machine learning (ML) model that uses accessible preoperative clinical data to predict perioperative MACEs in patients with stable coronary artery disease (SCAD) undergoing noncardiac surgery (NCS).
Methods: We collected data from 9,171 adult SCAD patients who underwent NCS and extracted 64 preoperative variables. First, candidate data imputation, resampling, and feature selection methods were compared, and the best-performing ones were selected to handle missing values and class imbalance. Then, nine independent machine learning models (logistic regression (LR), support vector machine, Gaussian naive Bayes (GNB), random forest, gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), light gradient boosting machine, categorical boosting (CatBoost), and deep neural network) and a stacking ensemble model were constructed and compared with the validated Revised Cardiac Risk Index (RCRI). Predictive performance was evaluated using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), calibration curves, and decision curve analysis (DCA). To reduce overfitting and enhance robustness, we performed hyperparameter tuning and 5-fold cross-validation. Finally, the Shapley additive explanations (SHAP) method and partial dependence plots (PDP) were used to interpret the optimal ML model.
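The two discrimination metrics named in the Methods can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the function names `auroc` and `average_precision` and the toy labels and scores are hypothetical, and average precision is only one common estimator of AUPRC (the abstract does not say which estimator was used).

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half).
    Quadratic in sample size, which is fine for an illustration."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Average precision: precision at each distinct score threshold,
    weighted by the increase in recall at that threshold."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    pairs = sorted(zip(y_score, y_true), key=lambda t: -t[0])
    tp = 0
    ap = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume every sample tied at the current score threshold.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            j += 1
        recall = tp / n_pos
        precision = tp / j  # j samples are predicted positive so far
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Hypothetical toy data: 1 = event (e.g. MACE), 0 = no event.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))                        # 0.75
print(round(average_precision(toy_labels, toy_scores), 3))  # 0.833
```

With rare outcomes, AUPRC is usually the more demanding of the two metrics because it ignores true negatives, which is one reason to report it alongside AUROC.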
Results: Of the 9,171 patients, 514 (5.6%) developed MACEs. Twenty-four significant preoperative features were selected for model development and evaluation. All ML models performed well, with AUROC above 0.88 and AUPRC above 0.39, outperforming RCRI (AUROC = 0.716, AUPRC = 0.185; P < 0.001). The best independent model was XGBoost (AUROC = 0.898, AUPRC = 0.479). The calibration curve showed good agreement between predicted and observed MACE risk (Brier score = 0.040), and DCA showed that XGBoost had a high net benefit for predicting MACEs. The top-ranked stacking ensemble model, consisting of CatBoost, GBDT, GNB, and LR, proved to be the best (AUROC = 0.894, AUPRC = 0.485). We identified the 20 most important features using mean absolute SHAP values and depicted their effects on model predictions using PDPs.
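To put the reported Brier score and AUPRC values in context, the sketch below computes two reference points from the event rate alone: the Brier score of a constant predictor that always outputs the prevalence, and the AUPRC expected from a no-skill classifier (approximately the prevalence). It assumes the evaluation data had the same 5.6% event rate as the full cohort, which the abstract does not state; the helper `brier_score` and all variable names are illustrative.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Figures taken from the Results paragraph.
n_patients = 9171
n_events = 514
reported_brier = 0.040
reported_auprc = {"RCRI": 0.185, "XGBoost": 0.479, "Stacking": 0.485}

prevalence = n_events / n_patients

# Reference model: predict the cohort prevalence for every patient.
# Assumption: the evaluation split has the same event rate as the cohort.
outcomes = [1] * n_events + [0] * (n_patients - n_events)
baseline_brier = brier_score(outcomes, [prevalence] * n_patients)

# Brier skill score: fractional improvement over the constant predictor.
brier_skill = 1.0 - reported_brier / baseline_brier

print(f"prevalence      = {prevalence:.4f}")
print(f"baseline Brier  = {baseline_brier:.4f}")
print(f"Brier skill     = {brier_skill:.3f}")

# A no-skill classifier has an expected AUPRC close to the prevalence,
# so the ratio below is a rough "lift" over chance.
for name, auprc in reported_auprc.items():
    print(f"{name}: AUPRC lift over prevalence = {auprc / prevalence:.1f}x")
```

Under that assumption the constant predictor scores roughly 0.053, so a Brier score of 0.040 corresponds to a skill score of about 0.24. A low absolute Brier score is expected whenever events are rare, which is why the comparison with the baseline is more informative than the raw value.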
Conclusions: This study combined missing-value imputation, feature screening, imbalanced-data handling, and advanced machine learning methods to develop and validate the first ML-based perioperative MACE prediction model for patients with SCAD. The model is more accurate than RCRI and enables effective identification of high-risk patients and implementation of targeted interventions to reduce the incidence of MACEs.
About the journal:
To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; to support the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine.
Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.