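Development and multicenter validation of machine learning models for predicting postoperative pulmonary complications after neurosurgery.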

IF 7.3, Zone 3 (Medicine), Q1 MEDICINE, GENERAL & INTERNAL

Chinese Medical Journal Pub Date : 2025-09-05 Epub Date: 2025-02-13 DOI:10.1097/CM9.0000000000003433

Ming Xu, Wenhao Zhu, Siyu Hou, Hongzhi Xu, Jingwen Xia, Liyu Lin, Hao Fu, Mingyu You, Jiafeng Wang, Zhi Xie, Xiaohong Wen, Yingwei Wang

Citations: 0

Abstract

Development and multicenter validation of machine learning models for predicting postoperative pulmonary complications after neurosurgery.

Background: Postoperative pulmonary complications (PPCs) are major adverse events in neurosurgical patients. This study aimed to develop and validate machine learning models predicting PPCs after neurosurgery.

Methods: PPCs were defined according to the European Perioperative Clinical Outcome standards as occurring within 7 postoperative days. Data of cases meeting the inclusion/exclusion criteria were extracted from the anesthesia information management system to create three datasets: the development dataset (Huashan Hospital, Fudan University, 2018 to 2020), the temporal validation dataset (Huashan Hospital, Fudan University, 2021), and the external validation dataset (three other hospitals, 2023). Machine learning models of six algorithms were trained using either 35 retrievable and plausible features or the 11 features selected by Lasso regression. Temporal validation was conducted for all models, and the 11-feature models were also externally validated. Independent risk factors were identified, and feature importance in the top models was analyzed.
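The three-way split described in the Methods can be sketched as a simple rule on site and year. This is only an illustration under stated assumptions: the field names `site` and `year`, the label `"Huashan"`, and the toy records are hypothetical and are not taken from the study's data or code.

```python
from collections import Counter

HOME_SITE = "Huashan"  # hypothetical label for the development hospital


def assign_dataset(site, year):
    """Return the dataset a case belongs to, or None if it falls outside all three."""
    if site == HOME_SITE:
        if 2018 <= year <= 2020:
            return "development"
        if year == 2021:
            return "temporal_validation"
        return None
    # Cases from any other hospital count only if they are from 2023.
    if year == 2023:
        return "external_validation"
    return None


# Toy records, made up for illustration only.
cases = [
    {"site": "Huashan", "year": 2018},
    {"site": "Huashan", "year": 2020},
    {"site": "Huashan", "year": 2021},
    {"site": "Hospital B", "year": 2023},
    {"site": "Hospital C", "year": 2022},
]

counts = Counter(assign_dataset(c["site"], c["year"]) for c in cases)
print(counts)
```

Splitting by calendar year rather than at random keeps later cases out of training, which is what makes the 2021 set a temporal validation rather than a second internal one.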

Results: PPCs occurred in 712 of 7533 (9.5%), 258 of 2824 (9.1%), and 207 of 2300 (9.0%) patients in the development, temporal validation, and external validation datasets, respectively. During cross-validation training, all models except Bayes demonstrated good discrimination with an area under the receiver operating characteristic curve (AUC) of 0.840. In temporal validation of the full-feature models, the deep neural network (DNN) performed best, with an AUC of 0.835 (95% confidence interval [CI]: 0.805-0.858) and a Brier score of 0.069, followed by logistic regression (LR), random forest, and XGBoost. The 11-feature models performed comparably to the full-feature models, with very close but statistically significantly lower AUCs; DNN and LR were the top models in both temporal and external validation. An 11-feature nomogram based on the LR algorithm outperformed the minimally modified Assess respiratory RIsk in Surgical patients in CATalonia (ARISCAT) and Local Assessment of Ventilatory Management During General Anesthesia for Surgery (LAS VEGAS) scores, with a higher AUC (LR: 0.824, ARISCAT: 0.672, LAS VEGAS: 0.663). Independent risk factors identified by multivariate LR mostly overlapped with the Lasso-selected features but were not consistent with the features ranked as important by the Shapley additive explanation (SHAP) method for the LR model.
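For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how AUC (discrimination) and the Brier score (overall probabilistic accuracy) can be computed from predicted probabilities. The labels and probabilities are toy values invented for illustration and do not come from the study; the function names `auc` and `brier_score` are likewise hypothetical, and the study's own implementation is not described in the abstract.

```python
def auc(y_true, y_prob):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy data: 1 = complication occurred, 0 = no complication (not study data).
y_true = [0, 0, 0, 0, 1, 0, 1, 0, 0, 1]
y_prob = [0.05, 0.10, 0.20, 0.02, 0.60, 0.30, 0.25, 0.08, 0.15, 0.70]

toy_auc = auc(y_true, y_prob)
toy_brier = brier_score(y_true, y_prob)
print(f"AUC = {toy_auc:.3f}, Brier = {toy_brier:.3f}")
```

A higher AUC means better ranking of patients with complications above those without, while a lower Brier score means predicted probabilities sit closer to observed outcomes. With an event rate near 9%, as in these datasets, a small Brier score should be read against the score of a model that always predicts the base rate.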

Conclusions: The developed models, especially the DNN model and the nomogram, had good discrimination and calibration, and could be used for predicting PPCs in neurosurgical patients. The establishment of machine learning models and the ascertainment of risk factors might assist clinical decision support for improving surgical outcomes.

Trial registration: ChiCTR 2100047474; https://www.chictr.org.cn/showproj.html?proj=128279

Source journal

Chinese Medical Journal (Medicine: General & Internal)

CiteScore: 9.80
Self-citation rate: 4.90%
Articles published: 19245
Review time: 6 months

About the journal: The Chinese Medical Journal (CMJ) is published semimonthly in English by the Chinese Medical Association and is a peer-reviewed general medical journal for all doctors, researchers, and health workers, regardless of their medical specialty or type of employment. Established in 1887, it is the oldest medical periodical in China and is distributed worldwide. The journal functions as a window into China's medical sciences, reflects advances in China's medical science and technology, and serves the objective of international academic exchange. The journal includes Original Articles, Editorial, Review Articles, Medical Progress, Brief Reports, Case Reports, Viewpoint, Clinical Exchange, Letter, and News, etc. CMJ is abstracted or indexed in many databases, including Biological Abstracts, Chemical Abstracts, Index Medicus/Medline, Science Citation Index (SCI), Current Contents, Cancerlit, Health Plan & Administration, Embase, Social Scisearch, Aidsline, Toxline, Biocommercial Abstracts, Arts and Humanities Search, Nuclear Science Abstracts, Water Resources Abstracts, Cab Abstracts, Occupation Safety & Health, etc. In 2007, the journal's SCI impact factor was 0.636 and its total citation count was 2315.