非平衡样本冠状动脉搭桥术后院内死亡率预测的特征选择策略

K. Shakhgeldyan, B. Geltser, V. Rublev, Basil Shirobokov, Dan Geltser, A. Kriger
{"title":"非平衡样本冠状动脉搭桥术后院内死亡率预测的特征选择策略","authors":"K. Shakhgeldyan, B. Geltser, V. Rublev, Basil Shirobokov, Dan Geltser, A. Kriger","doi":"10.1145/3424978.3425090","DOIUrl":null,"url":null,"abstract":"The aim of the study is to develop models of intrahospital mortality (IHM) prediction on an unbalanced sample of patients with coronary artery disease (CAD) post coronary artery bypass graft (CABG) surgery. Methods. Models for IHM prediction were built following the analysis of 866 electronic case histories based on the analysis of CAD patients, revascularized with the CABG operation. The patient cohort consisted of two groups. The first included 35 (4%) patients who died within the first 30 days after CABG, the second - 831 (96%) patients with a favorable operation outcome. We analyzed 99 factors, including the results of clinical, laboratory and instrumental studies obtained before CABG. For feature compilation, classical filtering and model selection methods were used (wrapper method). The primary drawback to applying a classical approach was the unbalanced sample as one cohort only consisted of 4% of subjects. In that case, it was not possible to apply the cross-validation procedure with three types of samples, standard quality metrics and multi-category factors. Results. Features searching approach using the multi-stage selection procedure, which combined the validation of predefined predictors, filtering methods and multifactor model development based on logistic regression, random forest (RF) and artificial neural networks (ANNs) was proposed. The models' accuracy was evaluated by a combined quality metric. RF and ANNs based models allowed not only to build more accurate forecasting tools, but also assisted in verifying five additional IHM predictors.","PeriodicalId":178822,"journal":{"name":"Proceedings of the 4th International Conference on Computer Science and Application Engineering","volume":"200 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Feature Selection Strategy for Intrahospital Mortality Prediction after Coronary Artery Bypass Graft Surgery on an Unbalanced Sample\",\"authors\":\"K. Shakhgeldyan, B. Geltser, V. Rublev, Basil Shirobokov, Dan Geltser, A. Kriger\",\"doi\":\"10.1145/3424978.3425090\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The aim of the study is to develop models of intrahospital mortality (IHM) prediction on an unbalanced sample of patients with coronary artery disease (CAD) post coronary artery bypass graft (CABG) surgery. Methods. Models for IHM prediction were built following the analysis of 866 electronic case histories based on the analysis of CAD patients, revascularized with the CABG operation. The patient cohort consisted of two groups. The first included 35 (4%) patients who died within the first 30 days after CABG, the second - 831 (96%) patients with a favorable operation outcome. We analyzed 99 factors, including the results of clinical, laboratory and instrumental studies obtained before CABG. For feature compilation, classical filtering and model selection methods were used (wrapper method). The primary drawback to applying a classical approach was the unbalanced sample as one cohort only consisted of 4% of subjects. In that case, it was not possible to apply the cross-validation procedure with three types of samples, standard quality metrics and multi-category factors. Results. Features searching approach using the multi-stage selection procedure, which combined the validation of predefined predictors, filtering methods and multifactor model development based on logistic regression, random forest (RF) and artificial neural networks (ANNs) was proposed. The models' accuracy was evaluated by a combined quality metric. RF and ANNs based models allowed not only to build more accurate forecasting tools, but also assisted in verifying five additional IHM predictors.\",\"PeriodicalId\":178822,\"journal\":{\"name\":\"Proceedings of the 4th International Conference on Computer Science and Application Engineering\",\"volume\":\"200 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-10-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 4th International Conference on Computer Science and Application Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3424978.3425090\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 4th International Conference on Computer Science and Application Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3424978.3425090","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

摘要

本研究的目的是在不平衡的冠状动脉疾病(CAD)患者冠状动脉旁路移植术(CABG)手术后建立院内死亡率(IHM)预测模型。方法。IHM预测模型是在分析866例CAD患者的电子病历基础上建立的,这些患者通过冠脉搭桥手术重建了血管。患者队列包括两组。第一组有35例(4%)患者在CABG术后30天内死亡,第二组有831例(96%)患者手术结果良好。我们分析了99个因素,包括CABG前获得的临床、实验室和仪器研究结果。特征编译采用经典滤波和模型选择方法(包装法)。应用经典方法的主要缺点是样本不平衡,因为一个队列只包括4%的受试者。在这种情况下,不可能对三种类型的样品、标准质量度量和多类别因素应用交叉验证程序。结果。提出了基于逻辑回归、随机森林(RF)和人工神经网络(ann)的多因素模型开发,结合预定义预测因子验证、滤波方法和多阶段选择过程的特征搜索方法。模型的准确性通过综合质量度量来评估。基于射频和人工神经网络的模型不仅可以建立更准确的预测工具,还可以帮助验证另外五个IHM预测器。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Feature Selection Strategy for Intrahospital Mortality Prediction after Coronary Artery Bypass Graft Surgery on an Unbalanced Sample
The aim of the study is to develop models of intrahospital mortality (IHM) prediction on an unbalanced sample of patients with coronary artery disease (CAD) post coronary artery bypass graft (CABG) surgery. Methods. Models for IHM prediction were built following the analysis of 866 electronic case histories based on the analysis of CAD patients, revascularized with the CABG operation. The patient cohort consisted of two groups. The first included 35 (4%) patients who died within the first 30 days after CABG, the second - 831 (96%) patients with a favorable operation outcome. We analyzed 99 factors, including the results of clinical, laboratory and instrumental studies obtained before CABG. For feature compilation, classical filtering and model selection methods were used (wrapper method). The primary drawback to applying a classical approach was the unbalanced sample as one cohort only consisted of 4% of subjects. In that case, it was not possible to apply the cross-validation procedure with three types of samples, standard quality metrics and multi-category factors. Results. Features searching approach using the multi-stage selection procedure, which combined the validation of predefined predictors, filtering methods and multifactor model development based on logistic regression, random forest (RF) and artificial neural networks (ANNs) was proposed. The models' accuracy was evaluated by a combined quality metric. RF and ANNs based models allowed not only to build more accurate forecasting tools, but also assisted in verifying five additional IHM predictors.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信