Early Prediction of Sepsis Using Multi-Feature Fusion Based XGBoost Learning and Bayesian Optimization

Meicheng Yang, Xingyao Wang, Hongxiang Gao, Yuwen Li, Xing Liu, Jianqing Li, Chengyu Liu
2019 Computing in Cardiology Conference (CinC), published 2019-12-30
DOI: 10.22489/cinc.2019.020 (https://doi.org/10.22489/cinc.2019.020)
Citations: 12

Abstract

Early prediction of sepsis is critical in clinical practice, since each hour of delayed treatment has been associated with an increase in mortality due to irreversible organ damage. This study aimed to develop an algorithm that accurately predicts the onset of sepsis six hours in advance. First, we selected 37 available variables as features after data pre-processing and then extracted three further kinds of features: 62 missing value features, 8 scoring quantified features, and 61 time series features. Next, a multi-feature fusion based XGBoost classification model was developed and further improved with a Bayesian optimizer and an ensemble learning framework. The analysis was performed on data from the PhysioNet/Computing in Cardiology Challenge 2019, which provided a publicly available sepsis dataset sourced from 40,336 ICU patients. Finally, after searching for an optimized predicted risk threshold of 0.522 through the official submissions, our team “SailOcean” applied the developed model to the full hidden test set of 24,819 ICU patients from three hospital systems and obtained a final normalized utility score (U-Score), as defined by the organizers, of 0.364, which was the highest unofficial score.
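As a rough illustration of two steps named in the abstract, the sketch below builds simple missing value features for one hourly variable and turns predicted risks into binary alarms using the reported threshold of 0.522. The abstract does not say how the 62 missing value features were actually defined, so the function names, the "measured flag plus hours since last measurement" construction, and the choice of `>=` at the threshold are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Optional, Sequence, Tuple

RISK_THRESHOLD = 0.522  # predicted risk threshold reported in the abstract


def missing_value_features(
    series: Sequence[Optional[float]],
) -> List[Tuple[int, int]]:
    """Return (is_measured, hours_since_last_measurement) for each hour.

    Hypothetical construction: the source does not specify these features.
    hours_since_last_measurement is -1 until the first measurement appears.
    """
    features = []
    hours_since = -1
    for value in series:
        if value is not None:
            hours_since = 0          # measured this hour
        elif hours_since >= 0:
            hours_since += 1         # one more hour without a new value
        features.append((int(value is not None), hours_since))
    return features


def to_alarm(
    risks: Sequence[float], threshold: float = RISK_THRESHOLD
) -> List[int]:
    """Convert hourly predicted risks into 0/1 sepsis predictions.

    Using >= at the threshold is an assumption; the source does not say.
    """
    return [int(risk >= threshold) for risk in risks]


# Example: an hourly lab value that is only measured occasionally.
lactate = [None, 1.8, None, None, 2.4]
print(missing_value_features(lactate))
print(to_alarm([0.10, 0.50, 0.522, 0.80]))
```

In a full pipeline these per-variable features would be concatenated with the raw variables, the scoring features, and the time series features before being passed to the classifier; that fusion step is not shown here.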