Early Prediction of Sepsis Using Multi-Feature Fusion Based XGBoost Learning and Bayesian Optimization
Meicheng Yang, Xingyao Wang, Hongxiang Gao, Yuwen Li, Xing Liu, Jianqing Li, Chengyu Liu
2019 Computing in Cardiology Conference (CinC), published 2019-12-30
DOI: https://doi.org/10.22489/cinc.2019.020
Citations: 12
Abstract
Early prediction of sepsis is critical in clinical practice, since each hour of delayed treatment has been associated with an increase in mortality due to irreversible organ damage. This study aimed to develop an algorithm for accurately predicting the onset of sepsis within the following six hours. First, we selected 37 available variables as features after data pre-processing, and then extracted three further kinds of features: 62 missing-value features, 8 scoring-based quantified features, and 61 time-series features. A multi-feature fusion based XGBoost classification model was then developed and further improved with a Bayesian optimizer and an ensemble learning framework. The analysis was performed on the PhysioNet/Computing in Cardiology Challenge 2019 data, a publicly available sepsis dataset sourced from 40,336 ICU patients. Finally, after finding an optimized predicted-risk threshold of 0.522 through the official submissions, our team “SailOcean” applied the developed model to the full hidden test set of 24,819 ICU patients from three hospital systems and obtained a final normalized utility score (U-Score), as defined by the organizers, of 0.364, which was the highest unofficial score.
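The abstract does not spell out how the missing-value features or the ensemble decision are computed, so the following is only a minimal sketch under stated assumptions. It assumes hourly records in which an unmeasured value is `None`, derives three hypothetical missing-value features per hour (a measured flag, a running measurement count, and hours since the last measurement), and assumes the ensemble averages the per-model risks before applying the 0.522 threshold reported above. The function names, the feature definitions, and the averaging rule are illustrative, not the authors' implementation.

```python
from typing import Dict, List, Optional, Sequence

# Risk threshold reported in the abstract.
RISK_THRESHOLD = 0.522


def missing_value_features(series: Sequence[Optional[float]]) -> List[Dict[str, Optional[int]]]:
    """Derive hypothetical missing-value features for one variable.

    `series` holds one value per hour; None marks an hour with no measurement.
    The three features below are assumptions made for illustration.
    """
    features = []
    count_so_far = 0
    hours_since_last = None  # stays None until the first measurement is seen
    for value in series:
        if value is not None:
            count_so_far += 1
            hours_since_last = 0
        elif hours_since_last is not None:
            hours_since_last += 1
        features.append({
            "measured": int(value is not None),
            "count_so_far": count_so_far,
            "hours_since_last": hours_since_last,
        })
    return features


def ensemble_predict(model_risks: Sequence[Sequence[float]],
                     threshold: float = RISK_THRESHOLD) -> List[int]:
    """Average hourly risks across models (assumed rule), then threshold.

    `model_risks[m][t]` is the risk predicted by model m at hour t.
    Returns 1 for hours flagged as sepsis risk, else 0.
    """
    n_models = len(model_risks)
    n_hours = len(model_risks[0])
    mean_risk = [sum(m[t] for m in model_risks) / n_models for t in range(n_hours)]
    return [int(risk >= threshold) for risk in mean_risk]


if __name__ == "__main__":
    heart_rate = [None, 80.0, None, None, 82.0]
    for hour, row in enumerate(missing_value_features(heart_rate)):
        print(hour, row)
    print(ensemble_predict([[0.2, 0.6, 0.5], [0.4, 0.5, 0.5]]))
```

In a full pipeline, features of this kind would be concatenated with the raw variables, the scoring features, and the time-series features before being passed to the classifier; whether the flag fires at or strictly above the threshold is also an assumption here.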