Mengsha Fu, Jiabin Yuan, Menglin Lu, Pengfei Hong, M. Zeng
An Ensemble Machine Learning Model for the Early Detection of Sepsis from Clinical Data
2019 Computing in Cardiology Conference (CinC), published 2019-12-30
DOI: 10.22489/cinc.2019.317 (https://doi.org/10.22489/cinc.2019.317)
Citations: 8
Abstract
Sepsis is a life-threatening disease with high mortality and high treatment costs. To improve patient outcomes, it is important to identify patients at risk of sepsis at an early stage. The PhysioNet/Computing in Cardiology Challenge 2019 focused on predicting sepsis six hours before clinical diagnosis, using the latest Sepsis-3 definition. Records from a total of 40,336 ICU patients were provided as public training data, and a hidden test dataset was used for evaluation. An ensemble model combining boosting and bagging tree models (LightGBM, XGBoost, and random forest) was designed to predict sepsis from each patient's hourly records. We compared the evaluation metrics of the ensemble model with those of each single model on a selected internal test set. Offline, the best performance was an AUC of 0.792 and an accuracy (ACC) of 0.727. Finally, the proposed model was evaluated on the full test sets and received an official utility score, as defined by the organizers, of 0.087, ranking 75/105 (team name: cinc sepsis pass), whereas the single LightGBM model received a utility score of only -0.036. The ensemble model used the preprocessed data and achieved better performance than a single tree-based model.
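The abstract does not say how the three tree models' outputs were combined. As a minimal sketch, assuming a simple soft-voting rule (averaging each model's predicted probability per hourly record), the combination step and the two offline metrics (ACC and AUC) could look like the following. The function names, the 0.5 threshold, and all numbers are hypothetical toy values for illustration, not data or results from the paper.

```python
from statistics import mean


def soft_vote(prob_lists):
    """Average the per-model probabilities for each sample.

    ASSUMPTION: plain averaging; the paper does not state its combination rule.
    """
    return [mean(ps) for ps in zip(*prob_lists)]


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)


def auc(y_true, y_prob):
    """AUC as the chance that a random positive outranks a random negative.

    Ties count as half a win. Quadratic in sample count, fine for a toy example.
    """
    pos = [p for p, y in zip(y_prob, y_true) if y == 1]
    neg = [p for p, y in zip(y_prob, y_true) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and per-model probabilities (stand-ins for the outputs of
# LightGBM, XGBoost, and random forest classifiers); NOT from the paper.
y_true = [0, 0, 1, 1, 0, 1]
p_lgbm = [0.2, 0.6, 0.7, 0.4, 0.1, 0.9]
p_xgb = [0.3, 0.4, 0.6, 0.6, 0.2, 0.8]
p_rf = [0.1, 0.3, 0.8, 0.6, 0.3, 0.7]

p_ens = soft_vote([p_lgbm, p_xgb, p_rf])

if __name__ == "__main__":
    for name, probs in [("lightgbm", p_lgbm), ("ensemble", p_ens)]:
        print(name, "ACC:", round(accuracy(y_true, probs), 3),
              "AUC:", round(auc(y_true, probs), 3))
```

On these toy numbers, averaging corrects the two records the single model misclassifies, which is the general motivation for ensembling. Note that the challenge's official utility score is a separate, time-dependent metric defined by the organizers and is not reproduced here.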