Title: Early Prediction of Patient Mortality Based on Routine Laboratory Tests and Predictive Models in Critically Ill Patients
Authors: Sven Van Poucke, Ana Kovačević, M. Vukicevic
Published in: Advances in data mining. Industrial Conference on Data Mining, volume 23 1
Publication date: 2018-08-22
Publication type: Journal Article
DOI: https://doi.org/10.5772/INTECHOPEN.76988
Citations: 2
Abstract
We propose a method for quantitatively analyzing the predictive power of laboratory tests and for detecting mortality risk early, using predictive models and feature selection techniques. The method supports automatic feature selection, model selection, and evaluation of predictive models. Experimental evaluation was conducted on patients with renal failure admitted to ICUs (medical intensive care, surgical intensive care, cardiac, and cardiac surgery recovery units) at Boston’s Beth Israel Deaconess Medical Center. Data were extracted from the Multiparameter Intelligent Monitoring in Intensive Care III (MIMIC-III) database. We built and evaluated different single (e.g., logistic regression) and ensemble (e.g., random forest) learning methods. Results revealed high predictive accuracy (area under the precision-recall curve (AUPRC) values >86%) from day four onward, with acceptable results on the second (>81%) and third (>85%) days. Random forests appear to provide the best predictive accuracy. The feature selection techniques Gini and ReliefF scored best in most cases. Lactate, white blood cells, sodium, anion gap, chloride, bicarbonate, creatinine, urea nitrogen, potassium, glucose, INR, hemoglobin, phosphate, total bilirubin, and base excess were the most predictive of hospital mortality. Ensemble learning methods can predict hospital mortality with high accuracy from laboratory tests alone, and they provide a ranking of the tests by predictive priority.
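The two quantities the abstract relies on, a Gini-based feature ranking and the area under the precision-recall curve, can be illustrated with a minimal, standard-library-only sketch. This is not the authors' pipeline: the data are synthetic, the feature names and distributions (`lactate`, `sodium`, `glucose`) are invented for illustration, the Gini score is simplified to the impurity decrease of a single best threshold split per feature, and each raw feature value is used directly as a risk score instead of a trained model's output.

```python
import random


def gini_impurity(labels):
    """Gini impurity of binary labels (0 = survived, 1 = died)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_gini_gain(values, labels):
    """Largest impurity decrease from one threshold split on a feature."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    parent = gini_impurity(labels)
    total_pos = sum(labels)
    best, left_pos = 0.0, 0
    for i in range(1, n):
        left_pos += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between two equal values
        p_left = left_pos / i
        p_right = (total_pos - left_pos) / (n - i)
        children = (i / n) * 2 * p_left * (1 - p_left) \
            + ((n - i) / n) * 2 * p_right * (1 - p_right)
        best = max(best, parent - children)
    return best


def average_precision(scores, labels):
    """Step-wise area under the precision-recall curve.

    Samples are ranked by descending score; tied scores keep input order.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(labels)
    hits, ap = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            ap += hits / rank  # precision at each recalled positive
    return ap / total_pos if total_pos else 0.0


# Synthetic cohort (assumption): only "lactate" carries signal here.
rng = random.Random(0)
labels = [1 if rng.random() < 0.3 else 0 for _ in range(400)]
features = {
    "lactate": [rng.gauss(4.0 if y else 2.0, 1.0) for y in labels],
    "sodium": [rng.gauss(140.0, 4.0) for _ in labels],
    "glucose": [rng.gauss(120.0, 25.0) for _ in labels],
}

# Rank features by their single-split Gini gain, highest first.
ranking = sorted(
    ((name, best_gini_gain(vals, labels)) for name, vals in features.items()),
    key=lambda item: item[1],
    reverse=True,
)

# AUPRC obtained when each raw feature value is used as the risk score.
ap_by_feature = {
    name: average_precision(vals, labels) for name, vals in features.items()
}

for name, gain in ranking:
    print(f"{name}: gini gain={gain:.3f}, AUPRC={ap_by_feature[name]:.3f}")
```

In a random forest, the Gini importance of a feature is accumulated over many such splits across all trees, so the single-split score above is only a rough stand-in. AUPRC is a natural metric for mortality prediction because the positive class is usually the minority, and a random scorer's AUPRC sits near the event prevalence, not near 0.5.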