A Time Series Clinical Data-driven Preprocessing Approach to Early Sepsis Diagnosis
Authors: SadikAref SadikAref, Lachin Fernando, Sindhu Ghanta
Published in: 2023 IEEE Conference on Artificial Intelligence (CAI), 2023-06-01
DOI: 10.1109/CAI54212.2023.00070 (https://doi.org/10.1109/CAI54212.2023.00070)
Citations: 0
Abstract
Sepsis, which causes an estimated 11 million deaths per year, often goes undiagnosed because of its heterogeneity and the lack of a single diagnostic test [3]. Every hour of delay in sepsis treatment increases the mortality rate by 4-8%, making early diagnosis and medical intervention critical to saving lives [1]. Although several machine learning models have been developed using clinical data, their performance has been unsatisfactory, with low sensitivity scores leading to high mortality. To overcome this, a unique segmentation method is applied to a large time series clinical dataset of 40,336 patients (2,932 sepsis and 37,404 non-sepsis cases) comprising 41 variables that cover laboratory values, vital signs, and demographic data. Multiple experiments are conducted using different machine learning algorithms, including K-Nearest Neighbors, Random Forest, Multi-Layer Perceptron, and Gradient Boosting. The findings reveal that the XGB algorithm with a six-hour early prediction outperforms the other models, achieving a recall of 0.98 and an AUROC of 0.98 in predicting sepsis onset. Additionally, using data from 12 hours before onset yields a recall of 0.86 and an AUROC of 0.95. These results demonstrate the potential of machine learning algorithms for early sepsis detection and highlight the importance of time series data segmentation and feature engineering for improved model performance.
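The abstract does not describe the segmentation method in detail, so the following is only a minimal sketch of one plausible reading: for each patient, take a fixed-length window of hourly records that ends a chosen number of hours before sepsis onset, then score predictions with recall (sensitivity). The function names, the `horizon_hours` and `window_hours` parameters, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Optional, Sequence


def segment_before_onset(
    hourly_records: Sequence[dict],
    onset_hour: int,
    horizon_hours: int = 6,
    window_hours: int = 6,
) -> Optional[List[dict]]:
    """Return the window of hourly records ending `horizon_hours` before onset.

    Hypothetical helper: returns None when the stay is too short to
    supply a full window.
    """
    end = onset_hour - horizon_hours   # exclusive end index of the window
    start = end - window_hours         # inclusive start index of the window
    if start < 0 or end > len(hourly_records):
        return None
    return list(hourly_records[start:end])


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Recall (sensitivity) = TP / (TP + FN); 0.0 if there are no positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy stay (made-up values): 20 hourly rows, sepsis onset at hour 15.
stay = [{"hour": h, "heart_rate": 80 + h} for h in range(20)]

# Six-hour early prediction: the window covers hours 3 through 8.
window = segment_before_onset(stay, onset_hour=15, horizon_hours=6, window_hours=6)
print([row["hour"] for row in window])

# Recall on a toy set of labels: 2 true positives, 1 false negative.
print(recall([1, 1, 0, 1], [1, 0, 0, 1]))
```

Under this reading, moving from a six-hour to a 12-hour horizon only changes `horizon_hours`; the window simply ends earlier in the stay, which is one way to see why less informative data is available for the longer lead time.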