A Time Series Clinical Data-driven Preprocessing Approach to Early Sepsis Diagnosis

SadikAref SadikAref, Lachin Fernando, Sindhu Ghanta
Published in: 2023 IEEE Conference on Artificial Intelligence (CAI), 2023-06-01
DOI: 10.1109/CAI54212.2023.00070 (https://doi.org/10.1109/CAI54212.2023.00070)

Abstract

Sepsis, which causes an estimated 11 million deaths per year, often goes undiagnosed because of its heterogeneity and the lack of a single diagnostic test [3]. Every hour of delay in sepsis treatment increases the mortality rate by 4-8%, making early diagnosis and medical intervention critical to saving lives [1]. Although several machine learning models have been developed using clinical data, their performance has been unsatisfactory, with low sensitivity scores leading to high mortality. To overcome this, a unique segmentation method is applied to a large time series clinical dataset of 40,336 patients (2,932 sepsis and 37,404 non-sepsis cases) comprising 41 variables that cover laboratory values, vital signs, and demographic data. Multiple experiments are conducted using machine learning algorithms including K-Nearest Neighbors, Random Forest, Multi-Layer Perceptron, and Gradient Boosting. The findings reveal that the XGB algorithm with a six-hour early prediction outperforms the other models, achieving a recall of 0.98 and an AUROC of 0.98 in predicting sepsis onset. Additionally, using data from 12 hours before onset yields a recall of 0.86 and an AUROC of 0.95. These results demonstrate the potential of machine learning algorithms for early sepsis detection and highlight the importance of time series data segmentation and feature engineering for improved model performance.
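The abstract does not spell out the segmentation method, so the following is only a minimal sketch of one plausible reading: take a fixed-length window of hourly records that ends a given number of hours before sepsis onset, then score predictions with recall and AUROC. The function names, the `window` parameter, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Optional, Sequence


def segment_before_onset(
    hourly_rows: Sequence[Sequence[float]],
    onset_hour: int,
    horizon: int,
    window: int,
) -> Optional[List[Sequence[float]]]:
    """Return the `window` hourly rows that end `horizon` hours before onset.

    Hypothetical helper: returns None when the stay is too short to
    supply a full window before the prediction point.
    """
    end = onset_hour - horizon  # exclusive end index of the window
    start = end - window
    if start < 0 or end > len(hourly_rows):
        return None
    return list(hourly_rows[start:end])


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true sepsis cases (label 1) that were predicted as 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as 0.5. Pairwise form, quadratic in sample count, so it
    is meant for small illustrative inputs only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy stay: 24 hourly rows with a single made-up feature per hour.
rows = [[float(h)] for h in range(24)]
six_hour_segment = segment_before_onset(rows, onset_hour=20, horizon=6, window=4)
```

With the reported counts, sepsis cases make up about 7.3% of patients (2,932 of 40,336), which is one reason recall and AUROC are more informative here than plain accuracy.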