Comparison of Machine Learning Algorithms for Sepsis Detection

Vol 4, Issue 1. Pub Date: 2022-02-28. DOI: 10.33411/ijist/2022040113

Asad Ullah, Huma Qayyum, Farman Hassan, Muhammad Khateeb Khan, Auliya Rahman



Abstract

Sepsis is a highly fatal disease that causes many casualties worldwide; about 270,000 people die of sepsis annually. Early detection could therefore help limit its toll and would be a great relief to the families of sepsis patients. Various researchers have worked on sepsis detection and prediction, but the need for an improved detection model remains. We compared several machine learning algorithms for sepsis detection using the dataset publicly available to researchers at Physionet.org. The dataset contains many empty (null) values, so we applied backward filling and forward filling, and we calculated missing values of mean arterial pressure (MAP) using equation (1), which gives more precise results. We divided the 40,336 files of datasets A and B into an 80% training set and a 20% testing set. We applied the algorithms twice: once using both the vital signs and the clinical values of the patients, and once using the vital signs only. Using vital signs only, the training accuracy of KNN, Logistic Regression, Random Forest, MLP, and Decision Trees was 0.992, 0.999, 0.981, 0.981, and 0.981 respectively, while the testing accuracy of the same algorithms was 0.987, 0.980, 0.983, 0.981, and 0.981 respectively. For sepsis label 0, the precision of KNN, Random Forest, Decision Trees, Logistic Regression, and MLP was 0.99, 0.98, 0.98, 0.98, and 0.98 respectively, while the recall of each of these algorithms was 1.00. The comparison showed that KNN led the other algorithms in accuracy, precision, and recall.
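The preprocessing steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code. Equation (1) is not reproduced on this page, so the sketch assumes the common clinical approximation MAP = (SBP + 2·DBP) / 3. The order of operations (estimate MAP first, then forward fill, then backward fill), the function names, and the fixed random seed are likewise assumptions made for the example.

```python
import random
from typing import List, Optional, Sequence, Tuple

Series = List[Optional[float]]  # one hourly measurement column; None marks a gap


def forward_fill(values: Series) -> Series:
    """Replace each gap with the most recent earlier observation, if any."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def backward_fill(values: Series) -> Series:
    """Replace each gap with the next later observation, if any."""
    return forward_fill(values[::-1])[::-1]


def estimate_map(sbp: Optional[float], dbp: Optional[float]) -> Optional[float]:
    """ASSUMED stand-in for equation (1): MAP = (SBP + 2 * DBP) / 3."""
    if sbp is None or dbp is None:
        return None
    return (sbp + 2.0 * dbp) / 3.0


def impute_patient(sbp: Series, dbp: Series, map_: Series) -> Tuple[Series, Series, Series]:
    """Estimate missing MAP from same-hour pressures, then fill remaining gaps."""
    map_ = [m if m is not None else estimate_map(s, d)
            for s, d, m in zip(sbp, dbp, map_)]
    # Forward fill first, then backward fill to cover gaps at the start of a record.
    sbp, dbp, map_ = (backward_fill(forward_fill(col)) for col in (sbp, dbp, map_))
    return sbp, dbp, map_


def split_patients(ids: Sequence[int], train_fraction: float = 0.8,
                   seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle patient-file ids and split them into training and testing lists."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # fixed seed is an illustrative choice
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    sbp, dbp, map_ = impute_patient(
        [None, 120.0, None, 110.0],
        [None, 80.0, None, 70.0],
        [None, None, None, 85.0],
    )
    print(sbp, dbp, map_)
    train_ids, test_ids = split_patients(range(40336))
    print(len(train_ids), len(test_ids))
```

Splitting by patient file rather than by individual hourly row keeps all rows of one patient in the same set, which matches the abstract's description of dividing the 40,336 files.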


