Automatic Detection of Severely and Mildly Infected COVID-19 Patients with Supervised Machine Learning Models

IF 5.6 | Zone 4, Medicine | Q1 ENGINEERING, BIOMEDICAL

IRBM | Pub Date: 2023-02-01 | DOI: 10.1016/j.irbm.2022.05.006

M.T. Huyut

IRBM, vol. 44, no. 1, Article 100725
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9158375/pdf/
Publisher page: https://www.sciencedirect.com/science/article/pii/S1959031822000598

Citations: 20

Abstract

Automatic Detection of Severely and Mildly Infected COVID-19 Patients with Supervised Machine Learning Models

Objectives

Early detection of the prognosis of COVID-19 can partially relieve the intense pressure on health services and the associated loss of workforce. The primary purpose of this article is to determine a feature dataset, consisting of routine blood values (RBV) and demographic data, that affects the prognosis of COVID-19. The second purpose is to apply this feature dataset to supervised machine learning (ML) models in order to identify severely and mildly infected COVID-19 patients at the time of admission.

Material and methods

The study sample consists of severely (n = 192) and mildly (n = 4010) infected patients hospitalized with a diagnosis of COVID-19 between March and September 2021. The RBV data measured at the time of admission, together with the age and gender of these patients, were analyzed retrospectively. Features were selected using the minimum redundancy maximum relevance (MRMR) method, principal component analysis, and forward multiple logistic regression analysis. The selected features were compared statistically between mildly and severely infected patients. The performance of various supervised ML models in identifying severely and mildly infected patients was then compared using the feature set.
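The redundancy-versus-relevance idea behind MRMR can be illustrated with a minimal sketch. This is not the study's implementation: it assumes absolute Pearson correlation as a stand-in for both relevance and redundancy (MRMR is usually formulated with mutual information), and the feature names `marker_a`, `marker_a_dup`, and `marker_b` and all values are hypothetical toy data.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def mrmr_select(features, target, k):
    """Greedy selection: maximize relevance to target minus mean redundancy
    with already-selected features (correlation proxy, an assumption)."""
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    remaining = sorted(features)  # sorted so ties break deterministically

    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 patients, 1 = severe, 0 = mild.
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "marker_a": [1, 2, 3, 4, 5, 6, 7, 8],      # strongly related to the label
    "marker_a_dup": [1, 2, 3, 5, 4, 6, 7, 8],  # near-copy of marker_a
    "marker_b": [0, 1, 0, 1, 1, 0, 1, 1],      # weaker, but less redundant
}
chosen = mrmr_select(features, target, k=2)
print(chosen)
```

In this toy case the second pick is the weaker but less redundant `marker_b` rather than the near-duplicate `marker_a_dup`, which a ranking by relevance alone would have preferred.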

Results

Twenty-eight RBV parameters and the age variable were selected as the feature dataset. The effect of these features on the prognosis of the disease has been clinically proven. The ML models with the highest overall accuracy in identifying the patient groups were, in order: locally weighted learning (LWL), 97.86%; K-star (K*), 96.31%; Naive Bayes (NB), 95.36%; and k-nearest neighbor (KNN), 94.05%. The models with the highest area under the receiver operating characteristic curve (AUC) in identifying the patient groups were, in order: LWL, 0.95; K*, 0.91; NB, 0.85; and KNN, 0.75.
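Because the two groups are very unequal in size (192 severe versus 4010 mild), it can help to read the accuracy figures next to a majority-class baseline. The sketch below computes that baseline from the reported group sizes and shows a plain pairwise definition of AUC. It assumes accuracy is measured on data with the same class ratio as the full sample, and the example scores are hypothetical.

```python
def majority_baseline_accuracy(n_severe, n_mild):
    """Accuracy of a classifier that always predicts the larger group."""
    return max(n_severe, n_mild) / (n_severe + n_mild)


def auc_from_scores(pos_scores, neg_scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Group sizes reported in the abstract.
baseline = majority_baseline_accuracy(n_severe=192, n_mild=4010)
print(f"majority baseline accuracy: {baseline:.4f}")

# A model that gives every patient the same score has AUC 0.5,
# even though it can still reach the baseline accuracy above.
constant_auc = auc_from_scores([0.0] * 3, [0.0] * 5)

# Hypothetical scores for 3 severe and 4 mild patients.
toy_auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1])
print(constant_auc, toy_auc)
```

Under that assumption the baseline is about 95.43%, so accuracy alone separates the models only weakly, while AUC, which does not depend on the class ratio in the same way, spreads them from 0.75 to 0.95.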

Conclusion

The findings of this article give healthcare professionals significant motivation to identify severely and mildly infected COVID-19 patients at admission.

Source journal

IRBM (ENGINEERING, BIOMEDICAL)

CiteScore: 10.30
Self-citation rate: 4.20%
Review time: 57 days

About the journal: IRBM is the journal of the AGBM (Alliance for engineering in Biology and Medicine / Alliance pour le génie biologique et médical), the SFGBM (BioMedical Engineering French Society / Société française de génie biologique médical), and the AFIB (French Association of Biomedical Engineers / Association française des ingénieurs biomédicaux). As a vehicle of information and knowledge in the field of biomedical technologies, IRBM is devoted to fundamental as well as clinical research. Biomedical engineering and the use of new technologies are the cornerstones of IRBM, providing authors and users with the latest information. Its six issues per year offer reviews (state of the art and current knowledge), original articles directed at fundamental research, and articles focusing on biomedical engineering. All articles are submitted to peer reviewers, who act as guarantors of IRBM's scientific and medical content. The field covered by IRBM includes all disciplines of biomedical engineering. The papers published therefore include those covering technological and methodological developments in:

- Physiological and biological signal processing (EEG, MEG, ECG…)
- Medical image processing
- Biomechanics
- Biomaterials
- Medical physics
- Biophysics
- Physiological and biological sensors
- Information technologies in healthcare
- Disability research
- Computational physiology
- …