An efficient synthetic minority oversampling technique-based ensemble learning model to detect COVID-19 severity

Bulletin of Electrical Engineering and Informatics. Pub Date: 2024-06-01. DOI: 10.11591/eei.v13i3.6774

Smriti Mishra, Ranjan Kumar, S. K. Tiwari, Priya Ranjan


Citations: 0

Abstract



The COVID-19 pandemic has highlighted the importance of accurately predicting disease severity to ensure timely intervention and effective allocation of healthcare resources, which can ultimately improve patient outcomes. This study aims to develop an efficient machine learning (ML) model based on patient demographic and clinical data. It uses advanced feature engineering techniques to reduce the dimensionality of the dataset and addresses the highly imbalanced class distribution with the synthetic minority oversampling technique (SMOTE). To develop the proposed model, the study employs several ensemble learning models: XGBoost, Random Forest, AdaBoost, a voting ensemble, an enhanced-weighted voting ensemble, and stack-based ensembles with support vector machine (SVM) and Gaussian Naïve Bayes as meta-learners. The results indicate that the proposed model outperformed the top-performing models reported in previous studies. It achieved an accuracy of 0.978, sensitivity of 1.0, precision of 0.875, F1-score of 0.934, and receiver operating characteristic area under the curve (ROC-AUC) of 0.965. The study identified several features that significantly correlated with COVID-19 severity, including respiratory rate (breaths per minute), C-reactive protein, age, and total leukocyte count (TLC). The proposed approach presents a promising method for accurate COVID-19 severity prediction, which may prove valuable in assisting healthcare providers in making informed decisions about patient care.
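The abstract does not describe how SMOTE was configured, so the following is only a minimal sketch of the general idea behind the technique, not the authors' implementation. Each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbors. The function name `smote`, its parameters, and the toy feature values are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Illustrative sketch only; `minority` is a list of numeric feature vectors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the remaining minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbors at random.
        neighbor = rng.choice(others[:k])
        # Interpolate: the new point lies on the segment from base to neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical toy data (two made-up features per patient, not from the paper).
severe = [[30.0, 80.0], [32.0, 95.0], [28.0, 70.0], [35.0, 110.0]]
mild_count = 12  # assumed size of the majority class

# Oversample the minority class until both classes have the same size.
new_points = smote(severe, n_new=mild_count - len(severe), k=3)
balanced_severe = severe + new_points
print(len(balanced_severe))
```

Oversampling of this kind is normally applied to the training split only, so that evaluation data contains no synthetic samples. In practice a maintained implementation such as the `SMOTE` class in the imbalanced-learn package would likely be preferable to a hand-written version.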

