Machine Learning-Based Clinical Decision Support System for Automatic Diagnosis of COVID-19 Based on Clinical Data

Q4 Medicine

Journal of Biostatistics and Epidemiology Pub Date : 2022-08-29 DOI:10.18502/jbe.v8i1.10407

M. Afrash, L. Erfannia, Morteza Amrae, N. Mehrabi, Saeed Jelvay, Raoof Nopour, M. Shanbehzadeh


Citations: 2

Abstract


Machine Learning-Based Clinical Decision Support System for Automatic Diagnosis of COVID-19 based on Clinical Data

Introduction: Accurate, real-time detection and effective prognosis of COVID-19 are necessary to deliver the best possible care to patients and, in turn, to reduce the pressure on the healthcare industry. This paper therefore aims to present an intelligent algorithm for selecting the best features from the dataset, to develop machine learning (ML) models that predict COVID-19, and to select the best-performing algorithm.

Methods: In this developmental study, clinical data on 1703 COVID-19 and non-COVID-19 patients were drawn from a single-center registry covering February 9, 2020, to December 20, 2020. The Minimum Redundancy Maximum Relevance (mRMR) feature selection algorithm identified the most relevant variables. The selected features were then fed into several data mining methods: K-Nearest Neighbors, AdaBoost Classifier, Decision Tree, HistGradient Boosting Classifier, and Support Vector Machine. These algorithms were evaluated and compared using 10-fold cross-validation and six performance metrics, and the best model was then implemented.

Results: Of the 34 included features, 11 were selected as essential. The AdaBoost classifier performed best, with mean accuracy = 92.9%, mean specificity = 89.3%, mean sensitivity = 94.2%, mean F-measure = 91.6%, mean kappa = 94.3%, and mean ROC = 92.1%.

Conclusion: The empirical results show that the AdaBoost model outperformed the other classification models. It was used to develop our Clinical Decision Support System (CDSS) interface, which discriminates positive COVID-19 cases from negative ones.
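To make the evaluation step concrete, the following is a minimal sketch of how five of the six reported metrics could be computed from per-fold confusion-matrix counts and then averaged across folds. It is not the authors' code. The names `ConfusionMatrix`, `binary_metrics`, and `mean_over_folds` are hypothetical, the kappa statistic is assumed to mean Cohen's kappa, and the ROC measure is left out because it needs predicted scores rather than counts.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ConfusionMatrix:
    """Counts for one fold of a binary classifier (positive = COVID-19).

    Hypothetical container, not taken from the paper.
    """
    tp: int  # COVID-19 cases predicted positive
    fp: int  # non-COVID-19 cases predicted positive
    tn: int  # non-COVID-19 cases predicted negative
    fn: int  # COVID-19 cases predicted negative


def binary_metrics(cm: ConfusionMatrix) -> Dict[str, float]:
    """Compute five count-based metrics for one fold.

    Assumes every denominator is non-zero, i.e. the fold contains both
    classes and the model predicts both classes at least once.
    """
    n = cm.tp + cm.fp + cm.tn + cm.fn
    accuracy = (cm.tp + cm.tn) / n
    sensitivity = cm.tp / (cm.tp + cm.fn)  # recall on positive cases
    specificity = cm.tn / (cm.tn + cm.fp)  # recall on negative cases
    precision = cm.tp / (cm.tp + cm.fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa (assumed meaning of the kappa metric): agreement
    # beyond what the marginal class frequencies would give by chance.
    chance_pos = ((cm.tp + cm.fp) / n) * ((cm.tp + cm.fn) / n)
    chance_neg = ((cm.tn + cm.fn) / n) * ((cm.tn + cm.fp) / n)
    expected = chance_pos + chance_neg
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "kappa": kappa,
    }


def mean_over_folds(folds: List[ConfusionMatrix]) -> Dict[str, float]:
    """Average each metric across cross-validation folds."""
    per_fold = [binary_metrics(cm) for cm in folds]
    return {
        name: sum(m[name] for m in per_fold) / len(per_fold)
        for name in per_fold[0]
    }
```

With 10-fold cross-validation, `mean_over_folds` would receive ten `ConfusionMatrix` objects, one per held-out fold, and return the kind of mean values quoted in the Results.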


Source journal: Journal of Biostatistics and Epidemiology (Medicine, Epidemiology)

CiteScore: 0.80

Self-citation rate: 0.00%

Publication volume: not listed

Review time: 12 weeks