COVID-19: Symptoms Clustering and Severity Classification Using Machine Learning Approach

Impact factor: 0.4 | JCR: Q4 | Category: Engineering, Multidisciplinary
Nurul Fathia Mohamand Noor, Herold Sylvestro Sipail, N. Ahmad, Bayram Annanurov, N. Mohd Noor
DOI: 10.30880/ijie.2023.15.03.001 (https://doi.org/10.30880/ijie.2023.15.03.001)
Published 2023-07-31 in the International Journal of Integrated Engineering (journal article).
Citations: 0

Abstract

COVID-19: Symptoms Clustering and Severity Classification Using Machine Learning Approach
COVID-19 is an extremely contagious disease whose effects range from common-cold symptoms to more chronic illness or even death. Because the virus keeps mutating into new variants, identifying the symptoms of COVID-19 is important for containing the infection. Clustering and classification are mainstream machine learning tools in many areas of research, and in recent years they have been used to generate useful knowledge about the COVID-19 outbreak. Many researchers have shared their COVID-19 data on public databases, and many studies have been carried out on these data. However, the merit of such a dataset is unknown, and researchers need to analyse it to check its reliability. The dataset used in this work was sourced from the Kaggle website. The data were obtained through a survey of participants of various genders and ages who had been to at least ten countries. The dataset has four levels of severity based on COVID-19 symptoms, developed in accordance with World Health Organization (WHO) and Indian Ministry of Health and Family Welfare recommendations. This paper presents an inquiry into the dataset using supervised and unsupervised machine learning approaches in order to better understand it. For supervised learning, the severity group was analysed from the COVID-19 symptoms using a total of seven classifiers: K-NN, Linear SVM, Naive Bayes, Decision Tree (J48), AdaBoost, Bagging, and Stacking. For unsupervised learning, the clustering algorithms used were Simple K-Means and Expectation-Maximization. Both the supervised and the unsupervised techniques yielded relatively poor classification and clustering results. For the dataset analysed in this study, the symptoms do not appear to map correctly onto the severity levels, which raises concerns about the validity and reliability of the dataset.
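The kind of check described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state which toolkit or settings were used, only two of the nine algorithms (K-NN and K-Means) are shown, and the records below are synthetic stand-ins, not the Kaggle survey. The function names, the Hamming distance for K-NN, the six binary symptom flags, and the integer severity levels 0 to 3 are all assumptions made for illustration. The synthetic labels are drawn independently of the symptoms, so the sketch shows what a dataset with no real symptom-to-severity relationship looks like to a classifier and a clusterer.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of symptom flags on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k training records nearest to x."""
    nearest = sorted(range(len(train_X)), key=lambda i: hamming(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def kmeans(X, k, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm) with squared Euclidean distance."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(X, k)]
    assign = [0] * len(X)
    for _ in range(iters):
        # Assignment step: each record goes to its closest centroid.
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))
            for x in X
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [x for x, a in zip(X, assign) if a == c]
            if members:  # leave an empty cluster's centroid where it was
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def purity(assign, labels):
    """Fraction of records carrying the majority label of their cluster."""
    total = 0
    for c in set(assign):
        in_cluster = Counter(l for a, l in zip(assign, labels) if a == c)
        total += in_cluster.most_common(1)[0][1]
    return total / len(labels)


def demo():
    # Synthetic data (assumption): 200 records, 6 binary symptom flags,
    # severity levels 0..3 assigned at random, independent of the symptoms.
    rng = random.Random(42)
    X = [tuple(rng.randint(0, 1) for _ in range(6)) for _ in range(200)]
    y = [rng.randrange(4) for _ in range(200)]

    train_X, train_y, test_X, test_y = X[:150], y[:150], X[150:], y[150:]
    hits = sum(knn_predict(train_X, train_y, x) == t for x, t in zip(test_X, test_y))
    print("K-NN hold-out accuracy:", hits / len(test_y))
    print("K-Means purity vs. severity:", purity(kmeans(X, 4), y))


if __name__ == "__main__":
    demo()
```

With four equally likely severity levels, a hold-out accuracy near 0.25 and a low purity are what chance alone would give. Scores close to those baselines on a real dataset point to a weak link between the recorded symptoms and the assigned severity labels, which is the concern the abstract raises.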
Source journal
International Journal of Integrated Engineering (Engineering, Multidisciplinary)
CiteScore: 1.40
Self-citation rate: 0.00%
Articles published: 57
About the journal: The International Journal of Integrated Engineering (IJIE) is a single-blind peer-reviewed journal that has published three times a year since 2009. The journal focuses on three fields. Civil and Environmental Engineering: original contributions on civil and environmental engineering practice are published under this category and form the nucleus of the journal's contents. The journal publishes a wide range of research and application papers that describe laboratory and numerical investigations or report on full-scale projects. Electrical and Electronic Engineering: the journal serves as an international medium for original papers on electrical and electronic engineering and aims to present important results in this field to the international community, whether in the form of research, development, application, or design. Mechanical, Materials and Manufacturing Engineering: the journal is a platform for publishing and disseminating original work that contributes to the understanding of the main disciplines underpinning mechanical, materials, and manufacturing engineering. Original contributions giving insight into engineering practice in these areas form the core of the journal's contents.