An Ensemble Machine Learning Model to Investigate the Screening System for Identification of Potential Patients with COVID-19 in Sudan

Ali A. Adam, Nihad A. A. Elhag, Fakhreldeen Abbas Saeed, Mohamed Yagob, Fayha Mohamed, N. Eassa, Hanaa S. Abdalaziz, M. Ahmed, S. Babiker
Published in: 2021 International Congress of Advanced Technology and Engineering (ICOTEN), 2021-07-04
DOI: 10.1109/ICOTEN52080.2021.9493517 (https://doi.org/10.1109/ICOTEN52080.2021.9493517)
Citations: 0

Abstract

This study aims to design an ensemble machine learning model that serves as a screening system for predicting potential COVID-19 infection. Based on specific parameters, it draws on an online survey completed by 5966 participants from Khartoum City since Khartoum was under quarantine. The main steps were data cleaning, feature selection with the Random Forest algorithm to choose the proper features, and finally building the model in two parts: the first used the K-mode clustering algorithm, whereas the second used a Support Vector Classifier (SVC). The features included symptoms, age, underlying conditions, geographical location, the duration of the symptoms, close contact with a confirmed coronavirus case, and the number of deaths among family members. The results indicated that the overall accuracy of the K-mode part was 71%, with a sensitivity of 77% for predicting cases as negative, while the accuracy of the SVC part was 76%. The agreement between the predictions of the two parts was 79%. The work concluded that the symptoms in the proposed screening system, in descending order of weight, were: fatigue, headache, fever, gastrointestinal disorders, anosmia, dry cough, shortness of breath, and chest pain.
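The three kinds of figures reported in the abstract (overall accuracy, sensitivity for the negative class, and agreement between the two parts) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the 0/1 label encoding, and the toy label lists are all hypothetical, and reading "sensitivity to predict cases as negative" as recall on the negative class is an assumption, since the abstract does not define the term.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_for_class(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """Among cases whose reference label is `label`, the fraction predicted as `label`."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    relevant = [p for t, p in zip(y_true, y_pred) if t == label]
    if not relevant:
        raise ValueError("no reference cases with the requested label")
    return sum(p == label for p in relevant) / len(relevant)


def agreement(pred_a: Sequence[int], pred_b: Sequence[int]) -> float:
    """Fraction of cases on which two models give the same prediction."""
    if len(pred_a) != len(pred_b) or len(pred_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == b for a, b in zip(pred_a, pred_b)) / len(pred_a)


# Toy data only (assumed encoding: 1 = potential case, 0 = negative).
# These are NOT the study's data.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
cluster_pred = [1, 0, 1, 0, 0, 1, 0, 1]  # stand-in for the clustering part
svc_pred = [1, 1, 1, 0, 0, 1, 0, 0]      # stand-in for the SVC part

print("clustering accuracy:", accuracy(y_true, cluster_pred))
print("clustering recall on negatives:", recall_for_class(y_true, cluster_pred, 0))
print("SVC accuracy:", accuracy(y_true, svc_pred))
print("agreement between parts:", agreement(cluster_pred, svc_pred))
```

Agreement is computed between the two sets of predictions without using the reference labels, so two parts can agree with each other on a case and still both be wrong.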