Ali A. Adam, Nihad A. A. Elhag, Fakhreldeen Abbas Saeed, Mohamed Yagob, Fayha Mohamed, N. Eassa, Hanaa S. Abdalaziz, M. Ahmed, S. Babiker
Published in: 2021 International Congress of Advanced Technology and Engineering (ICOTEN), 2021-07-04. DOI: 10.1109/ICOTEN52080.2021.9493517
An Ensemble Machine Learning Model to Investigate the Screening System for Identification of Potential Patients with COVID-19 in Sudan
This study designs an ensemble machine learning model to serve as a screening system that predicts the likelihood of COVID-19 infection. Based on specific parameters, it draws on an online survey completed by 5966 participants from Khartoum City, since Khartoum was under quarantine. The main steps were data cleaning, feature selection using the Random Forest algorithm, and finally building the model in two parts: the first used the K-mode clustering algorithm, whereas the second used a Support Vector Classifier (SVC). The features included symptoms, age, underlying conditions, geographical location, the duration of the symptoms, close contact with someone who has a confirmed case of coronavirus, and the number of deaths among family members. The results indicated that the overall accuracy of the K-mode part was 71%; however, the sensitivity in predicting negative cases was 77%, while the accuracy of the SVC part was 76%. The agreement between the predictions of the two parts was 79%. The work concluded that the symptoms in the proposed screening system, ordered from highest weight to lowest, were: Fatigue, Headache, Fever, Gastrointestinal Disorders, Anosmia, Dry Cough, Shortness of Breath, and Chest Pain.
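To make the clustering part and the agreement figure more concrete, the following is a minimal sketch of K-mode style clustering on categorical survey answers, followed by the share of cases on which two sets of predictions coincide. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the number of clusters, the initialization, or the encoding of the features, so the two clusters, the toy records, the three hypothetical columns (fever, fatigue, close contact), and the second prediction list are all invented for this example.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions where two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category in each column of a non-empty list of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(records, initial_modes, max_iter=20):
    """Cluster categorical records around modes using Hamming distance."""
    modes = [tuple(m) for m in initial_modes]
    labels = []
    for _ in range(max_iter):
        # Assign each record to the nearest mode (ties go to the lower index).
        labels = [
            min(range(len(modes)), key=lambda j: hamming(r, modes[j]))
            for r in records
        ]
        # Recompute each mode from its members; keep the old mode if a cluster is empty.
        new_modes = []
        for j, old in enumerate(modes):
            members = [r for r, lab in zip(records, labels) if lab == j]
            new_modes.append(column_modes(members) if members else old)
        if new_modes == modes:  # converged: no mode changed
            break
        modes = new_modes
    return labels, modes


def agreement(pred_a, pred_b):
    """Fraction of cases on which two prediction lists coincide."""
    return sum(a == b for a, b in zip(pred_a, pred_b)) / len(pred_a)


# Hypothetical toy answers: (fever, fatigue, close contact). Not data from the study.
records = [
    ("yes", "yes", "yes"),
    ("yes", "yes", "no"),
    ("yes", "no", "yes"),
    ("no", "no", "no"),
    ("no", "yes", "no"),
    ("no", "no", "no"),
]

# Assumption: two clusters, seeded with the first and fourth records.
labels, modes = k_modes(records, initial_modes=[records[0], records[3]])

# Hypothetical predictions from a second model, used only to show the agreement rate.
other_predictions = [0, 0, 1, 1, 1, 1]
rate = agreement(labels, other_predictions)

print(labels, modes, round(rate, 3))
```

Hamming distance and per-column modes are what distinguish this family of methods from K-means, which is why it suits yes/no and categorical survey answers. The agreement function corresponds to the kind of quantity reported as 79% in the abstract, computed here on invented predictions.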