Julian Andres Ramirez-Bautista, Silvia L. Chaparro-Cárdenas, Wilson Gamboa-Contreras, William Guerrero-Salazar, Jorge Adalberto Huerta-Ruelas
Classification of COVID-19 associated symptomatology using machine learning
Dyna-Colombia, published 2023-05-25. DOI: 10.15446/dyna.v90n226.105616 (https://doi.org/10.15446/dyna.v90n226.105616)
Citation count: 0
Abstract
The health crisis caused by the SARS-CoV-2 coronavirus posed major challenges for the scientific community. Advances in artificial intelligence are a useful resource, but it is important to determine which symptoms presented by positive cases of infection are the best predictors. A machine learning approach was applied to public Kaggle data from 5,434 people, covering eleven WHO-standardized symptoms and conditions: breathing problems, dry cough, sore throat, runny nose, history of asthma, chronic lung disease, headache, heart disease, hypertension, diabetes, and fever. A simple machine learning model was developed to detect COVID-19 positive cases. Results obtained with four loss functions were compared, along with SHAP values. The best loss function was binary cross-entropy, used with a single hidden layer of 10 neurons; this configuration achieved an F1 score of 0.98 and an area under the ROC curve (AUC-ROC) of 0.99.
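To make the quantities named in the abstract concrete, the following is a minimal sketch in plain Python of a network with eleven inputs, one hidden layer of 10 units, and a single output, together with the binary cross-entropy loss and the F1 score. It is not the authors' implementation: the ReLU hidden activation, the sigmoid output, the random weights, and the example patient vector are all assumptions made for illustration, and the network is untrained.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to stay stable for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(x, w1, b1, w2, b2):
    """One hidden layer (ReLU, an assumption) followed by a sigmoid output."""
    hidden = [max(0.0, sum(wi * xi for wi, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the ratio is undefined."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Shapes taken from the abstract: 11 inputs, 10 hidden units, 1 output.
N_INPUTS, N_HIDDEN = 11, 10
rng = random.Random(0)  # fixed seed; the weights are random, not trained
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
b1 = [0.0] * N_HIDDEN
w2 = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
b2 = 0.0

# Made-up symptom vector (1 = present), not a record from the dataset.
x = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]
p = forward(x, w1, b1, w2, b2)  # a probability in (0, 1), meaningless until trained
```

In practice the weights would be fitted by minimizing `binary_cross_entropy` over the training records, and `f1_score` would be computed on held-out predictions thresholded at some cutoff such as 0.5 (also an assumption here).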
About the journal:
The DYNA journal, consistent with its aim of disseminating research in engineering, covers all disciplines within the broad area of Engineering and Technology (OECD) through research articles, case studies, and review articles resulting from the work of national and international researchers.