An interpretable machine learning model for COVID-19 screening
Gustavo Carreiro Pinasco, Eduardo Moreno Júdice de Mattos Farina, Fabiano Novaes Barcellos Filho, Willer França Fiorotti, Matheus Coradini Mariano Ferreira, Sheila Cristina de Souza Cruz, Andre Louzada Colodette, Luciene Rossati Loureiro, Tatiane Comerio, Dilzilene Cunha Sivirino Farias, Katia Valéria Manhambusque, E. de Fátima Almeida Lima
Journal of Human Growth and Development, published 2022-06-23. DOI: 10.36311/jhgd.v32.13324 (https://doi.org/10.36311/jhgd.v32.13324)
Abstract
Introduction: Coronavirus Disease 2019 (COVID-19) is a viral disease that has been declared a pandemic by the WHO. Diagnostic tests are expensive and not always available. Studies using machine learning (ML) approaches to diagnose SARS-CoV-2 infection have been proposed in the literature to reduce cost and allow better control of the pandemic.
Objective: We aim to develop a machine learning model that predicts whether a patient has COVID-19 from epidemiological data and clinical features.
Methods: We used six ML algorithms for COVID-19 screening through diagnostic prediction and performed an interpretative analysis using SHAP (SHapley Additive exPlanations) values and feature importances.
Results: Our best model was XGBoost (XGB), which obtained an area under the ROC curve of 0.752, a sensitivity of 90%, a specificity of 40%, a positive predictive value (PPV) of 42.16%, and a negative predictive value (NPV) of 91.0%. The best predictors were, in that order, fever, cough, history of international travel in the previous 14 days, male gender, and nasal congestion.
Conclusion: We conclude that ML is an important screening tool with high sensitivity compared to rapid tests, and that it can be used to improve clinical precision in COVID-19, a disease whose symptoms are highly nonspecific.
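
The screening metrics reported in the Results (sensitivity, specificity, PPV, NPV) all follow from the four cells of a confusion matrix. The sketch below shows those standard definitions in Python. The counts are hypothetical, chosen only so that sensitivity is 90% and specificity is 40%; they are not the study's data, and the names `ConfusionCounts` and `screening_metrics` are illustrative rather than taken from the authors' code.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (positive = has COVID-19)."""
    tp: int  # infected patients flagged positive
    fp: int  # non-infected patients flagged positive
    tn: int  # non-infected patients flagged negative
    fn: int  # infected patients flagged negative


def screening_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Return the standard screening metrics as fractions in [0, 1]."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "ppv": c.tp / (c.tp + c.fp),          # P(infected | flagged positive)
        "npv": c.tn / (c.tn + c.fn),          # P(not infected | flagged negative)
    }


# Hypothetical counts (NOT the study's data): 100 infected, 200 non-infected.
counts = ConfusionCounts(tp=90, fp=120, tn=80, fn=10)
metrics = screening_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on how common the disease is in the tested population, so the same model can yield different predictive values in a different cohort. A high-sensitivity, low-specificity operating point like the one reported suits screening, where missing an infected patient is costlier than a false alarm.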