Characterization of SARS-CoV-2 cases in Mexico using data mining

Revista de Computo Aplicado. Pub Date: 2020-12-31. DOI: 10.35429/JCA.2020.15.4.19.25

Enrique Luna-Ramírez, Jorge Soria-Cruz, Apolinar Velarde-Martínez, E. Taya-Acosta


Citations: 1

Abstract

This paper analyzes the data published by the Federal Government of Mexico on cases tested for the presence of the SARS-CoV-2 virus, which causes the COVID-19 disease. More than a million cases were analyzed, most of which tested positive. Twenty-one significant variables were considered, including the test result and death, along with factors that complicate a person's health, such as diabetes, chronic obstructive pulmonary disease (COPD), asthma, hypertension, obesity, and smoking, among others. The data were first prepared so that they could be processed with data mining techniques, following the CRISP-DM methodology for knowledge extraction. With these techniques, data models were generated to characterize the development of COVID-19 at the national and local (state) levels. As an important part of the models, various rules or correlations were observed among the variables; these could be used to predict, in part, the future development of COVID-19 in Mexico and, consequently, to establish best practices aimed at reducing its social impact.
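The abstract does not state which rule-mining algorithm or tool was used. As a minimal sketch of how a rule between two of the variables might be scored, the Python example below computes support, confidence, and lift for a single rule over boolean case records. The field names `diabetes` and `death`, the helper `rule_metrics`, and the four records are hypothetical and synthetic; they are not taken from the paper or from the government dataset.

```python
from typing import Dict, List

# One case record: variable name -> whether the condition is present.
Record = Dict[str, bool]


def rule_metrics(records: List[Record], antecedent: str, consequent: str) -> Dict[str, float]:
    """Score the rule `antecedent -> consequent` over boolean records."""
    n = len(records)
    n_a = sum(1 for r in records if r[antecedent])
    n_c = sum(1 for r in records if r[consequent])
    n_ac = sum(1 for r in records if r[antecedent] and r[consequent])

    support = n_ac / n                           # P(antecedent and consequent)
    confidence = n_ac / n_a if n_a else 0.0      # P(consequent | antecedent)
    base_rate = n_c / n                          # P(consequent)
    lift = confidence / base_rate if base_rate else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Synthetic records for illustration only (NOT data from the study).
records: List[Record] = [
    {"diabetes": True, "death": True},
    {"diabetes": True, "death": False},
    {"diabetes": False, "death": False},
    {"diabetes": False, "death": False},
]

metrics = rule_metrics(records, "diabetes", "death")
print(metrics)
```

A lift above 1 indicates that the consequent occurs more often among cases with the antecedent than in the data overall. It describes co-occurrence in the data, not causation.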

