Remigio Ismael Hurtado Ortiz, Juan Carlos Barrera Barrera, Katherine Michelle Barrera Barrera
Title: Analysis model of the most important factors in Covid-19 through data mining, descriptive statistics and random forest
Published in: 2020 IEEE International Autumn Meeting on Power, Electronics and Computing (ROPEC)
Publication date: 2020-11-04
DOI: 10.1109/ROPEC50909.2020.9258765
Citations: 6
Abstract
The Covid-19 pandemic has had a major impact worldwide. Despite its low mortality rate, it has become a serious problem because of the demand it places on hospitals and clinics, a demand driven by the rapid spread of the disease from person to person. In this paper we propose a classification-oriented machine learning method. We follow a classic data science process: noise cleaning and data processing, followed by descriptive statistical analysis, so that the most important variables or factors are identified through unsupervised learning. The results indicate that the most important variables for the risk of Covid-19 infection and mortality are diseases that affect the immune system, such as diabetes, heart disease and hypertension, as well as kidney disease. These conditions can also cause serious kidney problems. The method is evaluated using quality measures. Finally, this work opens the door to further research focused on each individual variable related to Covid-19, with the aim of finding relevant information that can help improve the current situation.
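As a rough illustration of the kind of pipeline the abstract describes (descriptive statistics followed by a random-forest-style ranking of factors), the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the records are synthetic, the column names (including the deliberate noise column `unrelated_flag`) are hypothetical, the risk model is invented, and the "forest" is simplified to one-level trees that each split on the best of a few randomly chosen features. A real analysis would more likely rely on an established library implementation of random forests.

```python
import random

# Hypothetical feature names mirroring the conditions named in the abstract;
# "unrelated_flag" is a deliberate noise column with no effect on the outcome.
FEATURES = ["diabetes", "hypertension", "heart_disease", "kidney_disease", "unrelated_flag"]


def make_synthetic_records(n, rng):
    """Return n (flags, outcome) pairs. Synthetic data, not real patient records."""
    records = []
    for _ in range(n):
        flags = [rng.random() < 0.3 for _ in FEATURES]
        # Assumed risk model: each of the first four flags adds 0.2 to the
        # probability of a severe outcome; unrelated_flag adds nothing.
        risk = 0.05 + 0.2 * sum(flags[:4])
        records.append((flags, rng.random() < risk))
    return records


def outcome_rate(records):
    """Descriptive statistic: share of records with a severe outcome."""
    return sum(outcome for _, outcome in records) / len(records) if records else 0.0


def gini(records):
    """Gini impurity of a binary outcome."""
    p = outcome_rate(records)
    return 2 * p * (1 - p)


def split_gain(records, j):
    """Gini impurity decrease obtained by splitting the records on feature j."""
    absent = [r for r in records if not r[0][j]]
    present = [r for r in records if r[0][j]]
    weighted = (len(absent) * gini(absent) + len(present) * gini(present)) / len(records)
    return gini(records) - weighted


def forest_importance(records, n_trees, n_candidates, rng):
    """Normalized impurity-decrease importance from a forest of one-level trees."""
    totals = [0.0] * len(FEATURES)
    for _ in range(n_trees):
        # Bootstrap sample: draw with replacement, same size as the data.
        sample = [rng.choice(records) for _ in records]
        # Each tree only considers a random subset of the features.
        candidates = rng.sample(range(len(FEATURES)), n_candidates)
        best = max(candidates, key=lambda j: split_gain(sample, j))
        totals[best] += split_gain(sample, best)
    total = sum(totals)
    return {name: t / total for name, t in zip(FEATURES, totals)}


rng = random.Random(0)
records = make_synthetic_records(2000, rng)

# Descriptive step: outcome rate with and without each condition.
for j, name in enumerate(FEATURES):
    with_flag = outcome_rate([r for r in records if r[0][j]])
    without_flag = outcome_rate([r for r in records if not r[0][j]])
    print(f"{name:15s} with={with_flag:.2f} without={without_flag:.2f}")

# Ranking step: features sorted by their share of the total impurity decrease.
importance = forest_importance(records, n_trees=200, n_candidates=2, rng=rng)
for name, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name:15s} {score:.3f}")
```

Because the noise column has no effect on the synthetic outcome, it should end up at the bottom of the ranking, which is the behavior an importance measure is expected to show. The ranking says nothing about real Covid-19 data.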