COVID-19 Mortality Prediction Using Machine Learning Methods

Computer systems and information technologies. Pub Date: 2022-06-30. DOI: 10.31891/csit-2022-2-12

A. Popovych, V. Yakovyna


Citations: 0

Abstract


COVID-19 MORTALITY PREDICTION USING MACHINE LEARNING METHODS

The paper reports the use of machine learning methods for COVID-19 mortality prediction. An open dataset with a large number of features and records was used. The goal of the research is to create an efficient mortality prediction model that draws on a large number of factors and enables authorities to act to avoid mass spread of the virus and to reduce the number of cases and deaths. Feature selection was conducted to remove potentially irrelevant input variables and improve the performance of the machine learning models. Classic machine learning models (both linear and non-linear), ensemble methods such as bagging, stacking, and boosting, and neural networks were used. The efficiency of ensemble methods and neural networks was compared with that of classic ML methods such as linear regression, support vector machines, and K-nearest neighbors, among others. Ensemble methods and neural networks show much greater efficiency than the classic methods. Feature selection does not significantly affect prediction accuracy. The scientific novelty of this paper lies in the large number of machine learning models trained on a large-scale dataset with a significant number of features related to different factors that can potentially affect COVID-19 mortality, and in the further analysis of their efficiency. This will help to select the most valuable features and can become a basis for software designed to track the dynamics of the pandemic. The practical significance of this paper is that the study can be useful to authorities and international organizations in preventing an increase in COVID-19 mortality through proper preventive measures.
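The kind of comparison the abstract describes can be sketched roughly as follows. This is a minimal illustration only, not the authors' pipeline: it assumes a regression setting, uses scikit-learn, and substitutes a synthetic dataset (`make_friedman1`) for the open COVID-19 dataset, which the abstract does not name. The function name `compare_models`, the model choices, and all hyperparameters are hypothetical, and neural networks are omitted for brevity.

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import (
    BaggingRegressor,
    GradientBoostingRegressor,
    StackingRegressor,
)
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline


def compare_models(n_selected=5, seed=0):
    """Return mean cross-validated R^2 for classic and ensemble regressors."""
    # Synthetic non-linear data: a stand-in, NOT the dataset used in the paper.
    X, y = make_friedman1(
        n_samples=300, n_features=10, noise=1.0, random_state=seed
    )

    models = {
        # Classic methods
        "linear": LinearRegression(),
        "knn": KNeighborsRegressor(n_neighbors=5),
        # Ensemble methods: bagging, boosting, stacking
        "bagging": BaggingRegressor(n_estimators=20, random_state=seed),
        "boosting": GradientBoostingRegressor(
            n_estimators=50, random_state=seed
        ),
        "stacking": StackingRegressor(
            estimators=[
                ("lr", LinearRegression()),
                ("knn", KNeighborsRegressor()),
            ],
            final_estimator=LinearRegression(),
        ),
    }

    scores = {}
    for name, model in models.items():
        # Feature selection sits inside the pipeline, so it is refit on the
        # training part of each fold and never sees the held-out data.
        pipe = make_pipeline(SelectKBest(f_regression, k=n_selected), model)
        scores[name] = cross_val_score(
            pipe, X, y, cv=3, scoring="r2"
        ).mean()
    return scores


if __name__ == "__main__":
    for name, score in compare_models().items():
        print(f"{name:>9}: mean R^2 = {score:.3f}")
```

Scores on this toy data say nothing about the paper's results. Setting `n_selected` to the full feature count (10 here) gives a rough way to check how much feature selection changes each model's score.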
