Tianlong Zheng;Chunli Zhang;Yueting Shi;Debao Chen;Sheng Liu
Big Data Mining and Analytics, vol. 5, no. 4, pp. 318-338, published 2022-07-18. DOI: 10.26599/BDMA.2022.9020010. URL: https://ieeexplore.ieee.org/document/9832767/
Influencing Factors and Clustering Characteristics of COVID-19: A Global Analysis
The unprecedented coronavirus disease 2019 (COVID-19) pandemic was still raging in many countries worldwide as of 2021. Various strategies for studying the characteristics and distribution of the virus across regions of the world have been developed to assist in the prevention and control of the epidemic. In this study, descriptive statistics and regression analysis were applied to COVID-19 data from different countries to compare and evaluate several regression models. The extreme random forest regression (ERFR) model performed best, and in that model population density, ozone, median age, life expectancy, and the Human Development Index (HDI) were relatively influential on the spread of COVID-19. The clustering characteristics of the epidemic were also analyzed with the spectral clustering algorithm. Visualization of the spectral clustering results showed that the global spread of COVID-19 was highly clustered geographically, and that the clustering characteristics and the influencing factors were to some extent consistent in their distribution. This study aims to deepen the international community's understanding of the global COVID-19 pandemic, to help countries develop measures that mitigate potential large-scale outbreaks, and to improve the ability to respond to such public health emergencies.
About the journal:
Big Data Mining and Analytics, a journal published by Tsinghua University Press, presents research on big data and its applications. The journal covers the exploration and analysis of large volumes of data from diverse sources to uncover hidden patterns, correlations, insights, and knowledge.
Featuring the latest developments, research issues, and solutions, the journal addresses data mining techniques, data analytics, and their practical applications.
Big Data Mining and Analytics is indexed and abstracted in ESCI, EI, Scopus, DBLP Computer Science, Google Scholar, INSPEC, CSCD, DOAJ, CNKI, and other databases.
The journal is intended for researchers, professionals, and anyone interested in the field of big data analytics.