Machine learning models to predict the COVID-19 reproduction rate: combining non-pharmaceutical interventions with sociodemographic and cultural characteristics.
Margarida Duarte, Catarina Ferreira da Silva, Sérgio Moro
Informatics for Health & Social Care, pp. 81-99. Published 2025-01-01 (Epub 2025-04-29). DOI: 10.1080/17538157.2025.2491517
Abstract
Since the beginning of the COVID-19 pandemic, countries worldwide have implemented sets of Non-Pharmaceutical Interventions (NPIs) to limit the spread of the virus. Few studies have applied machine learning models to compare the use of NPIs, socioeconomic and demographic characteristics, and cultural dimensions in predicting the reproduction rate R_t. We adopted the CRISP-DM methodology, using three data sources: the "Our World in Data COVID-19" dataset, the "Oxford COVID-19 Government Response Tracker," and the Hofstede Insights data. Using machine learning models, we analyzed the impact that Hofstede's cultural dimensions, the degree of restriction of the implemented NPIs, and sociodemographic variables may have on the reproduction rate, in order to understand whether cultural characteristics are useful information for improving reproduction rate predictions. We trained several machine learning models on data from 101 countries and compared the results of the models with and without Hofstede's cultural dimensions. Our results show that including cultural dimensions helps to improve the models, and that the ensemble models, especially the Random Forest, predicted R_t best.
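The with/without comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not give the feature list, hyperparameters, or validation scheme, so everything below is an assumption. The data are synthetic, the feature groups (`npi`, `socio`, `culture`) and their ranges are hypothetical, rows are treated as independent observations, and scikit-learn's `RandomForestRegressor` with 5-fold cross-validated R^2 stands in for whatever evaluation the study used. The synthetic target is built so that one cultural score influences R_t, which means the outcome only demonstrates the mechanics of the comparison, not the paper's finding.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
n = 600  # synthetic observations, NOT the study's data

# Hypothetical feature groups (names and ranges are illustrative assumptions).
npi = rng.integers(0, 4, size=(n, 3)).astype(float)  # restriction levels 0-3
socio = rng.normal(size=(n, 2))                      # standardized sociodemographic indicators
culture = rng.uniform(0, 100, size=(n, 6))           # six cultural-dimension scores

# Synthetic target: by construction, the first cultural score shifts R_t.
r_t = (
    1.5
    - 0.15 * npi[:, 0]
    - 0.10 * npi[:, 1]
    + 0.10 * socio[:, 0]
    + 0.01 * (culture[:, 0] - 50)
    + rng.normal(scale=0.05, size=n)
)

# The two feature sets being compared: baseline vs. baseline plus culture.
feature_sets = {
    "without_culture": np.hstack([npi, socio]),
    "with_culture": np.hstack([npi, socio, culture]),
}

# Same folds and same model settings for both runs, so only the features differ.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = {}
for name, X in feature_sets.items():
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    scores[name] = cross_val_score(model, X, r_t, cv=cv, scoring="r2").mean()

for name, score in scores.items():
    print(f"{name}: mean R^2 = {score:.3f}")
```

With real panel data, where each country contributes many rows and cultural scores are constant within a country, a country-grouped split such as scikit-learn's `GroupKFold` would likely be a safer choice, since a random split lets country-level features act as country identifiers.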