Fatemeh Rezaeitavabe, Karen T. Coschigano, Guy Riefler
Science of the Total Environment, Volume 969, Article 178938. Published 2025-02-27. DOI: 10.1016/j.scitotenv.2025.178938
https://www.sciencedirect.com/science/article/pii/S004896972500573X
Predicting COVID-19 in Ohio: Insights from wastewater, demographic and socioeconomic data
More than four years into the COVID-19 pandemic, clear patterns have emerged showing that the virus does not affect all populations uniformly. Demographic and socioeconomic disparities play a significant role in the vulnerability to and spread of SARS-CoV-2. Analyzing these disparities can offer insights into the pandemic's dynamics, helping to identify critical factors that need to be addressed in efforts to mitigate the pandemic's impact globally. Wastewater-based surveillance (WBS), a crucial tool for tracking the virus, offers a unique perspective on how socioeconomic and demographic factors might influence infection rates across different communities. However, estimating and predicting the extent of the epidemic from WBS results remains challenging. In this study, we address these challenges by analyzing data from 55 sites in Ohio, USA, with populations ranging from 3,300 to 654,817, to better understand the pandemic's dynamics and the effectiveness of WBS in monitoring the spread of COVID-19. Population size, poverty rate, racial demographics (specifically White and Black populations), and median income showed the strongest correlations with both clinical cases and wastewater results, with population size being the most important factor. Moreover, among eight evaluated machine learning models, k-Nearest Neighbors (R² = 0.873), Random Forest (R² = 0.862), and XGBoost (R² = 0.854) were the most effective in predicting clinical cases from WBS data across demographic and socioeconomic categories, while Linear (R² = 0.578) and Ridge+Linear (R² = 0.595) were the least effective. These findings highlight the potential of machine learning to predict COVID-19 cases from WBS data across a wide range of demographic and socioeconomic categories.
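As a rough illustration of how the R² scores above are read, the following minimal sketch computes the coefficient of determination for a small k-Nearest Neighbors regressor. It is not the authors' pipeline: the feature pairs, case counts, helper names (`r_squared`, `knn_predict`), and the choice of k = 2 are all assumptions made up for this example.

```python
from math import dist


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_x, train_y, query, k=3):
    """Predict by averaging the targets of the k nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


# Invented toy rows (NOT data from the study):
# each feature pair is (wastewater signal, population in units of 100,000).
train_x = [(1.0, 0.5), (2.0, 0.5), (3.0, 1.0), (4.0, 1.0), (5.0, 2.0), (6.0, 2.0)]
train_y = [10.0, 20.0, 35.0, 45.0, 70.0, 80.0]  # hypothetical clinical case counts

test_x = [(1.5, 0.5), (3.5, 1.0), (5.5, 2.0)]
test_y = [15.0, 40.0, 75.0]

predictions = [knn_predict(train_x, train_y, q, k=2) for q in test_x]
score = r_squared(test_y, predictions)
print(predictions, score)
```

An R² of 1 means the predictions match the held-out values exactly, while an R² of 0 means the model does no better than always predicting the mean. On that scale, the reported 0.873 for k-Nearest Neighbors indicates a substantially closer fit than the 0.578 reported for the linear model.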
About the journal:
The Science of the Total Environment is an international journal dedicated to scientific research on the environment and its interaction with humanity. It covers a wide range of disciplines and seeks to publish innovative, hypothesis-driven, and impactful research that explores the entire environment, including the atmosphere, lithosphere, hydrosphere, biosphere, and anthroposphere.
The journal's updated Aims & Scope emphasizes the importance of interdisciplinary environmental research with broad impact. Priority is given to studies that advance fundamental understanding and explore the interconnectedness of multiple environmental spheres. Field studies are preferred, while laboratory experiments must demonstrate significant methodological advancements or mechanistic insights with direct relevance to the environment.