Predicting COVID-19 in Ohio: Insights from wastewater, demographic and socioeconomic data

Impact factor: 8.0 · Zone 1 (Environmental Science and Ecology) · JCR Q1 (Environmental Sciences)
Fatemeh Rezaeitavabe, Karen T. Coschigano, Guy Riefler
Journal: Science of the Total Environment, volume 969, Article 178938
Published: 2025-02-27
DOI: 10.1016/j.scitotenv.2025.178938
Full text: https://www.sciencedirect.com/science/article/pii/S004896972500573X
Citations: 0

Abstract

More than four years into the COVID-19 pandemic, clear patterns have emerged showing that the virus does not affect all populations uniformly. Demographic and socioeconomic disparities play a significant role in the vulnerability to and spread of SARS-CoV-2. Analyzing these disparities can offer insights into the pandemic's dynamics, helping to identify critical factors that need to be addressed in efforts to mitigate the pandemic's impact globally. Wastewater-based surveillance (WBS), a crucial tool for tracking the virus, offers a unique perspective on how socioeconomic and demographic factors might influence infection rates across different communities. However, estimating and predicting the extent of the epidemic from WBS results is still challenging. In our study, we tried to address these challenges by analyzing data from 55 sites in Ohio, USA, with populations ranging from 3300 to 654,817, to better understand the pandemic's dynamics and WBS effectiveness in monitoring COVID-19 spread. Factors such as population size, poverty rate, racial demographics (specifically white and black populations), and median income showed the strongest correlations with both clinical cases and wastewater results, with population size being the most important factor. Moreover, among eight evaluated machine learning models, k-Nearest Neighbors (R² = 0.873), Random Forest (R² = 0.862), and XGBoost (R² = 0.854) were the most effective in predicting clinical cases from WBS data across demographic and socioeconomic categories, while Linear (R² = 0.578) and Ridge+Linear (R² = 0.595) were least effective. Thus, these findings highlight the potential of machine learning to predict COVID-19 cases from WBS data across a wide range of demographic and socioeconomic categories.
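The abstract ranks models by the coefficient of determination, with k-Nearest Neighbors ahead of the linear model. The sketch below illustrates how such a comparison can be computed. It is not the authors' pipeline: the data are synthetic, the single feature stands in for a wastewater signal, and the helper names (`r_squared`, `fit_linear`, `knn_predict`) are hypothetical. It only shows why a neighbor-based regressor may score a higher R² than a straight-line fit when the underlying relationship is nonlinear.

```python
import math
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_linear(xs, ys):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def knn_predict(train_x, train_y, x, k=5):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Synthetic, purely illustrative data (assumption): a nonlinear relation
# between a wastewater-like signal x and a case-count-like target y.
rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(300)]
ys = [50.0 * math.sin(x) + 5.0 * x + rng.gauss(0.0, 5.0) for x in xs]

# Simple hold-out split: first 200 points for fitting, last 100 for scoring.
train_x, test_x = xs[:200], xs[200:]
train_y, test_y = ys[:200], ys[200:]

slope, intercept = fit_linear(train_x, train_y)
linear_pred = [slope * x + intercept for x in test_x]
knn_pred = [knn_predict(train_x, train_y, x) for x in test_x]

r2_linear = r_squared(test_y, linear_pred)
r2_knn = r_squared(test_y, knn_pred)
print(f"linear R^2 = {r2_linear:.3f}, kNN R^2 = {r2_knn:.3f}")
```

Scoring on held-out points rather than on the training data keeps the comparison from rewarding memorization, which matters for k-NN in particular. The abstract does not state how the study split or validated its data, so the hold-out split here is an assumption.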
Source journal
Science of the Total Environment (category: Environmental Sciences)
CiteScore: 17.60
Self-citation rate: 10.20%
Articles published: 8726
Review time: 2.4 months
About the journal: The Science of the Total Environment is an international journal dedicated to scientific research on the environment and its interaction with humanity. It covers a wide range of disciplines and seeks to publish innovative, hypothesis-driven, and impactful research that explores the entire environment, including the atmosphere, lithosphere, hydrosphere, biosphere, and anthroposphere. The journal's updated Aims & Scope emphasizes the importance of interdisciplinary environmental research with broad impact. Priority is given to studies that advance fundamental understanding and explore the interconnectedness of multiple environmental spheres. Field studies are preferred, while laboratory experiments must demonstrate significant methodological advancements or mechanistic insights with direct relevance to the environment.