Interpretation of COVID-19 Epidemiological Trends in Mexico Through Wastewater Surveillance Using Simple Machine Learning Algorithms for Rapid Decision-Making.
Arnoldo Armenta-Castro, Orlando de la Rosa, Alberto Aguayo-Acosta, Mariel Araceli Oyervides-Muñoz, Antonio Flores-Tlacuahuac, Roberto Parra-Saldívar, Juan Eduardo Sosa-Hernández
Viruses-Basel, vol. 17, no. 1. Published 2025-01-15. DOI: 10.3390/v17010109 (https://doi.org/10.3390/v17010109). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11768489/pdf/. Impact factor: 3.8; JCR: Q2 (Virology).
Abstract
Wastewater-based surveillance (WBS), the detection and quantification of disease-related biomarkers in wastewater samples, has proven a valuable strategy for studying the prevalence of infectious diseases within populations in a time- and resource-efficient manner, because wastewater samples represent all cases within the catchment area, whether or not they are clinically reported. However, the analysis and interpretation of WBS datasets for decision-making during public health emergencies, such as the COVID-19 pandemic, remain an area of opportunity. In this article, a database obtained from wastewater sampling at wastewater treatment plants (WWTPs) and university campuses in Monterrey and Mexico City between 2021 and 2022 was used to train simple clustering- and regression-based risk assessment models that support informed prevention and control measures in high-traffic facilities, even when working with low-dimensionality datasets and a limited number of observations. When weekly data points were divided according to whether the seven-day average of daily new COVID-19 cases exceeded a certain threshold, the resulting clustering model differentiated weeks with surges in clinical reports from the periods between them with 87.9% accuracy. Moreover, the clustering model provided satisfactory forecasts one week (80.4% accuracy) and two weeks (81.8% accuracy) into the future. However, prediction of the weekly average of new daily cases was limited (R² = 0.80, MAPE = 72.6%), likely because of insufficient dimensionality in the database. Overall, simple WBS-supported models can provide relevant insights for decision-makers during epidemiological outbreaks, but regression algorithms for prediction from low-dimensionality datasets can still be improved.
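The abstract reports three kinds of metrics: classification accuracy for threshold-based week labels, and R² and MAPE for the regression. The sketch below shows one way these quantities could be computed, and is not the authors' code. Every number in it (the weekly case averages, the predictions, and the threshold of 100) is invented for illustration, and the function names are hypothetical.

```python
# Illustrative sketch only: all data and the threshold are made up,
# not taken from the study.

def label_weeks(avg_daily_cases, threshold):
    """Label a week 1 (surge) if its seven-day average exceeds the threshold, else 0."""
    return [1 if c > threshold else 0 for c in avg_daily_cases]

def accuracy(true_labels, predicted_labels):
    """Fraction of weeks whose predicted label matches the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

def mape(observed, predicted):
    """Mean absolute percentage error in percent; observed values must be nonzero."""
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100 * total / len(observed)

# Hypothetical weekly averages of new daily cases and hypothetical model output.
observed = [5, 10, 20, 1000, 2000]
predicted = [15, 25, 40, 950, 2100]

true_labels = label_weeks(observed, threshold=100)        # hypothetical threshold
predicted_labels = label_weeks(predicted, threshold=100)

print("accuracy:", accuracy(true_labels, predicted_labels))
print("R^2:", round(r_squared(observed, predicted), 3))
print("MAPE (%):", round(mape(observed, predicted), 1))
```

In this toy data the surge/no-surge labels are all recovered and R² is high, yet MAPE is large, because MAPE weights relative errors in low-count weeks as heavily as those in surge weeks while R² is dominated by the large values. This is only one way a fairly high R² can coexist with a high MAPE; it does not establish why the two metrics diverge in the study.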
About the journal:
Viruses (ISSN 1999-4915) is an open access journal that provides an advanced forum for studies of viruses. It publishes reviews, regular research papers, communications, conference reports and short notes. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced. We also encourage the publication of timely reviews and commentaries on topics of interest to the virology community and feature highlights from the virology literature in the "News and Views" section. Electronic files or software regarding the full details of the calculation and experimental procedure, if unable to be published in a normal way, can be deposited as supplementary material.