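Assessing dengue forecasting methods: a comparative study of statistical models and machine learning techniques in Rio de Janeiro, Brazil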

Impact factor: 3.6 · JCR Q1 · Tropical Medicine
Xiang Chen, Paula Moraga
Journal: Tropical Medicine and Health 53(1): 52
Published: 2025-04-10
Publication type: Journal Article
DOI: https://doi.org/10.1186/s41182-025-00723-7
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11984044/pdf/
Citations: 0

Abstract

Assessing dengue forecasting methods: a comparative study of statistical models and machine learning techniques in Rio de Janeiro, Brazil.

Background: Dengue is a mosquito-borne viral disease that poses a significant public health threat in tropical and subtropical regions worldwide. Accurate forecasting of dengue outbreaks is crucial for effective public health planning and intervention. This study aims to assess the predictive performance and computational efficiency of a number of statistical models and machine learning techniques for dengue forecasting, both with and without the inclusion of climate factors, to inform the design of dengue surveillance systems.

Methods: This comparison of dengue forecasting methods uses dengue case data from Rio de Janeiro, Brazil, together with climate factors known to affect disease transmission. Using a dynamic window approach, various statistical methods and machine learning techniques generated weekly forecasts at several time horizons. The error measures, uncertainty intervals, and computational efficiency of each method were compared. The statistical models considered were Autoregressive (AR), Moving Average (MA), Autoregressive Integrated Moving Average (ARIMA), and the Exponential Smoothing State Space Model (ETS). Models incorporating temperature and humidity as covariates, such as Vector Autoregression (VAR) and Seasonal ARIMAX (SARIMAX), were also employed. The machine learning techniques evaluated were Random Forest, XGBoost, Support Vector Machine (SVM), Long Short-Term Memory (LSTM) networks, and Prophet. Ensemble approaches that integrated the top-performing methods were also considered, and the evaluated methods incorporated lagged climatic variables to account for delayed effects.
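As a rough illustration of the evaluation setup described in the Methods, the sketch below slides a fixed-length training window over a weekly series, refits a simple AR(1) model at each step, and scores forecasts at a chosen horizon. The abstract does not define its "dynamic window", so the sliding-window reading, the AR(1) model, the function names, and the synthetic data are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import mean

def fit_ar1(window):
    """Least-squares fit of y[t] = a + b * y[t-1] on one training window."""
    x, y = window[:-1], window[1:]
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    # Fall back to a flat slope when the window has no variation.
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx if sxx else 0.0
    a = my - b * mx
    return a, b

def forecast_ar1(a, b, last, horizon):
    """Iterate the fitted recursion `horizon` steps ahead of the last observation."""
    value = last
    for _ in range(horizon):
        value = a + b * value
    return value

def rolling_forecasts(series, window_size, horizon):
    """Slide a fixed-length window over the series; return (forecast, actual) pairs."""
    pairs = []
    for end in range(window_size, len(series) - horizon + 1):
        window = series[end - window_size:end]
        a, b = fit_ar1(window)
        pred = forecast_ar1(a, b, window[-1], horizon)
        pairs.append((pred, series[end + horizon - 1]))
    return pairs

def mae(pairs):
    """Mean absolute error over (forecast, actual) pairs."""
    return mean(abs(p - y) for p, y in pairs)

def lagged(covariate, lag):
    """Shift a covariate so position t holds the value seen `lag` weeks earlier."""
    return [None] * lag + covariate[:len(covariate) - lag]

# Synthetic weekly counts (hypothetical, not data from the study).
cases = [2 * t + 1 for t in range(20)]
for h in (1, 2, 4):
    print(f"horizon={h} weeks, MAE={mae(rolling_forecasts(cases, 5, h)):.3f}")
```

The `lagged` helper shows one way a temperature or humidity series could be aligned with case counts before being passed to a covariate model. Which lags to use is a modelling choice that the abstract does not specify.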

Results: Among the statistical models, ARIMA demonstrated the best performance using only historical case data, while SARIMAX significantly improved predictive accuracy by incorporating climate covariates. In general, the LSTM model, particularly when combined with climate covariates, proved to be the most accurate machine learning model, despite being slower to train and predict. For long-term forecasts, Prophet with climate covariates was the most effective. Ensemble models, such as the combination of LSTM and ARIMA, showed substantial improvements over individual models.
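The abstract does not say how the ensemble forecasts were combined. One simple possibility, shown here purely as an assumption, is a weighted average in which each model's weight is inversely proportional to its error. The function names and the numbers are hypothetical; the two forecast lists merely stand in for the output of models such as LSTM and ARIMA.

```python
from math import sqrt

def rmse(pred, actual):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

def inverse_error_weights(errors):
    """Weights proportional to 1/error, normalised to sum to 1."""
    inv = [1.0 / e for e in errors]
    total = sum(inv)
    return [v / total for v in inv]

def combine(forecasts, weights):
    """Weighted average of several models' forecasts, week by week."""
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(len(forecasts[0]))]

# Hypothetical numbers: one model over-predicts, the other under-predicts.
actual = [10, 20, 30, 40]
model_a = [12, 22, 32, 42]
model_b = [8, 18, 28, 38]

weights = inverse_error_weights([rmse(model_a, actual), rmse(model_b, actual)])
ensemble = combine([model_a, model_b], weights)
print("weights:", weights)
print("ensemble RMSE:", rmse(ensemble, actual))
```

In this contrived case the two models' errors cancel exactly, which real forecasts rarely do. In practice the weights would be estimated on a validation period separate from the weeks used for scoring.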

Conclusions: This study demonstrates the strengths and limitations of various methods for dengue forecasting across multiple timeframes. It highlights the best-performing statistical and machine learning methods, including their computational efficiency, underscoring the significance of machine learning techniques and the integration of climate covariates to improve forecasts. These findings offer valuable insights for public health officials, facilitating the development of dengue surveillance systems for more accurate forecasting and timely allocation of resources to mitigate dengue outbreaks.

Source journal: Tropical Medicine and Health (Tropical Medicine)
CiteScore: 7.00 · Self-citation rate: 2.20% · Articles published: 90 · Review time: 11 weeks