Assessing dengue forecasting methods: a comparative study of statistical models and machine learning techniques in Rio de Janeiro, Brazil.
Authors: Xiang Chen, Paula Moraga
Journal: Tropical Medicine and Health, 53(1):52. Published 2025-04-10.
DOI: https://doi.org/10.1186/s41182-025-00723-7
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11984044/pdf/
Background: Dengue is a mosquito-borne viral disease that poses a significant public health threat in tropical and subtropical regions worldwide. Accurate forecasting of dengue outbreaks is crucial for effective public health planning and intervention. This study aims to assess the predictive performance and computational efficiency of a number of statistical models and machine learning techniques for dengue forecasting, both with and without the inclusion of climate factors, to inform the design of dengue surveillance systems.
Methods: This study compares dengue forecasting methods using dengue case data from Rio de Janeiro, Brazil, together with climate factors known to affect disease transmission. With a dynamic window approach, statistical models and machine learning techniques were used to generate weekly forecasts at several time horizons. The error measures, uncertainty intervals, and computational efficiency of each method were compared. The statistical models considered were Autoregressive (AR), Moving Average (MA), Autoregressive Integrated Moving Average (ARIMA), and Exponential Smoothing State Space Model (ETS). Models incorporating temperature and humidity as covariates, such as Vector Autoregression (VAR) and Seasonal ARIMAX (SARIMAX), were also employed. The machine learning techniques evaluated were Random Forest, XGBoost, Support Vector Machine (SVM), Long Short-Term Memory (LSTM) networks, and Prophet. Ensemble approaches that integrated the top-performing methods were also considered. The evaluated methods incorporated lagged climatic variables to account for delayed effects.
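The windowed evaluation with lagged climate covariates can be illustrated with a minimal sketch. It assumes that "dynamic window" means a fixed-length training window that slides forward one week at a time, which the abstract does not spell out. The linear autoregression fitted by least squares, the function names (`lagged_matrix`, `rolling_forecast`), the 52-week window, and the 4-week temperature lag are all illustrative assumptions, not the authors' models or settings.

```python
import numpy as np

def lagged_matrix(series, lags, max_lag):
    """Columns are `series` shifted back by each lag; row r describes week r + max_lag."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    return np.column_stack([series[max_lag - lag: n - lag] for lag in lags])

def rolling_forecast(cases, temp, window=52, horizon=1, n_case_lags=2, temp_lag=4):
    """Direct `horizon`-week-ahead forecasts from a sliding training window.

    Hypothetical setup: a linear model on lagged cases and one lagged
    temperature term, refitted at every forecast origin.
    """
    cases = np.asarray(cases, dtype=float)
    temp = np.asarray(temp, dtype=float)
    n = len(cases)
    # Every lag is at least `horizon`, so each predictor is already observed
    # at the forecast origin.
    case_lags = [horizon + k for k in range(n_case_lags)]
    temp_lags = [max(horizon, temp_lag)]
    max_lag = max(case_lags + temp_lags)
    X = np.column_stack([
        np.ones(n - max_lag),                      # intercept
        lagged_matrix(cases, case_lags, max_lag),  # autoregressive terms
        lagged_matrix(temp, temp_lags, max_lag),   # delayed climate effect
    ])
    y = cases[max_lag:]
    preds, actuals = [], []
    for origin in range(window - 1, n - horizon):  # last observed week
        lo = origin - window + 1                   # oldest week inside the window
        hi = origin - max_lag + 1                  # exclusive; last target is week `origin`
        coef, *_ = np.linalg.lstsq(X[lo:hi], y[lo:hi], rcond=None)
        row = origin + horizon - max_lag           # row describing the forecast week
        preds.append(X[row] @ coef)
        actuals.append(y[row])
    return np.array(preds), np.array(actuals)

# Synthetic demonstration data (not the Rio de Janeiro series).
rng = np.random.default_rng(0)
temp = rng.normal(25.0, 3.0, size=160)
cases = np.full(160, 60.0)
cases[4:] = 10.0 + 2.0 * temp[:-4] + rng.normal(0.0, 1.0, size=156)
preds, actuals = rolling_forecast(cases, temp, window=52, horizon=2)
print("mean absolute error:", np.mean(np.abs(preds - actuals)))
```

Refitting at every origin keeps each forecast strictly out of sample, at the price of one model fit per week. That cost is part of why computational efficiency matters when comparing methods in this kind of evaluation.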
Results: Among the statistical models, ARIMA demonstrated the best performance using only historical case data, while SARIMAX significantly improved predictive accuracy by incorporating climate covariates. In general, the LSTM model, particularly when combined with climate covariates, proved to be the most accurate machine learning model, despite being slower to train and predict. For long-term forecasts, Prophet with climate covariates was the most effective. Ensemble models, such as the combination of LSTM and ARIMA, showed substantial improvements over individual models.
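How two forecasts might be combined and scored can be sketched as follows. The abstract does not state which error measures were used or how the ensembles were weighted, so the mean absolute error, root mean squared error, and inverse-error weighting below are assumptions chosen for illustration, and the numbers are made up rather than taken from the study.

```python
import numpy as np

def mae(actual, pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(actual, float) - np.asarray(pred, float))))

def rmse(actual, pred):
    """Root mean squared error."""
    diff = np.asarray(actual, float) - np.asarray(pred, float)
    return float(np.sqrt(np.mean(diff ** 2)))

def inverse_error_weights(actual, forecasts):
    """Weights proportional to 1 / MAE of each model (hypothetical scheme)."""
    errors = np.array([mae(actual, f) for f in forecasts])
    inv = 1.0 / np.maximum(errors, 1e-12)  # guard against division by zero
    return inv / inv.sum()

def combine(forecasts, weights):
    """Weighted average of the model forecasts, week by week."""
    return np.average(np.asarray(forecasts, dtype=float), axis=0, weights=weights)

# Made-up validation data: one model overshoots, the other undershoots.
actual = np.array([10.0, 20.0, 30.0])
model_a = np.array([12.0, 22.0, 32.0])  # stand-in for, e.g., an LSTM forecast
model_b = np.array([9.0, 19.0, 29.0])   # stand-in for, e.g., an ARIMA forecast

weights = inverse_error_weights(actual, [model_a, model_b])
ensemble = combine([model_a, model_b], weights)
print("weights:", weights)
print("ensemble MAE:", mae(actual, ensemble), "RMSE:", rmse(actual, ensemble))
```

In this toy case the two models err in opposite directions, so the weighted average cancels most of the error. In practice the weights would be estimated on a validation period that is kept separate from the weeks used to report accuracy.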
Conclusions: This study demonstrates the strengths and limitations of various methods for dengue forecasting across multiple timeframes. It highlights the best-performing statistical and machine learning methods, including their computational efficiency, underscoring the significance of machine learning techniques and the integration of climate covariates to improve forecasts. These findings offer valuable insights for public health officials, facilitating the development of dengue surveillance systems for more accurate forecasting and timely allocation of resources to mitigate dengue outbreaks.